M.R. Rodić

1 record found

Reinforcement learning (RL) agents often achieve impressive results in simulation but can fail catastrophically when facing small deviations at deployment time. In this work, we examine the brittleness of Proximal Policy Optimization (PPO) agents when subjected to test-time obser ...