Evaluating Robustness of Deep Reinforcement Learning for Autonomous Driving
How does entropy maximization affect the training and robustness of final policies under various testing conditions?
B.M. Ortal (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M.A. Zanger – Mentor (TU Delft - Algorithmics)
M.T.J. Spaan – Mentor (TU Delft - Algorithmics)
Elena Congeduti – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)
Abstract
This paper investigates how the entropy level used during training affects the robustness of the resulting agent, where robustness means the agent's ability to adapt to environments other than the one it was trained in. A self-driving car must cope with every environment it operates in, since a single mistake can cost a life, so robustness is of great importance in autonomous driving. Increasing entropy promotes the exploration of different strategies and prevents premature convergence to local optima. To test different entropy values during training, the Soft Actor-Critic (SAC) algorithm is run in CARLA, a simulated city environment. The collected data shows that an agent trained with a higher entropy value adapts better to environments it did not train in than a low-entropy agent, while the low-entropy agent performs better in the environment it was trained in. Increasing entropy therefore improves robustness, but at the cost of performance in the training environment itself.
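The trade-off described in the abstract follows from the entropy-regularized objective that Soft Actor-Critic maximizes: expected return plus a temperature-weighted entropy bonus. A minimal sketch of this objective for a discrete action distribution is below; the function names and the specific probability and return values are illustrative, not taken from the thesis.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def soft_objective(expected_return, probs, alpha):
    """Entropy-regularized objective J = E[R] + alpha * H(pi),
    the quantity Soft Actor-Critic maximizes; alpha is the temperature."""
    return expected_return + alpha * entropy(probs)

# A near-deterministic policy versus a uniform, maximally exploratory one.
greedy = [0.97, 0.01, 0.01, 0.01]
uniform = [0.25, 0.25, 0.25, 0.25]

# With a sufficiently high temperature, the uniform policy's entropy bonus
# outweighs a small return advantage of the greedy policy, so the soft
# objective prefers the more exploratory (and thus more robust) behavior.
print(soft_objective(1.0, greedy, alpha=0.5) < soft_objective(0.9, uniform, alpha=0.5))
```

With a low temperature the comparison flips: the greedy policy's higher return dominates, mirroring the finding that low-entropy training yields the best performance in the training environment itself.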