Evaluating Robustness of Deep Reinforcement Learning for Autonomous Driving
How does entropy maximization affect the training and robustness of final policies under various testing conditions?
B.M. Ortal (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M.A. Zanger – Mentor (TU Delft - Algorithmics)
M.T.J. Spaan – Mentor (TU Delft - Algorithmics)
Elena Congeduti – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)
Abstract
This paper investigates how the entropy level used during training affects the robustness of the resulting agent, where robustness means the agent's ability to adapt to environments other than the one it was trained in. A self-driving car must cope with every environment it operates in, since a single mistake can cost a life, so robustness is of great importance in autonomous driving. Increasing entropy promotes the exploration of different strategies and prevents premature convergence to local optima. To test different entropy values during training, the Soft Actor-Critic (SAC) algorithm is run in CARLA, a simulated city environment. The collected data shows that an agent trained with a higher entropy value adapts better to environments it did not train in than a low-entropy agent, while the low-entropy agent performs better in the environment it was trained in. Increasing entropy therefore improves robustness, but at the cost of performance in the training environment itself.
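The trade-off described in the abstract follows from the entropy-regularized objective that Soft Actor-Critic maximizes: expected return plus a temperature-weighted entropy bonus. A minimal sketch of this objective for a discrete action distribution is below; the function names and the specific probability and return values are illustrative, not taken from the thesis.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def soft_objective(expected_return, probs, alpha):
    """Entropy-regularized objective J = E[R] + alpha * H(pi),
    the quantity Soft Actor-Critic maximizes; alpha is the temperature."""
    return expected_return + alpha * entropy(probs)

# A near-deterministic policy versus a uniform, maximally exploratory one.
greedy = [0.97, 0.01, 0.01, 0.01]
uniform = [0.25, 0.25, 0.25, 0.25]

# With a sufficiently high temperature, the uniform policy's entropy bonus
# outweighs a small return advantage of the greedy policy, so the soft
# objective prefers the more exploratory (and thus more robust) behavior.
print(soft_objective(1.0, greedy, alpha=0.5) < soft_objective(0.9, uniform, alpha=0.5))
```

With a low temperature the comparison flips: the greedy policy's higher return dominates, mirroring the finding that low-entropy training yields the best performance in the training environment itself.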