Effects of action space discretization and DQN extensions on algorithm robustness and efficiency

How do the discretization of the action space and various extensions to the well-known DQN algorithm influence training and the robustness of final policies under various testing conditions?

Bachelor Thesis (2023)
Author(s)

M.A. Sözüdüz (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M. T.J. Spaan – Mentor (TU Delft - Algorithmics)

M.A. Zanger – Mentor (TU Delft - Algorithmics)

E. Congeduti – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 Mehmet Sözüdüz
Publication Year
2023
Language
English
Graduation Date
28-06-2023
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Reinforcement Learning (RL) has gained attention as a way of creating autonomous agents for self-driving cars. This paper explores the adaptation of the Deep Q Network (DQN), a popular deep RL algorithm, in the Carla traffic simulator for autonomous driving. It investigates the influence of action space discretization and DQN extensions on training performance and robustness. Results show that action space discretization enhances behaviour consistency but negatively affects Q-values, training performance, and robustness. Double Q-Learning decreases training performance and leads to suboptimal convergence, reducing robustness. Prioritized Experience Replay also performs worse during training, but consistently outperforms in robustness testing, reward estimation and generalization.
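
As a rough illustration of what action space discretization means for a value-based agent such as DQN, the sketch below turns a continuous driving control space into a finite list of actions that a Q-network could index. It is not taken from the thesis: the function name, the steering range of [-1, 1], the throttle range of [0, 1], and the bin counts are all assumptions chosen for illustration.

```python
from itertools import product


def discretize_action_space(steer_bins, throttle_bins):
    """Build a grid of (steer, throttle) pairs.

    Hypothetical helper: the ranges below are illustrative assumptions,
    not values reported in the thesis.
    """

    def linspace(lo, hi, n):
        # Evenly spaced values from lo to hi inclusive; midpoint if n == 1.
        if n == 1:
            return [(lo + hi) / 2]
        step = (hi - lo) / (n - 1)
        return [lo + i * step for i in range(n)]

    steers = linspace(-1.0, 1.0, steer_bins)         # assumed steering range
    throttles = linspace(0.0, 1.0, throttle_bins)    # assumed throttle range
    # Each grid point becomes one discrete action (one Q-network output).
    return list(product(steers, throttles))


actions = discretize_action_space(steer_bins=3, throttle_bins=2)
for index, (steer, throttle) in enumerate(actions):
    print(index, steer, throttle)
```

With more bins the agent can express finer control, but the number of Q-values to estimate grows multiplicatively, which is one plausible reason the granularity of the discretization can affect training.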

Files

Research.pdf
(pdf | 0.679 Mb)
License info not available