Comparative Analysis of Exploration Algorithms in Deep Reinforcement Learning for Autonomous Driving

How do epsilon-greedy exploration, random network distillation, and bootstrapped DQN affect training and the robustness of final policies under various testing conditions in autonomous driving?

Bachelor Thesis (2023)
Author(s)

E. Sozen (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M.A. Zanger – Mentor (TU Delft - Algorithmics)

M.T.J. Spaan – Mentor (TU Delft - Algorithmics)

E. Congeduti – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2023
Language
English
Graduation Date
28-06-2023
Awarding Institution
Delft University of Technology
Project
['CSE3000 Research Project']
Programme
['Computer Science and Engineering']
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Autonomous driving is a rapidly evolving field that aims to enhance road safety and reduce accidents through advanced software and hardware technologies. Reinforcement learning (RL) combined with deep neural networks has emerged as a promising approach for training autonomous agents. This research paper investigates three exploration algorithms, Epsilon-Greedy, Random Network Distillation (RND), and Bootstrapped Deep Q-Network (DQN), within the context of autonomous driving. Performance is assessed on episodic returns in training and testing environments, as well as on the time required to train the networks. The results show a significant improvement in learning capability with Bootstrapped DQN, without critical differences in training time. The results also suggest that episodic returns could be increased further by training the models for more steps.
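As a minimal illustration of the simplest of the three strategies studied, epsilon-greedy action selection picks a uniformly random action with probability epsilon and the greedy (highest-Q) action otherwise. The sketch below is illustrative only; the function name and interface are not taken from the thesis.

```python
import random

def epsilon_greedy_action(q_values, epsilon):
    """Return a random action index with probability epsilon,
    otherwise the index of the highest Q-value (greedy action)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# With epsilon = 0 the choice is purely greedy; with epsilon = 1 it is
# purely random. In practice epsilon is often annealed over training.
```

Decaying epsilon over time trades early broad exploration for later exploitation, whereas RND and Bootstrapped DQN instead drive exploration via intrinsic novelty rewards and ensemble-based uncertainty, respectively.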
