Evaluating the Robustness of DQN and QR-DQN in Traffic Simulation

Analyzing the Effect of Quantile Manipulation in Environmental Variability

Bachelor Thesis (2025)
Author(s)

C. Toadere (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M.M. Celikok – Mentor (TU Delft - Sequential Decision Making)

F.A. Oliehoek – Graduation committee member (TU Delft - Sequential Decision Making)

Annibale Panichella – Graduation committee member (TU Delft - Software Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
25-06-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

As autonomous driving systems advance, ensuring the robustness of the underlying decision-making algorithms becomes increasingly critical. This study assesses the performance and reliability of two reinforcement learning models, Deep Q-Network (DQN) and Quantile Regression DQN (QR-DQN), in a simulated highway environment. While DQN has been widely adopted for its simplicity and effectiveness in discrete action spaces, it suffers from overestimation bias and poor performance in out-of-distribution environments. QR-DQN addresses some of these limitations by modelling the distribution over returns with quantile regression, offering a richer representation of uncertainty. This research focuses on two core objectives: (1) implementing a risk-averse decision-making strategy that uses the quantiles of QR-DQN to enhance safety and reliability, and (2) evaluating the robustness of DQN and QR-DQN as the test environment deviates from the training conditions. The results expose the limitations of DQN and demonstrate QR-DQN’s greater robustness across varied environments. Moreover, a better-performing variant of QR-DQN is presented, which employs conservative behaviour through the use of its quantiles. This variant highlights the trade-off between maximising rewards and avoiding collisions, providing a safer approach.
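The risk-averse strategy described in the abstract can be illustrated with a minimal sketch. Standard QR-DQN acts greedily with respect to the mean of each action's quantile estimates; a conservative variant can instead average only the lowest fraction of quantiles (a CVaR-style criterion), penalising actions with heavy lower tails such as rare collisions. The function name, the `alpha` parameter, and the toy quantile values below are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

def risk_averse_action(quantiles, alpha=0.25):
    """Pick an action by averaging only the lowest `alpha` fraction of
    each action's return quantiles (CVaR-style), instead of the full
    quantile mean that standard QR-DQN uses.

    quantiles: array of shape (n_actions, n_quantiles), each row sorted
    ascending, approximating the return distribution of one action.
    """
    n_quantiles = quantiles.shape[1]
    k = max(1, int(alpha * n_quantiles))               # number of low quantiles kept
    worst_case_values = quantiles[:, :k].mean(axis=1)  # average over worst outcomes
    return int(np.argmax(worst_case_values))           # greedy under pessimism

# Toy example: action 0 has the higher mean return but a heavy lower
# tail (e.g. occasional collisions); action 1 is safer but lower-reward.
q = np.array([
    [-10.0, 8.0, 9.0, 10.0],   # risky action: mean 4.25
    [  2.0, 3.0, 4.0,  5.0],   # conservative action: mean 3.5
])
print(risk_averse_action(q, alpha=0.25))  # → 1 (conservative action)
print(int(np.argmax(q.mean(axis=1))))     # → 0 (mean-based QR-DQN choice)
```

With `alpha=0.25` only the single worst quantile per action is considered, so the risky action's -10 tail outweighs its higher mean, reversing the greedy choice. This is the trade-off the abstract refers to: some expected reward is sacrificed in exchange for fewer collisions.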
