Evaluating the Robustness of DQN and QR-DQN in Traffic Simulation
Analyzing the Effect of Quantile Manipulation in Environmental Variability
C. Toadere (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M.M. Celikok – Mentor (TU Delft - Sequential Decision Making)
FA Oliehoek – Graduation committee member (TU Delft - Sequential Decision Making)
Annibale Panichella – Graduation committee member (TU Delft - Software Engineering)
Abstract
As autonomous driving systems advance, ensuring the robustness of the underlying decision-making algorithms becomes increasingly critical. This study assesses the performance and reliability of two reinforcement learning models, Deep Q-Network (DQN) and Quantile Regression DQN (QR-DQN), in a simulated highway environment. While DQN has been widely adopted for its simplicity and effectiveness in discrete action spaces, it suffers from overestimation bias and from degraded performance in out-of-distribution environments. QR-DQN addresses some of these limitations by modeling the distribution over returns using quantile regression, which gives a richer representation of uncertainty than a single expected value. This research focuses on two core objectives: (1) implementing a risk-averse decision-making strategy using the quantiles of QR-DQN to enhance safety and reliability, and (2) evaluating the robustness of DQN and QR-DQN as the test environment deviates from training conditions. Results show the limitations of DQN and demonstrate QR-DQN's higher robustness when the environment changes. Moreover, a better-performing variant of QR-DQN is presented, which behaves conservatively by making use of its quantiles. This highlights the implemented model's trade-off between maximising rewards and avoiding collisions, and provides a safer approach.
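To make the idea of risk-averse action selection from quantiles concrete, the following is a minimal sketch, not the thesis implementation. It assumes the QR-DQN output for one state is available as an array of shape (actions, quantiles), and it assumes a CVaR-style rule that averages only the lowest fraction `alpha` of the quantiles; the abstract does not state which quantile manipulation the author used, so the rule, the function names, and the example numbers are all hypothetical.

```python
import numpy as np


def risk_neutral_action(quantiles: np.ndarray) -> int:
    """Standard QR-DQN choice: pick the action with the highest mean quantile."""
    return int(np.argmax(quantiles.mean(axis=1)))


def risk_averse_action(quantiles: np.ndarray, alpha: float = 0.25) -> int:
    """Hypothetical conservative rule (an assumption, not the thesis method).

    Averages only the lowest `alpha` fraction of each action's quantiles
    (a CVaR-style estimate) and picks the action whose lower tail is best.
    `quantiles` has shape (n_actions, n_quantiles).
    """
    n_quantiles = quantiles.shape[1]
    # Number of lower-tail quantiles to keep, at least one.
    k = max(1, int(np.ceil(alpha * n_quantiles)))
    # Sort each action's quantile estimates from worst to best return.
    sorted_q = np.sort(quantiles, axis=1)
    lower_tail_mean = sorted_q[:, :k].mean(axis=1)
    return int(np.argmax(lower_tail_mean))


# Made-up quantile estimates for two actions in a single state.
quantiles = np.array([
    [-10.0, 4.0, 6.0, 8.0],  # action 0: higher mean, but a bad worst case
    [1.0, 1.5, 2.0, 2.5],    # action 1: lower mean, no bad outcomes
])

neutral = risk_neutral_action(quantiles)
averse = risk_averse_action(quantiles, alpha=0.25)
print(neutral, averse)
```

With these made-up numbers the mean-based rule prefers action 0, while the lower-tail rule prefers action 1, which illustrates the trade-off between maximising expected reward and avoiding rare but costly outcomes such as collisions. Setting `alpha=1.0` averages all quantiles and recovers the risk-neutral choice.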