Evaluating the robustness of DQN and QR-DQN under domain randomization

Analyzing the effects of domain variation on value-based methods

Bachelor Thesis (2025)
Author(s)

Y. Zwetsloot (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M.M. Celikok – Mentor (TU Delft - Sequential Decision Making)

F.A. Oliehoek – Mentor (TU Delft - Sequential Decision Making)

Annibale Panichella – Graduation committee member (TU Delft - Software Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
25-06-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Domain randomization (DR) is a widely used technique in reinforcement learning for improving robustness and enabling sim-to-real transfer. While prior work has focused extensively on DR in combination with algorithms such as PPO and SAC, its effects on value-based methods like DQN and QR-DQN remain underexplored. This paper investigates how varying degrees and types of DR affect the robustness and generalization capabilities of agents trained with DQN and QR-DQN. We identify clear differences in how DQN and QR-DQN respond to domain randomization: naive application may hinder performance, whereas well-targeted randomization distributions can enhance robustness and generalization. These findings underscore the importance of tailoring DR strategies to the algorithm at hand and contribute to a deeper understanding of DR’s role in DQN-based methods.
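The abstract describes domain randomization as resampling environment parameters so the agent trains across a distribution of dynamics rather than a single fixed environment. The following is a minimal illustrative sketch of that idea, not code from the thesis: a toy environment resamples a single dynamics parameter at every episode boundary, and the class name, parameter range, and reward are all hypothetical.

```python
import random

class RandomizedEnv:
    """Toy environment whose dynamics parameter (e.g. a friction or
    length coefficient) is resampled each episode -- the core mechanism
    of domain randomization. Purely illustrative."""

    def __init__(self, param_range=(0.5, 1.5)):
        self.param_range = param_range
        self.param = None

    def reset(self):
        # Domain randomization: draw a fresh dynamics parameter
        # at every episode boundary.
        lo, hi = self.param_range
        self.param = random.uniform(lo, hi)
        return 0.0  # initial observation

    def step(self, action):
        # The transition depends on the randomized parameter, so the
        # agent sees a different "domain" each episode.
        obs = action * self.param
        reward = -abs(obs)
        done = True
        return obs, reward, done


def run_episodes(env, n_episodes=5):
    """Collect the parameter drawn in each episode, to show that
    training spans a distribution of dynamics."""
    params = []
    for _ in range(n_episodes):
        env.reset()
        params.append(env.param)
        env.step(1.0)
    return params
```

Varying `param_range` corresponds to the "degrees and types of DR" the thesis studies: a wider range yields a harder but potentially more robust training distribution.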
