Evaluating the robustness of DQN and QR-DQN under domain randomization
Analyzing the effects of domain variation on value-based methods
Y. Zwetsloot (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M.M. Celikok – Mentor (TU Delft - Sequential Decision Making)
F.A. Oliehoek – Mentor (TU Delft - Sequential Decision Making)
Annibale Panichella – Graduation committee member (TU Delft - Software Engineering)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Domain randomization (DR) is a widely used technique in reinforcement learning to improve robustness and enable sim-to-real transfer. While prior work has focused extensively on DR in combination with algorithms such as PPO and SAC, its effects on value-based methods like DQN and QR-DQN remain underexplored. This paper investigates how varying degrees and types of DR affect the robustness and generalization capabilities of agents trained with DQN and QR-DQN. We identify clear differences in how DQN and QR-DQN respond to domain randomization: naive application can hinder performance, whereas well-targeted randomization distributions can enhance robustness and generalization. These findings underscore the importance of tailored DR strategies for different algorithms and contribute to a deeper understanding of DR’s role in DQN-based methods.
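The abstract does not spell out the experimental setup, so the snippet below is only a minimal sketch of what domain randomization over environment parameters can look like in practice: an environment wrapper that resamples selected physics parameters at every episode reset, inside which a DQN or QR-DQN agent would then be trained. The wrapper name, the choice of a Gymnasium CartPole-style environment, and the specific attribute names and ranges (`length`, `masspole`) are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np
import gymnasium as gym


class DomainRandomizationWrapper(gym.Wrapper):
    """Resample selected physics parameters at every episode reset.

    `param_ranges` maps attribute names on the unwrapped environment
    (e.g. CartPole's `length` and `masspole`) to (low, high) intervals.
    Wider intervals correspond to a higher "degree" of randomization.
    """

    def __init__(self, env, param_ranges, seed=None):
        super().__init__(env)
        self.param_ranges = param_ranges
        self.rng = np.random.default_rng(seed)

    def reset(self, **kwargs):
        # Draw a fresh value for each randomized parameter before resetting.
        for name, (low, high) in self.param_ranges.items():
            setattr(self.env.unwrapped, name, self.rng.uniform(low, high))
        return self.env.reset(**kwargs)


if __name__ == "__main__":
    # Illustrative ranges only; the paper's actual distributions differ.
    env = DomainRandomizationWrapper(
        gym.make("CartPole-v1"),
        param_ranges={"length": (0.3, 0.7), "masspole": (0.05, 0.2)},
        seed=0,
    )
    obs, info = env.reset()
```

In this kind of setup, narrowing or widening `param_ranges`, or randomizing different parameters, is how one would vary the "degrees and types" of DR referred to above; the trained agent's robustness is then typically evaluated on parameter values inside and outside the training ranges.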