Gradient based adversarial domain randomization

Master Thesis (2024)

Author(s)

G. Koning (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Matthijs T. J. Spaan – Mentor (TU Delft - Sequential Decision Making)

J.W. Böhmer – Mentor (TU Delft - Sequential Decision Making)

D.S. van der Heijden – Mentor (TU Delft - Learning & Autonomous Control)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

https://resolver.tudelft.nl/uuid:5d42e278-67b3-4def-beb6-41f4b9ae3694

Publication Year

2024

Language

English

Graduation Date

04-06-2024

Awarding Institution

Delft University of Technology

Programme

Computer Science | Algorithmics

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Recent advances in differentiable simulators offer a promising way to improve the sim2real transfer of reinforcement learning (RL) agents, because they make it possible to compute gradients of the simulator's dynamics with respect to its parameters. In practice, however, these gradients are often usable only in specific scenarios. In this thesis, we address this limitation by proposing methods that obtain accurate gradients through a privileged value function. This approach provides insight into how effective differentiable-simulator gradients are and demonstrates that, in certain cases, they can significantly improve sim2real performance. To illustrate this, we develop an adversary that uses local gradients to identify the worst-case domain parameters for a given policy. Our experiments are conducted on the Pendulum swing-up environment. This thesis forms a basis for further exploration of how differentiable-simulator gradients can be leveraged.
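The idea of a gradient-based adversary can be sketched in a few lines. The example below is an illustration only and is not the method from the thesis: it does not use a privileged value function, but simply differentiates a short rollout with respect to one domain parameter (the pole length) using a small forward-mode autodiff class. The pendulum dynamics, the fixed PD policy, the cost, the parameter range, and all names (`Dual`, `rollout_return`, `worst_case_length`) are assumptions made for this sketch.

```python
import math


class Dual:
    """Value plus its derivative w.r.t. one chosen parameter (forward-mode autodiff)."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.val, -self.dot)

    def __sub__(self, o):
        return self + (-Dual.lift(o))

    def __rsub__(self, o):
        return Dual.lift(o) + (-self)

    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)

    __rmul__ = __mul__

    def __truediv__(self, o):
        o = Dual.lift(o)
        return Dual(self.val / o.val,
                    (self.dot * o.val - self.val * o.dot) / (o.val ** 2))

    def __rtruediv__(self, o):
        return Dual.lift(o) / self


def dsin(x):
    """sin that propagates derivatives for Dual inputs and stays a float otherwise."""
    if isinstance(x, Dual):
        return Dual(math.sin(x.val), math.cos(x.val) * x.dot)
    return math.sin(x)


def rollout_return(length, mass=1.0, g=10.0, dt=0.05, steps=50):
    """Return of a fixed PD policy on an assumed pendulum model (angle 0 = upright)."""
    theta, omega = 0.5, 0.0  # start near the upright position
    total = 0.0
    for _ in range(steps):
        u = -(10.0 * theta + 2.0 * omega)  # hypothetical fixed policy
        total = total - (theta * theta + 0.1 * omega * omega + 0.001 * u * u)
        omega = omega + dt * (3.0 * g / (2.0 * length) * dsin(theta)
                              + 3.0 / (mass * length * length) * u)
        theta = theta + dt * omega
    return total


def return_and_grad(length):
    """Return and its derivative w.r.t. the pole length, through the whole rollout."""
    out = rollout_return(Dual(length, 1.0))
    return out.val, out.dot


def worst_case_length(length=1.0, lo=0.7, hi=1.5, lr=0.05, iters=30):
    """Projected gradient descent on the return; keeps the lowest-return length seen."""
    best_len, best_ret = length, return_and_grad(length)[0]
    for _ in range(iters):
        ret, grad = return_and_grad(length)
        if ret < best_ret:
            best_len, best_ret = length, ret
        # Step against the return gradient, then clip to the allowed range.
        length = min(hi, max(lo, length - lr * grad))
    return best_len, best_ret
```

Calling `worst_case_length()` starts from the nominal length and follows the local gradient towards lengths on which the fixed policy performs worse. Because the search is local, it can stop at a local minimum of the return rather than the global worst case.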

Files

Report_final_14_06_24.pdf

(pdf | 1.14 Mb)

License info not available