Disentangling Latent Representations in Non-Stationary Reinforcement Learning
R. Haket (TU Delft - Electrical Engineering, Mathematics and Computer Science)
MTJ Spaan – Mentor (TU Delft - Sequential Decision Making)
J.H. Krijthe – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
O. Azizi – Mentor (TU Delft - Sequential Decision Making)
Abstract
Model-free deep reinforcement learning has shown remarkable promise in solving highly complex sequential decision-making problems. However, the widespread adoption of reinforcement learning algorithms in real-world applications such as robotics has not materialized. A primary challenge is the common assumption that environments remain stationary at deployment. The problem is exacerbated when agents rely on pixel-based observations, which dramatically increases task complexity. As a result, these algorithms often fail when environments change over time. Learning a disentangled representation has already been shown to be effective in improving generalization to domain shifts. We extend previous work by introducing Generalized Disentanglement (GED), an auxiliary contrastive learning task that encourages pixel-based deep reinforcement learning algorithms to isolate the factors of variation in the data by leveraging temporal structure. We show that our method can improve generalization to unseen domains in several environments.
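The abstract does not spell out GED's objective, but a common way to leverage temporal structure in an auxiliary contrastive task is an InfoNCE-style loss over temporally adjacent observations: encodings of consecutive frames are pulled together, while other frames in the batch serve as negatives. The sketch below is a minimal, hypothetical PyTorch illustration of that general idea; the class name, stand-in encoder, and hyperparameters are assumptions, not the thesis's actual GED formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalContrastiveLoss(nn.Module):
    """InfoNCE-style auxiliary loss (hypothetical sketch, not GED itself):
    embeddings of temporally adjacent frames are treated as positive pairs;
    the other frames in the batch act as negatives."""

    def __init__(self, temperature: float = 0.1):
        super().__init__()
        self.temperature = temperature

    def forward(self, z_t: torch.Tensor, z_next: torch.Tensor) -> torch.Tensor:
        # z_t, z_next: (batch, dim) encodings of observations at t and t+1.
        z_t = F.normalize(z_t, dim=1)
        z_next = F.normalize(z_next, dim=1)
        # Pairwise cosine similarities; the diagonal holds the positive pairs.
        logits = z_t @ z_next.T / self.temperature
        labels = torch.arange(z_t.size(0), device=z_t.device)
        return F.cross_entropy(logits, labels)

# Usage: add the auxiliary term to the RL loss on each update step.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 84 * 84, 128))  # stand-in for a conv encoder
aux_loss_fn = TemporalContrastiveLoss()

obs_t = torch.randn(32, 3, 84, 84)      # pixel observations at time t
obs_next = torch.randn(32, 3, 84, 84)   # observations at time t+1
aux_loss = aux_loss_fn(encoder(obs_t), encoder(obs_next))
# total_loss = rl_loss + aux_weight * aux_loss
```

Because the positive pair differs only by one time step, such a loss pressures the encoder to keep the slowly varying, task-relevant factors while discarding fast-changing nuisance variation, which is the intuition behind disentangling factors of variation via temporal structure.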