Disentangling Latent Representations in Non-Stationary Reinforcement Learning
R. Haket (TU Delft - Electrical Engineering, Mathematics and Computer Science)
MTJ Spaan – Mentor (TU Delft - Sequential Decision Making)
J.H. Krijthe – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
O. Azizi – Mentor (TU Delft - Sequential Decision Making)
Abstract
Model-free deep reinforcement learning has shown remarkable promise in solving highly complex sequential decision-making problems. However, the widespread adoption of reinforcement learning algorithms in real-world applications such as robotics has not materialized. A primary challenge is the common assumption that environments remain stationary at deployment. The problem is exacerbated when agents rely on pixel-based observations, which dramatically increases task complexity. As a result, these algorithms often fail when environments change over time. Learning a disentangled representation has already been shown to be effective in improving generalization to domain shifts. We extend previous work by introducing Generalized Disentanglement (GED), an auxiliary contrastive learning task that encourages pixel-based deep reinforcement learning algorithms to isolate the factors of variation in the data by leveraging temporal structure. We show that our method can improve generalization to unseen domains in several environments.
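The abstract does not spell out GED's objective, but a common way to leverage temporal structure in an auxiliary contrastive task is an InfoNCE-style loss over temporally adjacent observations: encodings of consecutive frames are pulled together, while other frames in the batch serve as negatives. The sketch below is a minimal, hypothetical PyTorch illustration of that general idea; the class name, stand-in encoder, and hyperparameters are assumptions, not the thesis's actual GED formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalContrastiveLoss(nn.Module):
    """InfoNCE-style auxiliary loss (hypothetical sketch, not GED itself):
    embeddings of temporally adjacent frames are treated as positive pairs;
    the other frames in the batch act as negatives."""

    def __init__(self, temperature: float = 0.1):
        super().__init__()
        self.temperature = temperature

    def forward(self, z_t: torch.Tensor, z_next: torch.Tensor) -> torch.Tensor:
        # z_t, z_next: (batch, dim) encodings of observations at t and t+1.
        z_t = F.normalize(z_t, dim=1)
        z_next = F.normalize(z_next, dim=1)
        # Pairwise cosine similarities; the diagonal holds the positive pairs.
        logits = z_t @ z_next.T / self.temperature
        labels = torch.arange(z_t.size(0), device=z_t.device)
        return F.cross_entropy(logits, labels)

# Usage: add the auxiliary term to the RL loss on each update step.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 84 * 84, 128))  # stand-in for a conv encoder
aux_loss_fn = TemporalContrastiveLoss()

obs_t = torch.randn(32, 3, 84, 84)      # pixel observations at time t
obs_next = torch.randn(32, 3, 84, 84)   # observations at time t+1
aux_loss = aux_loss_fn(encoder(obs_t), encoder(obs_next))
# total_loss = rl_loss + aux_weight * aux_loss
```

Because the positive pair differs only by one time step, such a loss pressures the encoder to keep the slowly varying, task-relevant factors while discarding fast-changing nuisance variation, which is the intuition behind disentangling factors of variation via temporal structure.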