Investigation into the Effect of Replay Buffer Diversity on Generalizability
F. Kaubek (TU Delft - Electrical Engineering, Mathematics and Computer Science)
J.W. Böhmer – Mentor (TU Delft - Sequential Decision Making)
David Tax – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
Abstract
In reinforcement learning, the ability to generalize to unseen situations is pivotal to an agent’s success. This thesis introduces two novel methods that aim to enhance an agent’s generalizability. Both rely on the idea that a more diverse replay buffer increases an agent’s ability to generalize. The first uses the agent’s exploration strategies to reach interesting states. The second aims to reach further by using an additional goal-conditioned agent. Both methods demonstrate improved adaptability without relying on domain-specific knowledge and show promising results.