A. Keijzer

Please Note

This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged into a profile.

Master thesis (1)

1 record found

Prioritizing states with action sensitive return in experience replay

Master thesis (2023) - A. Keijzer, J. Kober, D.S. van der Heijden, R. Babuska, J.W. Böhmer

Experience replay for off-policy reinforcement learning has been shown to improve sample efficiency and stabilize training. However, uniformly sampled replay typically includes many samples that contribute little to the agent reaching good performance. We introduce Action Sensitive Experience Replay (ASER), a method that prioritizes samples in the replay buffer so that the parts of the state space where choosing sub-optimal actions has a larger effect on the return are modeled more accurately. We show experimentally that this can make training more sample efficient and that it allows smaller function approximators, such as neural networks with few neurons, to achieve good performance in environments where they would otherwise struggle. ...
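The abstract does not give the formula ASER uses to score action sensitivity, so the following is only a minimal sketch of the general idea and not the thesis's method. It assumes a hypothetical score, the spread between the highest and lowest estimated action values in a state, and samples transitions in proportion to that score. The names `action_sensitivity`, `SensitivityReplayBuffer`, and the `eps` parameter are illustrative assumptions.

```python
import random


def action_sensitivity(q_values):
    """Hypothetical score (an assumption, not taken from the thesis):
    the spread between the best and worst estimated action values."""
    return max(q_values) - min(q_values)


class SensitivityReplayBuffer:
    """Fixed-size replay buffer that samples proportionally to a priority."""

    def __init__(self, capacity, eps=1e-3, seed=None):
        self.capacity = capacity
        self.eps = eps  # keeps every transition sampleable
        self.transitions = []
        self.priorities = []
        self.next_index = 0  # slot to overwrite once the buffer is full
        self.rng = random.Random(seed)

    def add(self, transition, q_values):
        # Priority grows with how much the action choice matters in this state.
        priority = action_sensitivity(q_values) + self.eps
        if len(self.transitions) < self.capacity:
            self.transitions.append(transition)
            self.priorities.append(priority)
        else:
            self.transitions[self.next_index] = transition
            self.priorities[self.next_index] = priority
        self.next_index = (self.next_index + 1) % self.capacity

    def sample(self, batch_size):
        # Sample with replacement, weighted by priority.
        return self.rng.choices(
            self.transitions, weights=self.priorities, k=batch_size
        )
```

Under this assumed score, a state in which all actions have similar estimated values is sampled rarely, while a state in which one action is much worse than another is replayed often. A practical implementation would also need to refresh priorities as the value estimates change during training.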