Off-policy experience retention for deep actor-critic learning

Conference Paper (2016)
Author(s)

Tim de Bruin (TU Delft - OLD Intelligent Control & Robotics)

Jens Kober (TU Delft - OLD Intelligent Control & Robotics)

K.P. Tuyls (TU Delft - OLD Intelligent Control & Robotics, University of Liverpool)

Robert Babuska (TU Delft - OLD Intelligent Control & Robotics)

Research Group
OLD Intelligent Control & Robotics
Publication Year
2016
Language
English

Abstract

When only a limited number of experiences is kept in memory to train a reinforcement learning agent, the criterion that determines which experiences are retained can have a strong impact on learning performance. In this paper, we argue that for actor-critic learning in domains with significant momentum, it is important to retain experiences with off-policy actions when the amount of exploration is reduced over time. This claim is supported by simulation experiments on a pendulum swing-up problem and a magnetic manipulation task. Additionally, we compare our strategy to database overwriting policies that aim to keep experiences spread out over the state-action space, as well as to using the temporal-difference error as a proxy for the value of experiences.
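To make the compared retention criteria concrete, the sketch below implements a fixed-capacity experience buffer with a selectable overwrite rule: first-in-first-out, overwrite the transition with the smallest |TD error|, or overwrite the transition whose stored action is closest to the current policy's action (so that off-policy experiences are kept longer). This is a minimal illustration under assumed scoring rules; the class name RetentionBuffer and the exact distance and TD-error measures are hypothetical and not taken from the paper itself.

```python
import numpy as np


class RetentionBuffer:
    """Fixed-capacity experience buffer with a selectable overwrite criterion.

    Illustrative sketch only: the criteria ("fifo", "td_error", "off_policy")
    mirror the strategies compared in the abstract, but the scoring rules
    below are assumptions, not the authors' implementation.
    """

    def __init__(self, capacity, criterion="fifo"):
        self.capacity = capacity
        self.criterion = criterion
        self.transitions = []   # each entry: dict with s, a, r, s2
        self.scores = []        # per-transition retention score
        self._next = 0          # FIFO write pointer

    def add(self, s, a, r, s2, td_error=0.0, policy_action=None):
        """Store a transition, overwriting an old one if the buffer is full."""
        entry = {"s": s, "a": a, "r": r, "s2": s2}
        score = self._score(a, td_error, policy_action)
        if len(self.transitions) < self.capacity:
            self.transitions.append(entry)
            self.scores.append(score)
            return
        idx = self._victim()
        self.transitions[idx] = entry
        self.scores[idx] = score

    def _score(self, a, td_error, policy_action):
        """Higher score means the transition is considered more worth keeping."""
        if self.criterion == "td_error":
            # Large |TD error| is used as a proxy for experience value.
            return abs(td_error)
        if self.criterion == "off_policy":
            # Distance between the stored action and the current policy's
            # action: a larger distance means the experience is more
            # off-policy and is retained longer as exploration decays.
            if policy_action is None:
                return 0.0
            return float(np.linalg.norm(np.asarray(a) - np.asarray(policy_action)))
        return 0.0  # FIFO ignores the score

    def _victim(self):
        """Pick the index of the transition to overwrite."""
        if self.criterion == "fifo":
            idx = self._next
            self._next = (self._next + 1) % self.capacity
            return idx
        # Otherwise overwrite the transition with the lowest retention score.
        return int(np.argmin(self.scores))

    def sample(self, batch_size, rng=None):
        """Draw a uniform minibatch of stored transitions."""
        rng = rng or np.random.default_rng()
        idx = rng.integers(0, len(self.transitions), size=batch_size)
        return [self.transitions[i] for i in idx]
```

As a usage sketch, an actor-critic agent would call `add` after every environment step, passing the critic's current TD error and the actor's action for the stored state, and then `sample` a minibatch for each update; only the overwrite rule changes between the compared retention strategies.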

Metadata-only record. There are no files for this record.