Bayesian Model-Free Deep Reinforcement Learning

Conference Paper (2024)

Author(s)

P.R. van der Vaart (TU Delft - Sequential Decision Making)

Research Group

Sequential Decision Making

To reference this document use:

https://resolver.tudelft.nl/uuid:d51fb0b9-c27e-440e-8f49-f26638943c4b

Publication Year

2024

Language

English

Pages (from-to)

2782-2784

ISBN (electronic)

979-8-4007-0486-4

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Exploration in reinforcement learning remains a difficult challenge. To drive exploration, ensembles with randomized prior functions have recently been popularized to quantify uncertainty in the value model. However, these ensembles have no theoretical reason to resemble the actual Bayesian posterior, which is known to provide strong performance in theory under certain conditions. In this thesis work, we view training ensembles from the perspective of Sequential Monte Carlo, a Monte Carlo method that approximates a sequence of distributions with a set of particles, and propose an algorithm that exploits both the practical flexibility of ensembles and the theory of the Bayesian paradigm. We incorporate this method into a standard DQN agent and experimentally show qualitatively good uncertainty quantification and improved exploration capabilities over a regular ensemble. In the future, we will investigate the impact of likelihood and prior choices in Bayesian model-free reinforcement learning methods.

Files

P2782.pdf

(pdf | 0.926 MB)