BADDr

Bayes-Adaptive Deep Dropout RL for POMDPs

Conference Paper (2022)

Author(s)

Sammie Katt (Northeastern University)

Hai Nguyen (Northeastern University)

FA Oliehoek (TU Delft - Interactive Intelligence)

Christopher Amato (Northeastern University)

Research Group

Interactive Intelligence

Keywords

POMDP, MCTS, Bayesian RL

To reference this document use:

https://resolver.tudelft.nl/uuid:4f6dd8eb-f5a9-4fac-b21c-6639cf98b3eb

More Info

Publication Year

2022

Language

English

Pages (from-to)

723-731

ISBN (electronic)

978-171385433-3

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

While reinforcement learning (RL) has made great advances in scalability, exploration and partial observability are still active research topics. In contrast, Bayesian RL (BRL) provides a principled answer to both state estimation and the exploration-exploitation trade-off, but struggles to scale. To tackle this challenge, BRL frameworks with various prior assumptions have been proposed, with varied success. This work presents a representation-agnostic formulation of BRL under partial observability, unifying the previous models under one theoretical umbrella. To demonstrate its practical significance, we also propose a novel derivation, Bayes-Adaptive Deep Dropout RL (BADDr), based on dropout networks. Under this parameterization, in contrast to previous work, the belief over the state and dynamics is a more scalable inference problem. We choose actions through Monte-Carlo tree search and empirically show that our method is competitive with state-of-the-art BRL methods on small domains while being able to solve much larger ones.

Files

3535850.3535932.pdf

(pdf | 2.09 Mb)

- Embargo expired on 05-12-2022

License info not available