An Analysis of Model-Based Reinforcement Learning From Abstracted Observations

Journal Article (2023)

Author(s)

R.A.N. Starre (TU Delft - Interactive Intelligence)

M. Loog (Radboud Universiteit Nijmegen)

E. Congeduti (TU Delft - Computer Science & Engineering-Teaching Team, TU Delft - Interactive Intelligence)

F.A. Oliehoek (TU Delft - Interactive Intelligence)

Research Group

Interactive Intelligence

Keywords

Abstraction, State Abstraction, Reinforcement learning (RL), Model-based reinforcement learning

To reference this document use:

https://resolver.tudelft.nl/uuid:bb2e0f90-1d24-4860-afa6-b0ae37dff1fe

Publication Year

2023

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Many methods for model-based reinforcement learning (MBRL) in Markov decision processes (MDPs) provide guarantees for both the accuracy of the model they can deliver and the learning efficiency. At the same time, state abstraction techniques allow for a reduction of the size of an MDP while maintaining a bounded loss with respect to the original problem. Therefore, it may come as a surprise that no such guarantees are available when combining both techniques, i.e., where MBRL merely observes abstract states. Our theoretical analysis shows that abstraction can introduce a dependence between samples collected online (e.g., in the real world). That means that, without taking this dependence into account, results for MBRL do not directly extend to this setting. Our result shows that we can use concentration inequalities for martingales to overcome this problem. This result makes it possible to extend the guarantees of existing MBRL algorithms to the setting with abstraction. We illustrate this by combining R-MAX, a prototypical MBRL algorithm, with abstraction, thus producing the first performance guarantees for model-based ‘RL from Abstracted Observations’: model-based reinforcement learning with an abstract model.
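To make the setting concrete, the following is a minimal sketch of how an agent that only sees abstract states might accumulate the counts behind an abstract transition model, with an R-MAX-style "known" threshold. It is not the paper's implementation: the abstraction function `phi`, the class `AbstractCountModel`, and the parameter `known_threshold` are hypothetical names chosen for illustration, and the integer-block abstraction is an assumption.

```python
from collections import defaultdict


def phi(state: int, block_size: int = 2) -> int:
    """Hypothetical abstraction: group consecutive ground states into blocks."""
    return state // block_size


class AbstractCountModel:
    """Empirical transition model built only from abstracted observations."""

    def __init__(self, known_threshold: int):
        # Assumed R-MAX-style parameter: visits needed before a pair is "known".
        self.known_threshold = known_threshold
        # counts[(abstract_state, action)][abstract_next_state] -> visit count
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, state: int, action: int, next_state: int) -> None:
        # The learner never stores ground states, only their abstractions.
        self.counts[(phi(state), action)][phi(next_state)] += 1

    def visits(self, abstract_state: int, action: int) -> int:
        return sum(self.counts.get((abstract_state, action), {}).values())

    def is_known(self, abstract_state: int, action: int) -> bool:
        return self.visits(abstract_state, action) >= self.known_threshold

    def transition_estimate(self, abstract_state: int, action: int) -> dict:
        """Relative frequencies of abstract next states (empty if unvisited)."""
        n = self.visits(abstract_state, action)
        if n == 0:
            return {}
        row = self.counts[(abstract_state, action)]
        return {nxt: c / n for nxt, c in row.items()}


model = AbstractCountModel(known_threshold=2)
model.update(state=0, action=0, next_state=1)  # ground 0 -> 1, abstract 0 -> 0
model.update(state=1, action=0, next_state=2)  # ground 1 -> 2, abstract 0 -> 1
estimate = model.transition_estimate(0, 0)
```

In this sketch the two samples counted under abstract state 0 come from different ground states, which may have different dynamics. Which ground state is visited depends on the history of the trajectory, so the samples pooled under one abstract state-action pair are not necessarily independent, which is the difficulty the abstract describes.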
