Safe and Sample-Efficient Reinforcement Learning Algorithms for Factored Environments


Abstract

Reinforcement Learning (RL) deals with problems that can be modeled as a Markov Decision Process (MDP) whose transition function is unknown. In settings where an arbitrary policy π is already in execution and the experiences with the environment have been recorded in a batch D, an RL algorithm can use D to compute a new policy π′. However, the policy computed by traditional RL algorithms might perform worse than π. Our goal is to develop safe RL algorithms, where the agent has high confidence that the performance of π′ is better than the performance of π given D. To develop sample-efficient and safe RL algorithms, we combine ideas from exploration strategies in RL with a safe policy improvement method.
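To make the safety criterion concrete, the sketch below shows one standard way to test whether a candidate policy π′ improves on a behavior policy π with high confidence, using per-trajectory importance sampling over the batch D and a one-sided Student's t lower bound on the expected return. This is an illustrative sketch only: the function names, the trajectory format, and the choice of concentration bound are assumptions for exposition, not the method developed in the thesis.

```python
import numpy as np
from scipy import stats


def importance_weighted_returns(batch, pi_new, pi_behavior, gamma=0.99):
    """Per-trajectory importance-sampled return estimates of pi_new.

    batch: list of trajectories, each a list of (state, action, reward) tuples
           collected while executing pi_behavior (the batch D).
    pi_new, pi_behavior: callables mapping (state, action) -> probability.
    """
    estimates = []
    for traj in batch:
        weight, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            # Reweight the trajectory by how likely pi_new was to take
            # the same actions that pi_behavior actually took.
            weight *= pi_new(s, a) / pi_behavior(s, a)
            ret += (gamma ** t) * r
        estimates.append(weight * ret)
    return np.array(estimates)


def is_safe_improvement(batch, pi_new, pi_behavior, j_behavior, delta=0.05):
    """Accept pi_new only if a (1 - delta)-confidence lower bound on its
    expected return exceeds j_behavior, the behavior policy's return."""
    x = importance_weighted_returns(batch, pi_new, pi_behavior)
    n = len(x)
    # One-sided Student's t lower bound on the mean return of pi_new
    # (one common choice; tighter or distribution-free bounds also exist).
    lower = x.mean() - stats.t.ppf(1 - delta, n - 1) * x.std(ddof=1) / np.sqrt(n)
    return lower > j_behavior
```

The key design point this illustrates is that the agent only deploys π′ when the data in D statistically guarantee improvement at confidence level 1 − δ; otherwise it keeps executing π. Per-trajectory importance sampling is unbiased but high-variance, which is one reason sample efficiency matters for safe RL methods of this kind.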

Files

0919.pdf
(.pdf | 0.501 MB)