The Actor-Judge Method

Safe state exploration for Hierarchical Reinforcement Learning Controllers

Conference Paper (2018)
Author(s)

Stephen Verbist

T. Mannucci (TU Delft - Control & Simulation)

EJ Kampen (TU Delft - Control & Simulation)

Research Group
Control & Simulation
Copyright
© 2018 Stephen Verbist, T. Mannucci, E. van Kampen
DOI related publication
https://doi.org/10.2514/6.2018-1634
Publication Year
2018
Language
English
ISBN (electronic)
978-1-62410-527-2
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Reinforcement Learning is a much-researched topic for autonomous machine behavior and is often applied to navigation problems. To deal with growing environments and larger state/action spaces, Hierarchical Reinforcement Learning (HRL) has been introduced. Unfortunately, learning from experience, which is central to Reinforcement Learning, makes guaranteeing safety a complex problem. This paper demonstrates an approach, named the actor-judge approach, that makes exploration safer while imposing as few restrictions as possible on the agent. The approach combines ideas from the fields of Hierarchical Reinforcement Learning and Safe Reinforcement Learning to develop a Safe Hierarchical Reinforcement Learning algorithm. The algorithm is tested in a simulated environment where the agent represents an Unmanned Aerial Vehicle that can move laterally in four directions and uses quadridirectional range sensors to establish its relative position. Although this approach does not guarantee that the agent never explores unsafe areas of the state domain, results show that the actor-judge method increases agent safety and can be used on multiple levels of an HRL agent hierarchy.

Files

The_Actor_Judge_Method_safe_st... (pdf)
(pdf | 1.19 Mb)
- Embargo expired on 31-01-2019
License info not available