Risk Aversion and Guided Exploration in Safety-Constrained Reinforcement Learning
Qisong Yang (TU Delft - Algorithmics)
MTJ Spaan – Promotor (TU Delft - Algorithmics)
Simon Tindemans – Copromotor (TU Delft - Intelligent Electrical Power Grids)
Abstract
In traditional reinforcement learning (RL) problems, agents explore their environment and learn optimal policies through trial and error, and some of those trials are unsafe. However, unsafe interactions with the environment are unacceptable in many safety-critical problems, for instance in robot navigation tasks. Even though RL agents can be trained in simulators, many real-world problems lack simulators of sufficient fidelity. Constructing safe exploration algorithms for dangerous environments is challenging because policies must be optimized while satisfying safety constraints. In general, safety remains an open problem that hinders the wider application of RL.
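As an illustration of what optimizing policies under safety constraints typically means, the constrained Markov decision process (CMDP) formulation below is one common way to state the problem; the symbols r (reward), c (safety cost), gamma (discount factor), and d (cost budget) are generic notation for this sketch and not necessarily the exact formulation used in the thesis.

\[
\max_{\pi}\ \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\right] \le d
\]

Here the agent maximizes expected discounted reward while keeping the expected discounted safety cost below the budget d, which captures the tension between exploration and safety described above.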