Potential Field Methods for Safe Reinforcement Learning

Exploring Q-Learning and Potential Fields

Abstract

A Reinforcement Learning (RL) agent learns about its environment through exploration. For most physical applications, such as search-and-rescue UAVs, this exploration must take place with safety in mind: unregulated exploration, especially at the beginning of a run, can lead to fatal situations such as crashes. One approach to mitigating these risks is to use Artificial Potential Fields (APFs). Several approaches to making effective use of the potential information gathered by the agent are proposed, tested, and discussed. The agent operates in an environment-model-free setting, although it is provided with knowledge of its own dynamics. A gridworld simulation is developed in MATLAB to test how APFs can be combined with Q-learning. It is shown that the safety of exploration benefits from adding this layer of information to the agent's decision-making process. As a result, the Q-table is updated more efficiently, because the agent explicitly knows about high-potential 'dangerous' states.
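
The full method is in the attached PDF; as a rough illustration of the idea summarised above, the sketch below shows one way an APF penalty could be folded into epsilon-greedy Q-learning in a gridworld. It is written in Python rather than the thesis's MATLAB, and the grid layout, potential function, reward values, and weighting factor beta are assumptions made for illustration, not the author's implementation.

# Minimal illustrative sketch (not the thesis code): tabular Q-learning on a
# small gridworld where an artificial-potential-field (APF) penalty biases
# action selection away from high-potential "dangerous" cells. Grid layout,
# potential function, and hyperparameters are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

N = 6                                          # grid is N x N
goal = (5, 5)
obstacles = [(2, 2), (2, 3), (3, 2)]           # hypothetical crash states
actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def potential(s):
    # Repulsive potential: high near obstacles, zero far away (assumed form).
    d = min(abs(s[0] - o[0]) + abs(s[1] - o[1]) for o in obstacles)
    return max(0.0, (3 - d) / 3.0)

def next_cell(s, a):
    # The agent knows its own dynamics, so it can predict where an action leads.
    return (min(max(s[0] + a[0], 0), N - 1), min(max(s[1] + a[1], 0), N - 1))

def step(s, a):
    nxt = next_cell(s, a)
    if nxt in obstacles:
        return nxt, -10.0, True                # crash ends the episode
    if nxt == goal:
        return nxt, 10.0, True
    return nxt, -1.0, False

Q = np.zeros((N, N, len(actions)))
alpha, gamma, eps, beta = 0.1, 0.95, 0.1, 5.0  # beta weights the APF penalty

for episode in range(500):
    s = (0, 0)
    for t in range(200):                       # step cap keeps episodes finite
        # APF-shaped action selection: subtract a penalty proportional to the
        # potential of the cell each action would lead to.
        scores = [Q[s[0], s[1], i] - beta * potential(next_cell(s, a))
                  for i, a in enumerate(actions)]
        i = rng.integers(len(actions)) if rng.random() < eps else int(np.argmax(scores))
        nxt, r, done = step(s, actions[i])
        target = r + (0.0 if done else gamma * np.max(Q[nxt[0], nxt[1]]))
        Q[s[0], s[1], i] += alpha * (target - Q[s[0], s[1], i])
        s = nxt
        if done:
            break

print("Greedy value at start state:", np.max(Q[0, 0]))

Here the potential only shapes which action is chosen; the Q-update itself is standard, which keeps the learned values uncontaminated by the safety layer. How the potential information is best used is exactly what the thesis compares across several variants.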

Files

Thesis_AB_July2017_v4.pdf
(pdf | 5.79 MB)
Unknown license