A. Bhowal
2 records found
1
Potential Field Methods for Safe Reinforcement Learning
Exploring Q-Learning and Potential Fields
A Reinforcement Learning (RL) agent learns about its environment through exploration. For most physical applications, such as search-and-rescue UAVs, this exploration must take place with safety in mind. Unregulated exploration, especially at the beginning of a run, can lead to fatal situations such as crashes. One approach to mitigating these risks is to use Artificial Potential Fields (APFs). Various approaches to making effective use of the potential information gathered by the agent are proposed, tested and discussed. The agent is placed in an environment-model-free setting, in which it is nonetheless provided with knowledge of its own dynamics. A gridworld simulation is developed in MATLAB to test the interoperability of APFs with Q-learning. It is shown that the safety of exploration benefits from adding this layer of information to the agent’s decision-making process. In effect, the Q-table is updated more efficiently because the agent explicitly knows about high-potential ‘dangerous’ states.
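The idea in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used for the thesis, and it is not the thesis's actual method: the grid layout, the potential function `potential`, the crash threshold in `safe_actions`, the reward values and all hyperparameters are assumptions chosen only for illustration. The sketch shows one plausible way to combine the two ingredients: the agent uses its known dynamics (`step`) to predict the next cell, reads the repulsive potential there, and restricts epsilon-greedy Q-learning to actions whose predicted cell lies below the threshold.

```python
import random

# Hypothetical 5x5 gridworld; layout and rewards are illustrative assumptions.
SIZE = 5
OBSTACLES = ((1, 1), (2, 3), (3, 1))
GOAL = (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action, size=SIZE):
    """Known agent dynamics: move one cell, clipped to the grid borders."""
    r, c = state
    dr, dc = ACTIONS[action]
    return (min(max(r + dr, 0), size - 1), min(max(c + dc, 0), size - 1))


def potential(state, obstacles=OBSTACLES, influence=2):
    """Assumed repulsive potential: 1/(d+1) within `influence` cells
    (Manhattan distance d to the nearest obstacle), zero beyond it."""
    d = min(abs(state[0] - o[0]) + abs(state[1] - o[1]) for o in obstacles)
    return 1.0 / (d + 1) if d <= influence else 0.0


def safe_actions(state, threshold=1.0):
    """Actions whose predicted next cell has potential below `threshold`.
    Falls back to the lowest-potential action if none qualifies."""
    scored = [(potential(step(state, a)), a) for a in range(len(ACTIONS))]
    safe = [a for p, a in scored if p < threshold]
    return safe or [min(scored)[1]]


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy choice over safe actions only."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to a value; missing entries count as 0.0
    crashes = 0
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(100):  # cap on episode length
            allowed = safe_actions(state)
            if rng.random() < epsilon:
                action = rng.choice(allowed)
            else:
                action = max(allowed, key=lambda a: q.get((state, a), 0.0))
            nxt = step(state, action)
            if nxt in OBSTACLES:
                reward, done = -10.0, True
                crashes += 1
            elif nxt == GOAL:
                reward, done = 10.0, True
            else:
                reward, done = -1.0, False
            best_next = 0.0 if done else max(
                q.get((nxt, a), 0.0) for a in range(len(ACTIONS)))
            old = q.get((state, action), 0.0)
            # Standard Q-learning update.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q, crashes


if __name__ == "__main__":
    q_table, crashes = train()
    print("crashes during exploration:", crashes)
```

In this layout every free cell has at least one neighbouring move that does not end on an obstacle, so the masked exploration never enters an obstacle cell. Lowering `threshold` below 1.0 would keep the agent further from obstacles at the cost of excluding more of the grid from exploration.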
Bachelor thesis
(2015)
F.T.H. Wong, V. Margos, A. Bhowal, J. Peeters Salazar, T.E.H. Noortman, J.I. Nijsse, M.J.C. Kolff, L.E. van den Ende, R.F.H. van Maris, M.P. van Hoorn, M. Voskuijl, D.M.J. Peeters, O. Stroosma