Self-corrective Apprenticeship Learning for Quadrotor Control

Abstract

The control of aircraft can be carried out by Reinforcement Learning agents; however, the difficulty of obtaining sufficient training samples often makes this approach infeasible. Demonstrations can facilitate the learning process, yet algorithms such as Apprenticeship Learning generally fail to produce a policy that outperforms the demonstrator and therefore cannot generate good policies efficiently. In this paper, a model-free learning algorithm based on Apprenticeship Learning, with Reinforcement Learning in the loop, is therefore proposed. The algorithm uses external measurements to improve on the initial demonstration, ultimately producing a policy that surpasses it. Efficiency is further improved by reusing the policies produced during the learning process. Empirical results on simulated quadrotor control show that the proposed algorithm is effective and can learn good policies even from a bad demonstration.
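The self-corrective idea in the abstract — start from the demonstration's feature expectations, run Reinforcement Learning in the loop, and replace the reference whenever an external measurement shows a learned policy has surpassed it — can be sketched as follows. This is a minimal, hypothetical sketch: the helper names (`feature_expectations`, `performance`, `rl_step`) and the specific update rule are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of a self-corrective apprenticeship-learning loop.
# A trajectory is a list of feature vectors (tuples of floats).

def feature_expectations(trajectory):
    """Average feature vector over a trajectory."""
    n = len(trajectory)
    dim = len(trajectory[0])
    return [sum(step[i] for step in trajectory) / n for i in range(dim)]

def performance(trajectory):
    """External measurement of trajectory quality (assumed here to be the
    negative total feature magnitude, e.g. a tracking-error readout)."""
    return -sum(abs(f) for step in trajectory for f in step)

def self_corrective_al(demo, rl_step, n_iters=10):
    """Apprenticeship learning with RL in the loop: the reference feature
    expectations start at the demonstration's, and are replaced whenever a
    learned policy measurably outperforms the current best."""
    mu_ref = feature_expectations(demo)
    best_perf, best_traj = performance(demo), demo
    mu_pi = [0.0] * len(mu_ref)
    for _ in range(n_iters):
        # Reward weights point from the current learner toward the reference.
        w = [r - p for r, p in zip(mu_ref, mu_pi)]
        traj = rl_step(w)                    # inner RL step (assumed given)
        mu_pi = feature_expectations(traj)
        perf = performance(traj)
        if perf > best_perf:                 # self-correction via measurement
            best_perf, best_traj = perf, traj
            mu_ref = mu_pi                   # the new policy becomes the reference
    return best_traj
```

Because the reference is overwritten by any policy that measures better, the loop is not capped at the demonstrator's performance — which is how, under these assumptions, a good policy can emerge even from a bad demonstration.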