Sample-efficient reinforcement learning for quadcopter flight control

Abstract

The combination of reinforcement learning and deep neural networks has the potential to train intelligent autonomous agents on high-dimensional sensory inputs, with applications in flight control. However, the number of samples these methods require is often too large for training on real-world interaction alone. In this work, mirror-descent guided policy search is identified as a promising algorithm for training high-dimensional policies on real-world samples. Several experiments are conducted to investigate how the use of expert demonstrations can further improve the sample efficiency of this algorithm when applied to the control of a quadcopter in simulation. It is shown that demonstrations, combined with certain alterations to the mirror-descent guided policy search algorithm, can significantly reduce the number of samples needed to achieve good performance. Additionally, it is shown that these improvements are robust to sub-optimal demonstrations.
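To make the named algorithm concrete: mirror-descent guided policy search alternates between improving local trajectory controllers under a constraint that keeps them close to the current global policy (the C-step), and supervised regression of the global policy onto samples from those controllers (the S-step). The following is a minimal toy sketch of that outer loop, initialized from a demonstration-like controller; the 1-D linear dynamics, the gain updates, and the squared-difference regularizer standing in for the KL constraint are all illustrative assumptions, not the thesis implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout(controller, horizon=20):
    """Roll out a linear controller u = K x + k on toy 1-D dynamics."""
    K, k = controller
    xs, us = [], []
    x = np.array([1.0])
    for _ in range(horizon):
        u = K @ x + k + 0.01 * rng.standard_normal(1)
        x = 0.9 * x + 0.1 * u  # assumed linear dynamics, not a quadcopter model
        xs.append(x.copy())
        us.append(u.copy())
    return np.array(xs), np.array(us)

def improve_local(controller, policy, reg=0.1):
    """C-step (sketch): improve the local controller while staying close to
    the global policy; a convex blend stands in for the true KL-constrained
    trajectory optimization."""
    K, k = controller
    Kp, kp = policy
    K_new = (1 - reg) * (0.7 * K - 0.05) + reg * Kp
    k_new = (1 - reg) * (0.7 * k) + reg * kp
    return (K_new, k_new)

def fit_policy(samples):
    """S-step: supervised regression of the global policy on controller samples."""
    X = np.concatenate([xs for xs, _ in samples])
    U = np.concatenate([us for _, us in samples])
    K, *_ = np.linalg.lstsq(X, U, rcond=None)
    return (K.T, np.zeros(1))

# Initialize the local controller from a (here: synthetic) demonstration gain.
controllers = [(np.array([[-0.5]]), np.zeros(1))]
policy = (np.zeros((1, 1)), np.zeros(1))

for iteration in range(5):
    samples = [rollout(c) for c in controllers]                     # collect samples
    policy = fit_policy(samples)                                    # S-step
    controllers = [improve_local(c, policy) for c in controllers]   # C-step
```

The demonstration enters only through the initial controller; the experiments in this work study how such initialization, and variations of the C- and S-steps, affect how many samples the loop needs.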