Sample-efficient reinforcement learning for quadcopter flight control
L.T.J. Koomen (TU Delft - Aerospace Engineering)
EJ van Kampen – Mentor (TU Delft - Control & Simulation)
Abstract
The combination of reinforcement learning and deep neural networks has the potential to train intelligent autonomous agents on high-dimensional sensory inputs, with applications in flight control. However, the number of samples these methods require is often too large for training on real-world interaction. In this work, mirror descent guided policy search is identified as a promising algorithm for training high-dimensional policies on real-world samples. Several experiments are conducted to investigate how the use of expert demonstrations can further improve the sample efficiency of this algorithm when applied to the control of a quadcopter in simulation. It is shown that demonstrations, when combined with certain alterations to the mirror descent guided policy search algorithm, can significantly reduce the number of samples needed to achieve good performance. Additionally, it is shown that these improvements are robust to sub-optimal demonstrations.
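To give a flavour of the guided-policy-search alternation that mirror descent guided policy search builds on, the sketch below shows the two-step loop on a toy 1-D linear system: per-start-state local controllers are improved on rollout cost, a single global policy is fit to their state-action data by supervised regression, and the local updates are softly pulled back toward the global policy. This is an illustrative stand-in only, not the thesis's actual algorithm or quadcopter setup: the 1-D dynamics, the finite-difference local update, and the soft "pull" in place of a proper KL constraint are all assumptions made for the sake of a small runnable example.

```python
import numpy as np

# Illustrative-only sketch (assumed setup, not the thesis's implementation):
# local linear controllers u = k * x are improved per initial condition,
# a shared "global" gain is regressed onto their state-action data, and the
# local updates are softly pulled toward the global policy -- a crude
# stand-in for the KL constraint used in mirror descent guided policy search.

def rollout(k, x0, steps=20):
    """Deterministic rollout of x_{t+1} = x_t + u_t with u_t = k * x_t."""
    xs, us, cost = [], [], 0.0
    x = x0
    for _ in range(steps):
        u = k * x
        xs.append(x)
        us.append(u)
        cost += x**2 + 0.1 * u**2   # quadratic state/action cost
        x = x + u
    return np.array(xs), np.array(us), cost

def improve_local(k, k_global, x0, lr=0.05, pull=0.5):
    """One finite-difference descent step on rollout cost, then a soft
    pull toward the global gain (mirror-descent-flavoured blending)."""
    eps = 1e-4
    grad = (rollout(k + eps, x0)[2] - rollout(k - eps, x0)[2]) / (2 * eps)
    k = k - lr * float(np.clip(grad, -10.0, 10.0))
    return k + pull * (k_global - k)

x0s = [1.0, -2.0, 0.5]          # one local controller per start state
ks = [0.0] * len(x0s)
k_global = 0.0

for _ in range(50):
    # (1) improve each local controller, constrained toward the global policy
    ks = [improve_local(k, k_global, x0) for k, x0 in zip(ks, x0s)]
    # (2) supervised step: least-squares fit of the global gain to local data
    X = np.concatenate([rollout(k, x0)[0] for k, x0 in zip(ks, x0s)])
    U = np.concatenate([rollout(k, x0)[1] for k, x0 in zip(ks, x0s)])
    k_global = float(X @ U / (X @ X))

print(k_global)  # a stabilizing negative feedback gain
```

In the full algorithm the global policy is a deep network trained on samples from linear-Gaussian local controllers, and demonstrations can seed those local controllers so that far fewer real-world samples are needed before the supervised step produces a useful policy.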