Sample-efficient reinforcement learning for quadcopter flight control

Author: Koomen, Lenard (TU Delft Aerospace Engineering)
Contributor: van Kampen, Erik-jan (mentor)
Degree granting institution: Delft University of Technology
Programme: Aerospace Engineering
Date: 2020-06-25

Abstract: The combination of reinforcement learning and deep neural networks has the potential to train intelligent autonomous agents on high-dimensional sensory inputs, with applications in flight control. However, the number of samples these methods need is often too large for training through real-world interaction. In this work, mirror descent guided policy search is identified as a promising algorithm for training high-dimensional policies on real-world samples. Several experiments are conducted to investigate how expert demonstrations can further improve the sample efficiency of this algorithm when it is applied to the control of a quadcopter in simulation. It is shown that demonstrations, when combined with certain alterations to the mirror descent guided policy search algorithm, can significantly reduce the number of samples needed to achieve good performance. Additionally, it is shown that these improvements are robust to sub-optimal demonstrations.

Subject: quadcopter; flight control; reinforcement learning; guided policy search; sample efficiency; learning from demonstrations
To reference this document use: http://resolver.tudelft.nl/uuid:1f9b8378-b2b1-4dc3-98c2-d16a87e493e6
Part of collection: Student theses
Document type: master thesis
Rights: © 2020 Lenard Koomen
Files: master_thesis_LenardKoomen.pdf (PDF, 4.38 MB)