End-to-end neural network based optimal quadcopter control

Journal Article (2024)

Author(s)

R. Ferede (TU Delft - Control & Simulation)

Guido Cornelis Henricus Eugene de Croon (TU Delft - Control & Simulation)

C de Wagter (TU Delft - Control & Simulation)

Dario Izzo (European Space Agency (ESA))

Research Group

Control & Simulation

DOI related publication

https://doi.org/10.1016/j.robot.2023.104588

Keywords

Supervised learning, Optimal control, Reality gap, End-to-end control, G&CNet, Sim-to-real transfer

To reference this document use:

https://resolver.tudelft.nl/uuid:16169a19-bf6b-4681-8ecc-18a9f2bd5e0f

Publication Year

2024

Language

English

Volume number

172

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Developing optimal controllers for aggressive high-speed quadcopter flight poses significant challenges in robotics. Recent trends in the field involve utilizing neural network controllers trained through supervised or reinforcement learning. However, the sim-to-real transfer introduces a reality gap, requiring the use of robust inner-loop controllers during real flights, which limits the network's control authority and flight performance. In this paper, we investigate, for the first time, an end-to-end neural network controller that addresses the reality gap without being restricted by an inner-loop controller. The networks, referred to as G&CNets, are trained to learn an energy-optimal policy mapping the quadcopter's state to rpm commands using an optimal trajectory dataset. In hover-to-hover flights, we identified the unmodeled moments as a significant contributor to the reality gap. To mitigate this, we propose an adaptive control strategy that works by learning from optimal trajectories of a system affected by constant external pitch, roll and yaw moments. In real test flights, this model mismatch is estimated onboard and fed to the network to obtain the optimal rpm command. We demonstrate the effectiveness of our method by performing energy-optimal hover-to-hover flights with and without moment feedback. Finally, we compare the adaptive controller to a state-of-the-art differential-flatness-based controller in a consecutive waypoint flight and demonstrate the advantages of our method in terms of energy optimality and robustness.

Files

1_s2.0_S0921889023002270_main.... (pdf)

(pdf | 4.58 Mb)