Shielded reinforcement learning for flight control
G.G. Gatti (TU Delft - Aerospace Engineering)
E. van Kampen – Mentor (TU Delft - Control & Simulation)
Abstract
In-flight loss of control has consistently been identified as the main cause of airborne fatalities over the last 15 years. Recent research has focused on improving current automatic flight controllers by introducing reinforcement learning and on developing techniques to enhance in-flight safety. In this research, an offline Deep Deterministic Policy Gradient (DDPG) controller is equipped with a shield: an additional controller that monitors the flight path angle and suggests safe actions when a risky region of the state space is reached. The safe actions are proposed by the Safe Initial Policy (SIP) model, a pre-trained agent with knowledge of safe states, and enforced by the Safety Range model, a simple rule-based system. The shielded DDPG controller successfully performs a conventional step-down approach from top of descent with a normalized Mean Absolute Error of 24.0%. The controller is robust to a wide range of initial flight conditions, reference signals, biased sensor noise, and severe turbulence applied with a realistic patchy turbulence model.
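The shielding mechanism described above can be sketched as a minimal wrapper around the learned policy. The function names, the flight-path-angle bounds, and the action representation below are illustrative assumptions, not the thesis' actual implementation:

```python
# Minimal sketch of a shielded policy: the Safety Range model is a rule-based
# check on the flight path angle gamma; when the state is risky, the safe
# action from the pre-trained SIP model replaces the DDPG action.
# The numeric bounds here are hypothetical placeholders.
GAMMA_MIN, GAMMA_MAX = -0.26, 0.26  # assumed safety range for gamma (rad)


def within_safety_range(gamma: float) -> bool:
    """Rule-based Safety Range check on the flight path angle."""
    return GAMMA_MIN <= gamma <= GAMMA_MAX


def shielded_action(gamma: float, ddpg_action: float, sip_action: float) -> float:
    """Pass the DDPG action through when the state is safe; otherwise
    fall back to the safe action proposed by the SIP model."""
    if within_safety_range(gamma):
        return ddpg_action
    return sip_action
```

In this reading, the shield never alters the policy's behavior inside the safe region, so nominal tracking performance is preserved; it only intervenes near the boundary of the risky state space.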