Online Reinforcement Learning for Flight Control

An Adaptive Critic Design without prior model knowledge


Abstract

Online Reinforcement Learning is a possible solution for adaptive nonlinear flight control. In this research, an Adaptive Critic Design (ACD) based on Dual Heuristic Dynamic Programming (DHP) is developed and implemented on a simulated Cessna Citation 550 aircraft. Because the system model is approximated through online identification, the method does not depend on prior model knowledge. The agent consists of two Artificial Neural Networks (ANNs), which together form the Adaptive Critic Design, supplemented with Recursive Least Squares (RLS) online model estimation. The implemented agent is demonstrated to learn a near-optimal control policy for different operating points; it is capable of tracking pitch and roll rate while actively minimizing the sideslip angle in a faster-than-real-time simulation. Providing limited model knowledge is shown to improve the learning speed, performance, and robustness of the controller.
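As a rough illustration of the RLS online model estimation mentioned above, the sketch below identifies a discrete-time linear model x_{t+1} ≈ F x_t + G u_t from streamed state/input samples. This is a generic textbook RLS formulation, not the thesis's implementation; the class name, the forgetting factor `gamma`, and the initial covariance scale `p0` are illustrative assumptions.

```python
import numpy as np

class RLSModel:
    """Recursive Least Squares estimator for an incremental linear model
    x_{t+1} ~ F x_t + G u_t, identified online without prior model knowledge.
    (Generic sketch; names and default values are illustrative, not from
    the thesis.)"""

    def __init__(self, n_states, n_inputs, gamma=0.999, p0=1e3):
        n = n_states + n_inputs
        self.theta = np.zeros((n, n_states))  # stacked parameter estimate [F; G]^T
        self.P = p0 * np.eye(n)               # parameter covariance matrix
        self.gamma = gamma                    # forgetting factor (discounts old data)

    def update(self, x, u, x_next):
        """One RLS step given a single (state, input, next state) sample."""
        phi = np.concatenate([x, u])[:, None]            # regressor column vector
        err = x_next - (phi.T @ self.theta).ravel()      # one-step prediction error
        k = self.P @ phi / (self.gamma + phi.T @ self.P @ phi)  # RLS gain
        self.theta += k @ err[None, :]                   # parameter correction
        self.P = (self.P - k @ phi.T @ self.P) / self.gamma
        return err

# Example: recover a known stable 2-state, 1-input system from random excitation.
rls = RLSModel(n_states=2, n_inputs=1)
F = np.array([[0.9, 0.1], [0.0, 0.95]])
G = np.array([[0.1], [0.2]])
rng = np.random.default_rng(0)
x = np.zeros(2)
for _ in range(500):
    u = rng.standard_normal(1)
    x_next = F @ x + G @ u
    rls.update(x, u, x_next)
    x = x_next
```

In a DHP agent, the identified F and G (read off from `rls.theta`) would supply the state and input Jacobians that the critic and actor updates need, which is what makes the overall design model-free in the sense of requiring no a priori aircraft model.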