Distributional Reinforcement Learning for Flight Control

A risk-sensitive approach to aircraft attitude control using Distributional RL


Abstract

With the recent increase in the complexity of aerospace systems and autonomous operations, there is a growing need for adaptable, model-free controller synthesis. Such operations require the controller to maintain safety and performance without human intervention in non-static environments with partial observability and uncertainty. Deep Reinforcement Learning (DRL) algorithms have the potential to increase the safety and autonomy of aerospace control systems. It has been shown that the soft actor-critic (SAC) algorithm can achieve robust control of a CS-25 certified aircraft and has the generalization power to react to failure scenarios. However, traditional DRL approaches, such as the state-of-the-art SAC algorithm, struggle with inconsistent learning in high-dimensional tasks and fall short of modelling uncertainty and risk in the environment. In contrast, distributional RL algorithms estimate the entire probability distribution of returns, improving learning characteristics and enabling the synthesis of risk-sensitive policies. This paper demonstrates the improved learning characteristics of the distributional soft actor-critic (DSAC) compared to traditional SAC and discusses the benefits of risk-sensitive learning applied to flight control. We show that the addition of distributional critics significantly improves learning consistency and successfully approximates the uncertainty in the environment when applied to a fully coupled attitude control task of a jet aircraft.
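To illustrate why estimating the full return distribution enables risk-sensitive control, the following minimal sketch contrasts a risk-neutral value (the mean of a distributional critic's quantile estimates) with a risk-averse Conditional Value-at-Risk (CVaR) value. This is not the paper's DSAC implementation; the quantile representation, the CVaR distortion, and all function names and numbers are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: a distributional critic outputs N quantile estimates
# of the return Z(s, a) instead of a single expected value Q(s, a).
# All names and values below are illustrative, not taken from the paper.

def expected_value(quantiles: np.ndarray) -> float:
    """Risk-neutral value: the mean over all quantile estimates."""
    return float(np.mean(quantiles))

def cvar_value(quantiles: np.ndarray, alpha: float = 0.25) -> float:
    """Risk-averse value: mean of the worst alpha-fraction of returns (CVaR)."""
    sorted_q = np.sort(quantiles)                     # ascending: worst returns first
    k = max(1, int(np.ceil(alpha * len(sorted_q))))   # number of worst-case quantiles
    return float(np.mean(sorted_q[:k]))

# Two candidate actions with the same expected return but different spread.
z_safe  = np.array([ 9.0,  9.5, 10.0, 10.5, 11.0])   # narrow return distribution
z_risky = np.array([-5.0,  5.0, 10.0, 15.0, 25.0])   # wide return distribution

print(expected_value(z_safe), expected_value(z_risky))  # 10.0 and 10.0
print(cvar_value(z_safe), cvar_value(z_risky))          # 9.25 and 0.0
```

A risk-neutral policy is indifferent between the two actions, since their expected returns are identical; a risk-sensitive policy that maximizes the CVaR value prefers the low-spread action, which is the behaviour one would want from a flight controller operating under uncertainty.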