Distributional Reinforcement Learning for Flight Control

A risk-sensitive approach to aircraft attitude control using Distributional RL


Abstract

With the recent increase in the complexity of aerospace systems and autonomous operations, there is a growing need for adaptable, model-free controller synthesis. Such operations require the controller to maintain safety and performance without human intervention in non-static environments with partial observability and uncertainty. Deep Reinforcement Learning (DRL) algorithms have the potential to increase the safety and autonomy of aerospace control systems. It has been shown that the soft actor-critic (SAC) algorithm can achieve robust control of a CS-25 certified aircraft and has the generalization power to react to failure scenarios. However, traditional DRL approaches, such as the state-of-the-art SAC algorithm, struggle with inconsistent learning in high-dimensional tasks and fall short of modelling uncertainty and risk in the environment. In contrast, distributional RL algorithms estimate the entire probability distribution of returns, improving learning characteristics and enabling the synthesis of risk-sensitive policies. This paper demonstrates the improved learning characteristics of the distributional soft actor-critic (DSAC) compared to traditional SAC and discusses the benefits of risk-sensitive learning applied to flight control. We show that the addition of distributional critics significantly improves learning consistency and successfully approximates the uncertainty in the environment when applied to a fully coupled attitude control task of a jet aircraft.
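To illustrate why estimating the full return distribution enables risk-sensitive control, the following minimal sketch contrasts a risk-neutral value (the mean of a distributional critic's quantile estimates) with a risk-averse Conditional Value-at-Risk (CVaR) value. This is not the paper's DSAC implementation; the quantile representation, the CVaR distortion, and all function names and numbers are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: a distributional critic outputs N quantile estimates
# of the return Z(s, a) instead of a single expected value Q(s, a).
# All names and values below are illustrative, not taken from the paper.

def expected_value(quantiles: np.ndarray) -> float:
    """Risk-neutral value: the mean over all quantile estimates."""
    return float(np.mean(quantiles))

def cvar_value(quantiles: np.ndarray, alpha: float = 0.25) -> float:
    """Risk-averse value: mean of the worst alpha-fraction of returns (CVaR)."""
    sorted_q = np.sort(quantiles)                     # ascending: worst returns first
    k = max(1, int(np.ceil(alpha * len(sorted_q))))   # number of worst-case quantiles
    return float(np.mean(sorted_q[:k]))

# Two candidate actions with the same expected return but different spread.
z_safe  = np.array([ 9.0,  9.5, 10.0, 10.5, 11.0])   # narrow return distribution
z_risky = np.array([-5.0,  5.0, 10.0, 15.0, 25.0])   # wide return distribution

print(expected_value(z_safe), expected_value(z_risky))  # 10.0 and 10.0
print(cvar_value(z_safe), cvar_value(z_risky))          # 9.25 and 0.0
```

A risk-neutral policy is indifferent between the two actions, since their expected returns are identical; a risk-sensitive policy that maximizes the CVaR value prefers the low-spread action, which is the behaviour one would want from a flight controller operating under uncertainty.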