Explainable Reinforcement Learning in Flight Control through Reward Decomposition

Abstract

Even though Deep Reinforcement Learning (DRL) techniques have proven their ability to solve highly complex control tasks, the opaqueness and inexplicability associated with these solutions often prevents them from being applied to real flight control applications. In this research, reward decomposition explanations are used to tackle this issue and augment DRL end-user explainability. A reward decomposition-based DRL controller is deployed in a longitudinal state-space model of the Cessna Citation 500 aircraft and assessed on two attitude flight control tasks. Furthermore, a new explanation type called Dominant Reward eXplanations (DRX) is presented, which allows users to obtain more global insights than those generated by Reward Difference eXplanations (RDX). Results show that the explanations produced lead to straightforward and intuitive insights into the controller's behaviour, capable of improving end-user explainability. Moreover, a preliminary analysis indicates that the decomposed method achieves performance similar to that obtained without reward decomposition; however, training time increases considerably. To the best of the author's knowledge, this is the first application of reward decomposition explanations to the flight control domain.
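To make the RDX idea concrete, the sketch below shows the standard reward-decomposition formulation: each reward component keeps its own Q-value, and an RDX contrasts two actions component by component. This is a minimal tabular illustration under assumed names (`q_components`, `rdx`, and the two example components), not the thesis's implementation.

```python
import numpy as np

def rdx(q_components, state, a1, a2):
    """Reward Difference eXplanation: per-component Q-value difference
    between a chosen action a1 and a foil action a2 in a given state.
    Positive entries indicate components favouring a1."""
    return {name: q[state, a1] - q[state, a2]
            for name, q in q_components.items()}

# Toy example: one state, two actions, two hypothetical reward
# components (names are illustrative, not from the source).
q_components = {
    "tracking_error": np.array([[1.0, 0.2]]),
    "control_effort": np.array([[-0.5, -0.1]]),
}

delta = rdx(q_components, state=0, a1=0, a2=1)
# Summing the per-component differences recovers the total
# Q-value difference, since the component Q-values sum to Q.
total_diff = sum(delta.values())
```

Here action 0 is preferred overall because the tracking-error component outweighs the extra control effort, which is exactly the kind of contrastive insight an RDX exposes to the end user.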