T.M. Monteiro Nunes | TU Delft Repository

Using Explainable Artificial Intelligence to Improve Transparency of Reinforcement Learning for Online Adaptive Flight Control

Breaking Open the Black Box

Master thesis (2022) - J.A.J. van Zijl, E. van Kampen, T.M. Monteiro Nunes

Deep Reinforcement Learning (DRL) shows great potential for flight control because it is adaptive, fault-tolerant, and does not require an accurate system model. However, these techniques, like many machine learning applications, are considered black boxes because their inner workings are hidden. This paper aims to break open the black box of RL for adaptive flight control by applying Shapley Additive Explanations (SHAP). The generated explanations are aimed at control experts, but can be useful for anyone interested in RL for adaptive flight control. This research proposes a novel Constant Weight Segment Detection (CWSD) algorithm, which facilitates the application of eXplainable Artificial Intelligence techniques to adaptive RL. The algorithm and its usefulness are tested on an Adaptive Critic Design controlling a high-fidelity model of a Cessna Citation aircraft. It is demonstrated that SHAP in combination with CWSD provides detailed and useful insights into the relation between the input and output of the RL algorithm. Using SHAP, linear relations between input and output are discovered, which simplifies the understanding of the learned strategy. ...
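To make the idea of Shapley-based attribution concrete, the following is a minimal sketch of exact Shapley values computed by enumerating feature coalitions. It is not the thesis's implementation and does not use the SHAP library; the function names, the linear "policy", its weights, and the all-zero baseline are hypothetical, and replacing absent features with a fixed baseline is an assumption made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x.

    Features outside a coalition are replaced by their baseline value
    (an illustrative assumption; other definitions of "absent" exist).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            # Shapley weight for a coalition of size k that excludes i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi


def linear_policy(z):
    """Hypothetical linear input-output map standing in for a learned policy."""
    weights = [2.0, -1.0, 0.5]
    return sum(w * v for w, v in zip(weights, z)) + 0.3


x = [1.0, 2.0, -4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(linear_policy, x, baseline)
```

For a linear map, each attribution reduces to the weight times the feature's offset from the baseline, and the attributions sum to the difference between the output at `x` and at the baseline. The cost of exact enumeration grows exponentially with the number of inputs, so this form is only practical for small input vectors.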

Individual Prediction Modelling for Air Traffic Control using Supervised Learning

Master thesis (2022) - L.J.A. Kloosterman, M. Mulder, C. Borst, E. van Kampen, E. Mooij, T.M. Monteiro Nunes

In the future, Air Traffic Controllers are expected to work together with more advanced computer-based automation that can automatically take action. The main challenge is then how to design computer-based tools such that they foster acceptance among air traffic controllers. One possible approach to foster acceptance is to match the automated decisions and actions to individual human problem-solving styles, the so-called strategic conformance. Another approach is to make the automated tool more transparent and thus interpretable. Previous research aimed to combine these two approaches by making use of the Solution Space Diagram, a decision-support tool for Conflict Detection and Resolution, as a visual feature for a supervised machine learning method that aimed to generate individual human prediction models. Results were promising, but prediction accuracy could be significantly improved. In this study, the impact of feature engineering and a revised machine learning architecture on prediction accuracy is investigated. This is done by evaluating different feature engineering and architecture options using data generated by a simulation in which Conflict Detection and Resolution is performed. It was found that a Convolutional Neural Network can accurately predict exact resolutions using regression, and a more optimized architecture is introduced that significantly increases predictive performance. Furthermore, it is concluded that a larger solution space results in a slight increase in predictive performance, whereas the use of a color scheme with more colors does not necessarily result in higher predictive performance. ...

Ecological Approach to Increase Agent Transparency in Semi-Automated Air Traffic Control

Master thesis (2021) - S. Berning, C. Borst, M. Mulder, G. de Rooij, T.M. Monteiro Nunes

Future ATM systems will rely on automation to make operations more efficient. Creating insight into the inner workings of automation, also known as agent transparency, is expected to play an important role in effective human-machine collaboration. This research proposes an ecological approach to increase agent transparency in automated rerouting for en-route traffic. For the purpose of this study, an ecological interface for the rerouting task, developed in a previous study, was visually augmented with the constraints guiding the behavior of an experimental path-planning algorithm. This was done in two different ways: a top-down and a bottom-up approach. The top-down approach starts at the goal of the system and subsequently adds information related to the physical implications, while the bottom-up approach follows the reverse order. The design was tested in a human-in-the-loop experiment with ten participants. Results show that higher levels of transparency significantly increased actual and perceived understanding of the agent’s decisions. Furthermore, the top-down approach performed significantly better on questions related to the strategy of automation, while the bottom-up approach was found more useful for making predictions about the agent’s rationale for making certain decisions. Future research should investigate how agent and domain transparency could be combined and should test situation awareness in addition to understanding of automation. Additionally, because only static situations were investigated in this study, the effects of a dynamic work domain featuring various time-critical situations should be analyzed in future research. ...

Towards Explainable Automation for Air Traffic Control Using Deep Q-learning from Demonstrations and Reward Decomposition

Master thesis (2021) - M.C. Hermans, E. van Kampen, C. Borst, T.M. Monteiro Nunes

The current ATC system is seen as the most significant limitation to coping with an increased air traffic density. Transitioning towards an ATC system with a high degree of automation is essential to cope with future traffic demand of the airspace. In recent studies, reinforcement learning has shown promising results in automating Conflict Detection and Resolution (CD&R) in Air Traffic Control. The acceptance of automation by Air Traffic Controllers (ATCos) remains a critical limiting factor to its implementation. This work explores how Deep Q-Learning from Demonstrations (DQfD) can be used to develop automation that is transparent and conforms to the strategies applied by ATCos, in order to increase acceptance of automation. Reward decomposition (RDX) is used to monitor the learning and to understand what the agent has learned. This study focuses on two-aircraft conflicts, in which the state of the controlled and observed aircraft is represented by raw pixel data of the Solution Space Diagram. It was concluded that pre-training on demonstrations speeds up learning and can increase strategic conformance between the solutions provided by the RL agent and the demonstrator. Besides increasing conformance, results also show that DQfD can improve its policy with respect to the suboptimal demonstrations used during training. Finally, RDX has allowed the designer to examine the policy learned by the RL agent in more detail. ...
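As an illustration of the general idea behind reward decomposition, the sketch below keeps one Q-table per reward component and acts greedily on their sum, so that the contribution of each component to a decision can be inspected. This is a tabular simplification under stated assumptions, not the thesis's DQfD implementation: the component names ("safety", "efficiency"), the state and action counts, and the learning parameters are all hypothetical.

```python
def total_q(q_components, state, action):
    """Overall Q-value: the sum of the per-component Q-values."""
    return sum(q[state][action] for q in q_components.values())


def greedy_action(q_components, state, n_actions):
    """Action that maximizes the summed Q-value in the given state."""
    return max(range(n_actions),
               key=lambda a: total_q(q_components, state, a))


def decomposed_q_update(q_components, s, a, rewards, s_next, n_actions,
                        alpha=0.5, gamma=0.9):
    """One Q-learning step applied to each reward component separately.

    Every component bootstraps from the same next action, chosen greedily
    on the summed Q-value, so the components stay consistent with the
    overall policy.
    """
    a_next = greedy_action(q_components, s_next, n_actions)
    for name, q in q_components.items():
        target = rewards[name] + gamma * q[s_next][a_next]
        q[s][a] += alpha * (target - q[s][a])


# Hypothetical setup: 2 states, 2 actions, 2 reward components.
n_states, n_actions = 2, 2
q_components = {
    name: [[0.0] * n_actions for _ in range(n_states)]
    for name in ("safety", "efficiency")
}

# One transition: state 0, action 1, per-component rewards, next state 1.
decomposed_q_update(q_components, s=0, a=1,
                    rewards={"safety": -1.0, "efficiency": 0.5},
                    s_next=1, n_actions=n_actions)

# Per-component view of why action 1 in state 0 has its current value.
explanation = {name: q[0][1] for name, q in q_components.items()}
```

Reading `explanation` shows how much each reward component contributes to the value of an action, which is the kind of inspection that a decomposed value function makes possible.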