Conflict Prioritization with Multi-Agent Deep Reinforcement Learning

Cuppen, Daan

Title

Conflict Prioritization with Multi-Agent Deep Reinforcement Learning

Author

Cuppen, Daan (TU Delft Aerospace Engineering)

Contributor

Ellerbroek, J. (mentor)
Hoekstra, J.M. (mentor)
Ribeiro, M.J. (mentor)

Degree granting institution

Delft University of Technology

Programme

Aerospace Engineering

Date

2022-07-04

Abstract

To facilitate an increase in air traffic volume and to allow more flexibility in aircraft flight paths, many decentralized conflict resolution (CR) algorithms have been developed. The efficiency of such algorithms often deteriorates at high traffic densities. Several methods have tried to prioritize certain conflicts to alleviate some of the problems that arise at these densities. However, manually establishing rules for prioritizing intruders is difficult because of the complex traffic patterns that emerge in multi-actor conflicts. Reinforcement Learning (RL) has demonstrated its ability to synthesize strategies while approximating the system dynamics. This research shows how RL can be employed to improve conflict prioritization in multi-actor conflicts. We employ the Proximal Policy Optimization algorithm with an actor-critic network. The RL model selects intruders based on the local observations of an aircraft. It was trained on a limited number of conflict geometries, in which it significantly reduced the number of intrusions. A conflict prioritization strategy was then formulated based on the decisions taken by the RL model during training. We show that the efficacy of a conflict resolution algorithm that adopts a global solution, here the solution space diagram (SSD), can be improved by using this conflict prioritization strategy. Finally, these results were compared to the performance of a pairwise CR method, the Modified Voltage Potential (MVP). Even though MVP resulted in fewer intrusions than SSD with conflict prioritization, the prioritization strategy did reduce the gap between the two CR methods.

To reference this document use:

http://resolver.tudelft.nl/uuid:02c5d5a2-a203-46f9-8f75-1be017d5675e

Part of collection

Student theses

Document type

master thesis

Files

PDF

Report_Daan_Cuppen.pdf

64.07 MB
