Distributed Conflict Resolution at High Traffic Densities with Reinforcement Learning

Ribeiro, M.J.; Ellerbroek, Joost; Hoekstra, J.M.

doi:10.3390/aerospace9090472

Title

Distributed Conflict Resolution at High Traffic Densities with Reinforcement Learning

Author

Ribeiro, M.J. (TU Delft Control & Simulation)
Ellerbroek, Joost (TU Delft Control & Simulation)
Hoekstra, J.M. (TU Delft Control & Simulation)

Date

2022

Abstract

Future operations involving drones are expected to result in traffic densities that are orders of magnitude higher than any observed in manned aviation. Current geometric conflict resolution (CR) methods have proven to be very efficient at relatively moderate densities. However, at higher densities, performance is hindered by the unpredictable emergent behaviour of neighbouring aircraft. Reinforcement learning (RL) techniques are often capable of identifying emerging patterns through training in the environment. Although some work has started introducing RL to resolve conflicts and ensure separation between aircraft, it is not clear how to employ these methods with a higher number of aircraft, or whether they can match or even surpass the performance of current geometric CR methods. In this work, we employ an RL method for distributed conflict resolution; the method is completely responsible for guaranteeing minimum separation of all aircraft during operation. Two different action formulations are tested: (1) the RL method controls heading and speed variation; (2) the RL method controls heading, speed, and altitude variation. The final safety values are directly compared to those of a state-of-the-art distributed CR algorithm, the Modified Voltage Potential (MVP) method. Although, overall, the RL method is not as efficient as MVP in reducing the total number of losses of minimum separation, its actions help identify favourable patterns to avoid conflicts. The RL method has a more preventive behaviour, defending in advance against nearby neighbouring aircraft not yet in conflict, and against head-on conflicts while intruders are still far away.

Subject

air traffic control (ATC)
BlueSky ATC simulator
conflict detection and resolution (CD&R)
modified voltage potential (MVP)
reinforcement learning (RL)
self-separation
soft actor–critic (SAC)
U-space
velocity obstacles (VO)

To reference this document use:

http://resolver.tudelft.nl/uuid:b46fe4f3-aea7-46c7-8c3b-48133bb651d5

DOI

https://doi.org/10.3390/aerospace9090472

ISSN

2226-4310

Source

Aerospace, 9 (9)

Bibliographical note

https://github.com/TUDelft-CNS-ATM/bluesky

Part of collection

Institutional Repository

Document type

journal article
