Title: Distributed Conflict Resolution at High Traffic Densities with Reinforcement Learning
Author: Ribeiro, M.J. (TU Delft Control & Simulation); Ellerbroek, Joost (TU Delft Control & Simulation); Hoekstra, J.M. (TU Delft Control & Simulation)
Date: 2022
Abstract: Future operations involving drones are expected to result in traffic densities that are orders of magnitude higher than any observed in manned aviation. Current geometric conflict resolution (CR) methods have proven to be very efficient at relatively moderate densities. However, at higher densities, performance is hindered by the unpredictable emergent behaviour of neighbouring aircraft. Reinforcement learning (RL) techniques are often capable of identifying emerging patterns through training in the environment. Although some work has started introducing RL to resolve conflicts and ensure separation between aircraft, it is not clear how to employ these methods with a higher number of aircraft, or whether they can match or even surpass the performance of current geometric CR methods. In this work, we employ an RL method for distributed conflict resolution; the method is completely responsible for guaranteeing minimum separation of all aircraft during operation. Two different action formulations are tested: (1) the RL method controls heading and speed variation; (2) the RL method controls heading, speed, and altitude variation. The final safety values are directly compared to a state-of-the-art distributed CR algorithm, the Modified Voltage Potential (MVP) method. Although, overall, the RL method is not as efficient as MVP in reducing the total number of losses of minimum separation, its actions help identify favourable patterns to avoid conflicts.
The RL method behaves more preventively: it defends in advance against nearby neighbouring aircraft not yet in conflict, and against head-on conflicts while intruders are still far away.
Subject: air traffic control (ATC); BlueSky ATC simulator; conflict detection and resolution (CD&R); modified voltage potential (MVP); reinforcement learning (RL); self-separation; soft actor–critic (SAC); U-space; velocity obstacles (VO)
To reference this document use: http://resolver.tudelft.nl/uuid:b46fe4f3-aea7-46c7-8c3b-48133bb651d5
DOI: https://doi.org/10.3390/aerospace9090472
ISSN: 2226-4310
Source: Aerospace: Open Access Aeronautics and Astronautics Journal, 9 (9)
Part of collection: Institutional Repository
Document type: journal article
Rights: © 2022 M.J. Ribeiro, Joost Ellerbroek, J.M. Hoekstra
Files: aerospace_09_00472.pdf (PDF, 2 MB)