Title: Determining Optimal Conflict Avoidance Manoeuvres At High Densities With Reinforcement Learning
Author: Ribeiro, M.J. (TU Delft Control & Simulation); Ellerbroek, J. (TU Delft Control & Simulation); Hoekstra, J.M. (TU Delft Control & Operations)
Department: Control & Operations
Date: 2020

Abstract: The use of drones for applications such as package delivery, in an urban setting, would result in traffic densities that are orders of magnitude higher than any observed in manned aviation. Current geometric resolution models have proven to be very efficient at relatively moderate densities. However, at higher densities, performance is hindered by the unpredictable emergent behaviour from neighbouring aircraft. In this paper, we use a hybrid solution between existing geometric resolution approaches and reinforcement learning (RL), directed at improving conflict resolution performance at high densities. We resort to a Deep Deterministic Policy Gradient (DDPG) model to improve the behaviour of the Modified Voltage Potential (MVP) geometric conflict resolution method. By default, the MVP method generates avoidance manoeuvres of a geometrically-defined type, using a fixed look-ahead time. In the current study, we instead aim to use RL to determine the values for these variables, based on intruder position and traffic density. The analysis in this paper specifically addresses the difficulty of training algorithms in a cooperative multi-agent case to converge to optimal values. We prove that finding the right representation of state/rewards in a nonstationary environment is non-trivial and highly influences the learning process. Finally, we show that a variation of resolution manoeuvres can improve the safety of several scenarios at high traffic densities.
Subject: Conflict Detection and Resolution (CD&R); Reinforcement Learning (RL); Deep Deterministic Policy Gradient (DDPG); U-Space; Unmanned Traffic Management (UTM); Modified Voltage Potential (MVP); BlueSky ATC Simulator
To reference this document use: http://resolver.tudelft.nl/uuid:31d670ae-1799-4e2c-8f02-b4d06bde186d
Source: 10th SESAR Innovation Days
Event: 10th SESAR Innovation Days, 2020-12-07 → 2020-12-10, Virtual/online event due to COVID-19
Bibliographical note: Virtual/online event due to COVID-19
Part of collection: Institutional Repository
Document type: conference paper
Rights: © 2020 M.J. Ribeiro, J. Ellerbroek, J.M. Hoekstra
Files: SIDs_2020_paper_60red.pdf (PDF, 389.99 KB)