Lateral and Vertical Air Traffic Control Under Uncertainty Using Reinforcement Learning
Abstract
Air traffic demand has grown at an unprecedented rate over the last decade (albeit interrupted by the COVID-19 pandemic), but capacity has not kept pace. Higher levels of automation and the introduction of decision-support tools for air traffic controllers could help increase capacity and catch up with demand. The air traffic control problem can be modelled as a Markov game, in which a team of aircraft (the agents) interact in the airspace (the environment) and cooperatively take resolution actions to achieve a common goal: maintaining safe separation as efficiently as possible. As in any Markov game, the optimal policy for the team can be learnt through trial and error in a simulated environment using reinforcement learning algorithms. In this paper, we use the soft actor-critic algorithm to learn the optimal air traffic control policy. Unlike some previous works, we propose a global (i.e., shared) reward that encourages cooperative behaviour. Furthermore, we propose a versatile policy model capable of performing heading, speed, and/or altitude resolution actions. We also demonstrate that the policy is robust and can maintain safe separation under uncertainty in aircraft position, delays in implementing resolution actions, and wind. Finally, our findings suggest that there is still significant room for improvement when all three degrees of freedom (heading, speed, and altitude) are controlled at the same time.
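To make the idea of a global (shared) reward concrete, the following is a minimal sketch and not the reward used in the paper. It assumes a hypothetical aircraft state (a dict with `x`, `y` in nautical miles, `alt` in feet, and `deviation` as an arbitrary efficiency cost), illustrative separation minima, and illustrative weights. Every agent receives the same scalar, so an aircraft gains nothing by resolving its own conflict at the expense of a teammate.

```python
import math
from itertools import combinations

# Hypothetical separation minima, chosen only for illustration.
H_SEP_NM = 5.0     # horizontal minimum, nautical miles
V_SEP_FT = 1000.0  # vertical minimum, feet


def in_conflict(a, b):
    """Return True if two aircraft violate both separation minima."""
    horizontal = math.hypot(a["x"] - b["x"], a["y"] - b["y"])
    vertical = abs(a["alt"] - b["alt"])
    return horizontal < H_SEP_NM and vertical < V_SEP_FT


def global_reward(aircraft, conflict_penalty=1.0, deviation_weight=0.1):
    """Team reward: penalise every conflicting pair and total deviation.

    The weights are assumptions for this sketch, not values from the paper.
    """
    conflicts = sum(in_conflict(a, b) for a, b in combinations(aircraft, 2))
    deviation = sum(abs(a["deviation"]) for a in aircraft)
    return -conflict_penalty * conflicts - deviation_weight * deviation


# Example team: the first two aircraft are 3 NM apart at the same level.
team = [
    {"x": 0.0, "y": 0.0, "alt": 30000.0, "deviation": 0.0},
    {"x": 3.0, "y": 0.0, "alt": 30000.0, "deviation": 0.0},
    {"x": 50.0, "y": 0.0, "alt": 30000.0, "deviation": 2.0},
]

# Shared reward: each agent is given the same value.
rewards = [global_reward(team)] * len(team)
```

An individual reward would instead score each aircraft only on its own conflicts and deviation. The shared form trades a harder credit-assignment problem for an incentive that is aligned across the whole team.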