Dynamically forecasting airline departure delay probability distributions for individual flights using supervised learning
Abstract
Punctuality is a key performance indicator for any airline, especially hub-and-spoke airlines, given their focus on short passenger connections. Flights that are delayed at departure need to compensate for lost time whilst airborne. Because fuelling takes place well before scheduled departure, predicted departure delays determine the planned fuel amounts for en-route speed optimization. To prevent unnecessary fuel burn, airlines benefit from highly accurate departure delay predictions. This study aims to extend previous work on airline departure delay forecasting to a dynamic and probabilistic domain, whilst incorporating novel day-of-operations airline information to further minimize prediction errors. Random Forest, CatBoost, and Deep Neural Network models are proposed for a case study on departing flights of a major hub-and-spoke airline from its hub airport between 1 January 2020 and 1 August 2023. The Random Forest model is selected for its probabilistic performance and high accuracy in predicting delays between 5 and 25 min, for which en-route speed optimization has the largest effect. At the 90 min prediction horizon, the model reaches a Mean Absolute Error of 8.46 min and a Root Mean Square Error of 11.91 min. For 76% of flights, the actual delay falls within the predicted probability distribution range. Finally, this study puts a strong emphasis on explainability. Flight dispatchers are therefore provided with the main factors impacting the prediction, explaining the context of the flight. The versatility of the model is demonstrated in two shadow runs within the procedures of an international airline, where delays caused by familiar and unfamiliar factors were successfully predicted.
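The three evaluation measures named in the abstract (Mean Absolute Error, Root Mean Square Error, and the share of flights whose actual delay lies inside the predicted distribution range) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each flight's predicted distribution is available as a list of sampled delays (for instance, one prediction per tree of an ensemble), that the point forecast is the mean of those samples, and that "range" means the interval from the smallest to the largest sample. The function names and the toy data are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error in minutes."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error in minutes."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def range_coverage(actual, samples):
    """Fraction of flights whose actual delay lies within [min, max]
    of that flight's sampled delay predictions (assumed definition)."""
    hits = sum(min(s) <= a <= max(s) for a, s in zip(actual, samples))
    return hits / len(actual)


# Hypothetical toy data: 4 flights, 5 sampled delay predictions each (minutes).
samples = [
    [2.0, 5.0, 8.0, 10.0, 15.0],
    [0.0, 1.0, 3.0, 4.0, 7.0],
    [10.0, 12.0, 18.0, 20.0, 30.0],
    [-3.0, 0.0, 1.0, 2.0, 5.0],
]
actual = [12.0, 9.0, 14.0, 1.0]

# Point forecast per flight: mean of its samples (an assumption of this sketch).
point = [sum(s) / len(s) for s in samples]

print(f"MAE:      {mae(actual, point):.2f} min")
print(f"RMSE:     {rmse(actual, point):.2f} min")
print(f"Coverage: {range_coverage(actual, samples):.0%}")
```

On this toy data, three of the four actual delays fall inside their sampled range. A narrower interval, such as one between two chosen quantiles, would give a stricter coverage figure. The abstract does not state which definition the study uses.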