Air traffic delays have a major impact on the aviation industry, affecting airlines, passengers, and the broader ecosystem. With increasing regulatory and sustainability pressures, accurate delay predictions are critical because they allow the contingency and discretionary fuel required for a flight to be determined precisely. This research aims to develop an explainable supervised learning model that improves on existing en route delay predictions, focusing on intercontinental flights from North America to Amsterdam Schiphol Airport. Prior studies have explored flight delay prediction but leave two research gaps open: the inclusion of day-of-operations features, such as passenger information, aircraft weights, and cost index, and the use of transatlantic flight data for predictions made 90 minutes before departure. To address these gaps, two gradient-boosted models, CatBoost and LightGBM, were trained on internal airline, airport, and METAR data. Both models outperformed the airline’s current in-use statistical model, with CatBoost achieving a mean absolute error (MAE) of 3.44 minutes and a root mean squared error (RMSE) of 4.61 minutes, and LightGBM achieving an MAE of 3.43 minutes and an RMSE of 4.56 minutes. The largest performance gain over the current model was observed under adverse weather conditions. This research advances en route delay prediction by providing more accurate delay forecasts, particularly in critical weather conditions, and proposes practical improvements to support future studies on model adaptability across diverse operational contexts.
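For readers less familiar with the two error metrics reported above, the following minimal sketch shows how MAE and RMSE could be computed from paired actual and predicted delays. The function names and the sample delay values are hypothetical and chosen for illustration only; they are not taken from the study's data or code.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical en route delays in minutes (illustrative, not from the study).
actual_delay = [2.0, -1.0, 6.0, 0.0]
predicted_delay = [1.0, 1.0, 3.0, 0.0]

print(f"MAE:  {mae(actual_delay, predicted_delay):.2f} min")
print(f"RMSE: {rmse(actual_delay, predicted_delay):.2f} min")
```

Because RMSE squares each residual before averaging, it penalizes large misses more heavily than MAE, which is why the reported RMSE values are higher than the corresponding MAE values.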