Predicting Probabilistic Flight Delay for Individual Flights using Machine Learning Models

Master Thesis (2021)
Author(s)

L.S.A.B. Vorage (TU Delft - Aerospace Engineering)

Contributor(s)

M. Mitici – Mentor (TU Delft - Air Transport & Operations)

Faculty
Aerospace Engineering
Copyright
© 2021 Laurence Vorage
Publication Year
2021
Language
English
Graduation Date
23-02-2021
Awarding Institution
Delft University of Technology
Programme
Aerospace Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

To ensure continuous operations at airports, operational schedules need to cope with flights that arrive or depart early or late. Optimizing schedules for these events requires flight delay predictions. Until now, flight delay has been studied mainly from a binary classification and regression standpoint. These types of predictions lack an indication of certainty, which decision-makers need in order to optimize a schedule. This research focuses on probabilistic predictions of flight delay at the level of individual flights. Flight operation data from Amsterdam Schiphol airport and weather data from METAR are used to train four machine learning models that predict probabilistic flight delay distributions with a prediction horizon of one day. The first two models, a Dropout neural network and a Random Forest, predict empirical distributions. The other two models, a Mean Variance Estimator (MVE) and a Mixture Density Network (MDN), predict continuous distributions by predicting the parameters of an assumed distribution. The MVE predicts a single normal distribution, while the MDN predicts a mixture of multiple normal distributions. Two interval-based performance metrics are proposed, which assess the probabilistic predictions from different perspectives. It is concluded that arrival delay is best modelled with an MDN with ten components, while departure delay is best modelled with an MDN with only three components. Both MDN models outperform a baseline statistical method on all considered interval widths. The MDN models also outperform the other machine learning models by correctly predicting more early arrivals and early departures. The proposed models show that, unlike binary and regression models, they can provide certainty indications of flight delay at the level of individual flights. In the future, these certainty indications can assist airport operators in optimizing operational schedules, ensuring continuous operations and ultimately making air travel more pleasant.
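
As an illustration of how a predicted mixture of normal distributions can be turned into a certainty indication, the following minimal Python sketch computes the probability that a flight's delay falls within a given interval. It is not the thesis implementation: the function names and the mixture parameters (weights, means, and standard deviations, in minutes) are hypothetical values chosen for this example only. A single-component mixture corresponds to the single normal distribution predicted by the MVE.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def mixture_interval_probability(weights, means, stds, lower, upper):
    """Probability that the delay lies in [lower, upper] under a Gaussian mixture."""
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("mixture weights must sum to 1")
    return sum(
        w * (normal_cdf(upper, m, s) - normal_cdf(lower, m, s))
        for w, m, s in zip(weights, means, stds)
    )


# Hypothetical three-component prediction for one flight (minutes of delay).
# These numbers are illustrative assumptions, not results from the thesis.
weights = [0.6, 0.3, 0.1]
means = [-5.0, 10.0, 45.0]
stds = [5.0, 10.0, 20.0]

# Probability that the flight is within 15 minutes of its scheduled time.
p_on_time = mixture_interval_probability(weights, means, stds, -15.0, 15.0)
print(f"P(-15 <= delay <= 15 min) = {p_on_time:.3f}")
```

A scheduler could compare such interval probabilities across flights, for instance to decide how much buffer time to reserve for each one.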
