Short Term Delay Prediction in Passenger Railways

Using Machine Learning; Applied in the Dutch Rail Network

Abstract

We test the effect of several feature sets representing passenger volumes, weather conditions and train interactions when used in a gradient boosting model to predict passenger train delays 20 minutes into the future from the last registration point. We analyze the effects of the features and their combinations on prediction quality and select the best-performing feature sets. The results show that the passenger volume features, in the form defined in our work, have no predictive power and instead introduce noise into the predictions. The weather features resulted in a reduced expected delay change, with a slight positive effect on the precision of the classification task but a worsening of the recall. The largest positive effect was observed when train interaction features were introduced, despite their highly simplified form. Considering the low computational effort needed to retrieve these features, we conclude that similarly defined train interaction features have potential for application in other models.
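To make the evaluation terms above concrete, the following minimal sketch enumerates the combinations of the three feature groups and shows how precision can rise while recall falls. It is an illustration only, not the authors' pipeline: the group names, the helper functions, the binary "delayed" label and the toy labels are all assumptions made up for this example and are not results from the study.

```python
from itertools import combinations

# Hypothetical names for the three feature groups mentioned in the abstract.
FEATURE_GROUPS = ("passenger_volumes", "weather", "train_interactions")


def feature_set_combinations(groups):
    """Return every non-empty combination of feature groups, smallest first."""
    return [
        combo
        for size in range(1, len(groups) + 1)
        for combo in combinations(groups, size)
    ]


def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (assumed, not from the study): 1 = "delayed", 0 = "not delayed".
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# A model that flags delays liberally: more hits, but also false alarms.
y_pred_liberal = [1, 1, 1, 0, 1, 1, 0, 0]
# A more cautious model: no false alarms, but it misses more real delays.
y_pred_cautious = [1, 1, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    for combo in feature_set_combinations(FEATURE_GROUPS):
        print("candidate feature set:", " + ".join(combo))
    print("liberal  (precision, recall):", precision_recall(y_true, y_pred_liberal))
    print("cautious (precision, recall):", precision_recall(y_true, y_pred_cautious))
```

In this toy case the cautious predictions have higher precision and lower recall than the liberal ones, which is the kind of trade-off the abstract reports for the weather features. In a real experiment each candidate feature set would be used to train and score a model rather than being listed.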