Effectiveness of trip planner data in predicting short-term bus ridership

Conference Paper (2022)

Author(s)

Z. Wang (TU Delft - Transport and Planning)

A.J. Pel (TU Delft - Transport and Planning)

Trivik Verma (TU Delft - Policy Analysis)

P.K. Krishnakumari (TU Delft - Transport and Planning)

Peter van Brakel (REISinformatiegroep)

Niels Oort (TU Delft - Transport and Planning)

Research Group

Transport and Planning

Keywords

Machine Learning Public Transport Trip Planner Bus Ridership Prediction

To reference this document use:

https://resolver.tudelft.nl/uuid:9c3e1d1e-9a5b-47a3-b5f2-21bf4b3af6eb

Publication Year

2022

Language

English

Pages (from-to)

1-24

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Predictions of public transport ridership are beneficial because they allow for sufficient and cost-efficient deployment of vehicles. At an operational level, this calls for short-term predictions with lead times of less than an hour. Whereas conventional data sources on ridership, such as Automatic Fare Collection (AFC) data, may have longer lag times, trip planner data is often available in (near) real time. This paper analyzes how such data from a trip planner app can be used for short-term bus ridership predictions. The trip planner data is combined with AFC data (in this case smart card data) to construct a ground truth for actual ridership. The trip planner data is studied using correlation analysis to select informative variables, which are then used to develop four supervised machine learning models (linear, k-nearest neighbors, random forest, and gradient boosting decision tree). The best-performing model relies on random forest regression and reduces the error by approximately half compared to a baseline model based on the weekly trend. We show that this model performance is maintained for prediction lead times of up to 30 minutes, and for different periods of the day.
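
As a rough illustration of two steps mentioned in the abstract, the Python sketch below computes a Pearson correlation between a trip planner variable and observed ridership, and the error of a weekly-trend baseline. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the reading of the "weekly trend" baseline as "same time slot one week earlier", the use of mean absolute error as the metric, and all of the numbers, which are synthetic.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def weekly_baseline(ridership, slots_per_week):
    """Assumed baseline: predict each slot as the value one week earlier.

    The returned list lines up with ridership[slots_per_week:].
    """
    return ridership[:-slots_per_week]


def mae(actual, predicted):
    """Mean absolute error between observed and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Synthetic toy data (not from the paper): two "weeks" of three time slots.
ridership = [10, 12, 8, 11, 13, 9]           # stand-in for AFC boardings
planner_requests = [20, 24, 16, 22, 26, 18]  # stand-in for trip planner requests
slots_per_week = 3

# Step 1: how strongly does the candidate variable track ridership?
r = pearson(planner_requests, ridership)

# Step 2: error of the assumed weekly-trend baseline on the second week.
actual = ridership[slots_per_week:]
baseline = weekly_baseline(ridership, slots_per_week)
baseline_error = mae(actual, baseline)

print(f"correlation: {r:.3f}")
print(f"baseline MAE: {baseline_error}")
```

A fitted model would be scored with the same error function on the same slots, so that its error can be compared directly against the baseline value.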

Files

CASPT_2021_paper_6.pdf

(pdf | 2.71 MB)

- Embargo expired on 01-07-2023

License info not available