Estimate the limit of predictability in short-term traffic forecasting

An entropy-based approach

Journal Article (2022)

Author(s)

G. Li (TU Delft - Transport and Planning)

V.L. Knoop (TU Delft - Transport and Planning)

J.W.C. van Lint (TU Delft - Transport and Planning)

Transport and Planning

DOI related publication

https://doi.org/10.1016/j.trc.2022.103607

Traffic forecasting, Information theory, Conditional differential entropy, Predictability analysis

To reference this document use:

https://resolver.tudelft.nl/uuid:68be5168-c72d-4e64-815f-7672923917bf

Publication Year

2022

Language

English

Volume number

138

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Accurate short-term traffic forecasting is a cornerstone of Intelligent Transportation Systems. Over the past several decades, many models have been proposed to improve predictive accuracy. A key but unresolved question is whether there is a theoretical bound on the accuracy with which traffic can be predicted, and whether that limit can be estimated directly from data. To answer this question, we use core concepts from information theory to derive the limit of predictability in short-term traffic forecasting. Theoretical analysis proves that conditional differential entropy is a rigorous lower bound on the negative log-likelihood (NLL) of probabilistic models, and that the continuous form of Fano's theorem gives a loose lower bound on the mean squared error (MSE) of deterministic models. Based on the special properties of traffic dynamics, two assumptions are made when estimating the entropy metrics: cyclostationarity (traffic phenomena show strong periodicity) and localized spatial correlation (due to kinematic wave propagation). These assumptions allow the limit of predictability to be formulated as a function of longitudinal space and time of day, which identifies the most uncertain locations and periods solely from data. Experiments on univariate traffic accumulation forecasting and network-level speed forecasting show that the selected models, including some state-of-the-art deep learning models, do not outperform the estimated lower bounds but only approach them. The limit of predictability depends on time of day, network location, observation range, and prediction horizon. The results reveal that the stochastic nature of traffic dynamics and improper assumptions about the prior distribution of the output are two major factors that restrict predictive performance. In summary, the proposed method estimates a trustworthy performance boundary for most traffic forecasting models. These conclusions are helpful for further studies in this domain.
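The two bounds named in the abstract can be illustrated on a toy problem. The sketch below is not the authors' estimator and uses no traffic data. It assumes a jointly Gaussian pair (X, Y), for which the conditional differential entropy h(Y|X) has a closed form, and it assumes the standard estimation counterpart of Fano's inequality, MSE >= exp(2 h(Y|X)) / (2 pi e), with entropy in nats. All function names and parameter values are hypothetical.

```python
import math
import random


def gaussian_conditional_entropy(sigma_y, rho):
    """h(Y|X) in nats for jointly Gaussian (X, Y) with correlation rho."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma_y**2 * (1 - rho**2))


def mse_lower_bound(h_cond):
    """MSE bound implied by the conditional entropy: exp(2h) / (2*pi*e)."""
    return math.exp(2 * h_cond) / (2 * math.pi * math.e)


def gaussian_nll(residual, var):
    """Negative log-likelihood of one residual under a zero-mean Gaussian."""
    return 0.5 * math.log(2 * math.pi * var) + residual**2 / (2 * var)


def simulate(sigma_y=2.0, rho=0.8, n=100_000, seed=0):
    """Return (MSE, NLL of the true model, NLL of a misspecified model)."""
    rng = random.Random(seed)
    var = sigma_y**2 * (1 - rho**2)  # true conditional variance of Y given X
    sq_err = nll_true = nll_bad = 0.0
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = sigma_y * (rho * x + math.sqrt(1 - rho**2) * rng.gauss(0, 1))
        residual = y - sigma_y * rho * x  # error of the predictor E[Y|X]
        sq_err += residual**2
        nll_true += gaussian_nll(residual, var)
        # Assumed misspecification: the model overstates the variance 4x.
        nll_bad += gaussian_nll(residual, 4 * var)
    return sq_err / n, nll_true / n, nll_bad / n


if __name__ == "__main__":
    h = gaussian_conditional_entropy(2.0, 0.8)
    mse, nll_true, nll_bad = simulate()
    print(f"h(Y|X) = {h:.4f} nats, MSE bound = {mse_lower_bound(h):.4f}")
    print(f"empirical MSE = {mse:.4f}")
    print(f"NLL (true model) = {nll_true:.4f}, NLL (misspecified) = {nll_bad:.4f}")
```

In this Gaussian toy case the MSE bound is attained by the conditional mean, so the empirical MSE should sit close to the bound; for non-Gaussian data the bound is generally loose, which matches the abstract's wording. The misspecified model's NLL stays above h(Y|X), a small-scale analogue of the abstract's point that improper distributional assumptions restrict predictive performance.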

Files

1_s2.0_S0968090X22000535_main.... (pdf)

(pdf | 2.3 Mb)