Assessing Methods for Handling Missing Data Using an LSTM Deep Learning Model in Traffic Forecasting

Bachelor Thesis (2023)

Author(s)

W.W. Büthker (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

E. Congeduti – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

G. Iosifidis – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

LSTM, Traffic forecasting, Missing data

To reference this document use

https://resolver.tudelft.nl/uuid:bbff479b-5653-4406-b4b9-b66591cb410d

Publication Year

2023

Language

English

Graduation Date

28-06-2023

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Collections

thesis

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Due to the increasing popularity of various types of sensors in traffic management, it has become significantly easier to collect data on traffic flow. However, the integrity of these data sets is often compromised by missing values resulting from sensor failures, communication errors, and other malfunctions. This study investigates the effect of missing data on the performance of Long Short-Term Memory (LSTM) models in traffic flow prediction and assesses strategies for handling these missing values. Values are deliberately removed from a complete data set, and three handling strategies are evaluated: dropping null values, replacing them with zero, and linear interpolation. We show that LSTM models are surprisingly resilient to missing data, with little impact on prediction accuracy when up to 40% of the data is missing, irrespective of the strategy used. For higher proportions of missing data, dropping null values leads to significant performance degradation, while zero-filling and interpolation maintain predictive accuracy. This paper provides insights into the choice of missing data handling strategies in time-series prediction tasks, demonstrating the potential of LSTM models for traffic forecasting under less-than-ideal data conditions.
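The three handling strategies named in the abstract can be illustrated on a toy traffic-flow series. The following is a minimal sketch in plain Python, not the thesis implementation: the function names, the uniformly random removal of values, the fixed seed, and the treatment of gaps at the start or end of the series (copying the nearest observed value) are all assumptions made for illustration.

```python
import random
from typing import List, Optional

Series = List[Optional[float]]  # None marks a missing reading


def mask_at_random(values: List[float], fraction: float, seed: int = 0) -> Series:
    """Return a copy with round(fraction * len(values)) entries set to None.

    Assumption: values are removed uniformly at random; the abstract does
    not state the removal pattern used in the thesis.
    """
    rng = random.Random(seed)
    n_missing = round(fraction * len(values))
    missing = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing else v for i, v in enumerate(values)]


def drop_missing(series: Series) -> List[float]:
    """Strategy 1: drop null values (the series becomes shorter)."""
    return [v for v in series if v is not None]


def zero_fill(series: Series) -> List[float]:
    """Strategy 2: replace null values with zero."""
    return [0.0 if v is None else v for v in series]


def interpolate_linear(series: Series) -> List[float]:
    """Strategy 3: fill gaps on a straight line between observed neighbours.

    Assumption: leading and trailing gaps copy the nearest observed value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    # Gaps before the first and after the last observation.
    for i in range(known[0]):
        filled[i] = series[known[0]]
    for i in range(known[-1] + 1, len(series)):
        filled[i] = series[known[-1]]
    # Interior gaps between each pair of consecutive observations.
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


flow: Series = [10.0, None, None, 40.0, None, 60.0]
print(drop_missing(flow))        # [10.0, 40.0, 60.0]
print(zero_fill(flow))           # [10.0, 0.0, 0.0, 40.0, 0.0, 60.0]
print(interpolate_linear(flow))  # [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
```

Dropping is the only strategy that shortens the series and so changes the spacing between consecutive readings, whereas zero-filling and interpolation preserve its length.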

Files

CSE3000_Traffic_Forecasting_Fi... (pdf)

(pdf | 0.531 Mb)

License info not available