Graph Neural Networks Training Set Analysis
Effect of Training Data Size
A.V. Păcurar (TU Delft - Electrical Engineering, Mathematics and Computer Science)
E. Congeduti – Mentor (TU Delft - Computer Science & Engineering-Teaching Team)
E.A. Markatou – Graduation committee member (TU Delft - Cyber Security)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
With the rapid rise in popularity of graph neural networks (GNNs) for traffic forecasting, understanding the inner workings of these complex models becomes increasingly important. This experiment aims to deepen our understanding of how the training data affects the ability of GNNs to accurately predict traffic. By repeatedly training the same GNN model on training datasets spanning different time frames and comparing standard performance metrics computed on the model's predictions, this paper concludes that while using less training data leads to a slight decrease in performance, the effect depends heavily on the quality of the dataset. If the data-gathering period is short and the sensors are not properly maintained, GNNs cannot accurately predict traffic. Conversely, if data gathering goes well and there are few missing values, GNNs perform well even when trained on smaller amounts of historical data.
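The experimental setup described in the abstract can be sketched as a simple loop: hold the test split fixed, train on progressively shorter historical windows, and compare standard error metrics. The sketch below is purely illustrative and is not the paper's actual pipeline; a trivial mean predictor stands in for the GNN, and all names (`evaluate_windows`, `mae`, `rmse`) are assumptions of this sketch.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error, one of the standard metrics mentioned above."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def evaluate_windows(series, window_lengths, test_size=100):
    """For each training-window length, fit a stand-in predictor on that
    window and score it on the same held-out tail of the series.
    A real experiment would train the GNN here instead."""
    results = {}
    test = series[-test_size:]  # fixed test split shared by all runs
    for w in window_lengths:
        train = series[-test_size - w:-test_size]
        # Trivial "model": predict the training-window mean everywhere.
        pred = np.full_like(test, train.mean())
        results[w] = {"mae": mae(test, pred), "rmse": rmse(test, pred)}
    return results
```

Comparing `results` across window lengths then shows how much performance degrades as the amount of historical training data shrinks, which is the core comparison the study performs.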