Impact of Pre-training on Deep Reinforcement Learning Ramp Metering Systems

Conference Paper (2025)
Author(s)

Callum Evans (TU Delft - Traffic Systems Engineering)

Marco Rinaldi (TU Delft - Traffic Systems Engineering)

Henk Taale (TU Delft - Traffic Systems Engineering)

Serge Hoogendoorn (TU Delft - Traffic Systems Engineering)

Research Group
Traffic Systems Engineering
Publication Year
2025
Language
English
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Pre-training is a process used to enhance the learning of deep reinforcement learning (RL) algorithms through initial guidance from an expert demonstrator. A neural network is first trained to replicate the outputs of the selected expert, after which the RL agent is allowed to specialise and develop its own policy. This paper presents a study analysing the impact of pre-training on deep RL algorithms used in ramp metering. Specifically, behaviour cloning is performed for increasing lengths of time (0-10,000 epochs), with ALINEA as the expert algorithm guiding a proposed Proximal Policy Optimisation (PPO)-based system. The results confirm that, for the same total training duration, some initial guidance through pre-training can significantly improve the system’s effectiveness in reducing congestion compared to no pre-training. Conversely, excessive pre-training may lead to overfitting and reduced generalisability. Design issues resulting in weak model convergence, however, limit the algorithm’s overall performance in the chosen scenario.
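The behaviour-cloning setup described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: ALINEA's standard feedback law r(k) = r(k-1) + K_R(ô − o(k)) generates demonstration data, and a policy is fitted to mimic it. The gain K_R = 70 veh/h and target occupancy ô = 0.12 are illustrative values, and a closed-form linear fit stands in for the epochs of neural-network training used in the study.

```python
import numpy as np

# ALINEA expert controller: the metering rate is updated by feedback on
# the deviation of measured occupancy o(k) from a target occupancy o_hat.
# K_R = 70 veh/h and o_hat = 0.12 are illustrative values, not the paper's.
def alinea(r_prev, occupancy, o_hat=0.12, K_R=70.0):
    return r_prev + K_R * (o_hat - occupancy)

# Generate synthetic expert demonstrations: states are (occupancy,
# previous metering rate), targets are ALINEA's chosen rates.
rng = np.random.default_rng(0)
occ = rng.uniform(0.05, 0.25, size=(1000, 1))      # mainline occupancy
r_prev = rng.uniform(200.0, 1800.0, size=(1000, 1))  # previous rate [veh/h]
X = np.hstack([occ, r_prev])
y = alinea(r_prev, occ)

# Behaviour cloning: fit a policy to reproduce the expert's outputs.
# Here a linear model solved by least squares replaces the neural
# network; the cloned weights would then initialise the PPO policy.
Xb = np.hstack([X, np.ones((len(X), 1))])  # add bias column
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# Because ALINEA is linear in the state, the clone matches it closely.
pred = Xb @ w
mse = float(np.mean((pred - y) ** 2))
```

In the study itself, the cloning loss is minimised by gradient descent for a controlled number of epochs, which is precisely the knob (0-10,000 epochs) whose effect on the downstream PPO agent is being measured.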

Files

TRBAM-25-01884_IN01_0927202410... (pdf | 0 Mb)
- Embargo expired in 09-07-2025
Taverne