Advancing Deep Reinforcement Learning for Real-World Traffic Signal Control

Addressing Sampling Challenges and Multi-Modal Traffic Dynamics

Master Thesis (2024)

Author(s)

K.F. Ceton (TU Delft - Mechanical Engineering)

Contributor(s)

S. Grammatico – Mentor (TU Delft - Mechanical Engineering)

Tijs van Bakel – Mentor (Technolution)

G. Pantazis – Mentor (TU Delft - Mechanical Engineering)

A. Dabiri – Graduation committee member (TU Delft - Mechanical Engineering)

Faculty

Mechanical Engineering

Keywords

Proximal Policy Optimization; Deep Reinforcement Learning; Traffic Signal Control; Invalid Action Masking; Multi-Modal Traffic Scenarios; Four-Legged Intersection Traffic Management; Traffic Volume Sensitivity Analysis; Disturbance Rejection in Traffic Systems; Integration of Vehicle and Pedestrian Traffic

To reference this document use

https://resolver.tudelft.nl/uuid:80a3f55a-4886-40a8-a096-4966c9a58c13

Publication Year

2024

Language

English

Graduation Date

09-12-2024

Awarding Institution

Delft University of Technology

Programme

Mechanical Engineering

Downloads counter

209

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Deep Reinforcement Learning (DRL) is a promising approach to Traffic Signal Control (TSC). However, significant challenges remain in translating this potential into real-world traffic management solutions. This thesis investigates obstacles hindering the application of DRL in real-world TSC, focusing on low sampling frequencies and the complexities of multi-modal traffic scenarios.

We developed a high-frequency sampling Proximal Policy Optimization (PPO) approach for TSC at a four-legged intersection, integrating both vehicle and pedestrian traffic in a multi-modal setting. Invalid Action Masking (IAM) is used to enforce signal timing constraints across these modalities. The framework was evaluated through traffic volume sensitivity analyses, assessments of generalization capabilities, disturbance rejection tests, and comparisons of methods for handling invalid actions.
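To make the idea of Invalid Action Masking concrete, the following is a minimal sketch of one common way to apply it to a discrete policy: the logits of invalid actions are set to negative infinity before the softmax, so those actions receive exactly zero probability. The function name `masked_action_probs`, the four-phase action space, and the example mask are assumptions made for illustration; they are not taken from the thesis, whose implementation may differ.

```python
import numpy as np


def masked_action_probs(logits, valid_mask):
    """Softmax over logits with invalid actions forced to probability zero.

    Hypothetical helper for illustration only.
    """
    logits = np.asarray(logits, dtype=float)
    valid_mask = np.asarray(valid_mask, dtype=bool)
    if not valid_mask.any():
        raise ValueError("at least one action must be valid")
    # Replace invalid logits with -inf so that exp() maps them to exactly 0.
    masked = np.where(valid_mask, logits, -np.inf)
    # Subtract the largest valid logit for numerical stability.
    weights = np.exp(masked - masked.max())
    return weights / weights.sum()


# Assumed example: four signal phases, with phase 2 currently not allowed
# (for instance because a timing constraint has not yet been satisfied).
probs = masked_action_probs([1.0, 0.5, 2.0, -0.3], [True, True, False, True])
```

Because the masked action has zero probability, it can never be sampled, and the remaining probabilities are renormalized over the valid actions only.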

The results indicate that very short sampling intervals, such as 1 second, do not improve performance in terms of time-loss; 4 to 6 seconds was identified as the optimal range for PPO-based TSC at a four-legged intersection. The findings also demonstrate that IAM can be incorporated without compromising performance. In tests with sudden spikes in traffic volume, PPO outperformed baseline methods such as max-pressure and fixed-time strategies in both overshoot and settling time. Finally, the results show that PPO can prioritize either the vehicle or the pedestrian modality, with a clear proportional decrease in time-loss for the prioritized modality.
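As an illustration of the two disturbance rejection metrics named above, the sketch below computes overshoot and settling time from a time series, using the usual control-theoretic conventions: overshoot is the peak excess over the steady-state value, and settling time is the time after which the signal stays within a tolerance band around it. These definitions, the 5% band, the function name, and the sample values are assumptions for illustration; the thesis may define the metrics differently, and the numbers are synthetic, not thesis results.

```python
import numpy as np


def overshoot_and_settling_time(signal, steady_state, dt=1.0, band=0.05):
    """Return (overshoot, settling_time) for a response recorded after a disturbance.

    Hypothetical helper: settling_time is None if the signal is still
    outside the tolerance band at the end of the record.
    """
    y = np.asarray(signal, dtype=float)
    # Peak excess over the steady-state level (zero if the signal never exceeds it).
    overshoot = max(float(y.max()) - steady_state, 0.0)
    # Samples lying outside the +/- band around the steady-state level.
    outside = np.abs(y - steady_state) > band * abs(steady_state)
    if not outside.any():
        return overshoot, 0.0
    last_outside = int(np.nonzero(outside)[0][-1])
    if last_outside + 1 >= len(y):
        return overshoot, None
    # The signal is considered settled from the first sample after the last excursion.
    return overshoot, (last_outside + 1) * dt


# Synthetic time-loss samples (seconds), one every 5 seconds after a demand spike.
time_loss = [10.0, 18.0, 25.0, 16.0, 11.0, 10.2, 10.1, 10.0]
overshoot, settling_time = overshoot_and_settling_time(time_loss, steady_state=10.0, dt=5.0)
```

Under these assumed definitions, a controller with a lower overshoot and a shorter settling time recovers from the traffic spike with a smaller peak delay and in less time.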

Files

THESIS_REPORT_KOEN_CETON.pdf

(pdf | 15.2 Mb)

- Embargo expired on 31-01-2025

License info not available