Deep Reinforcement Learning (DRL) is a promising approach to Traffic Signal Control (TSC). However, significant challenges remain in translating this potential into real-world traffic management solutions. This thesis investigates obstacles hindering the application of DRL in real-world TSC, focusing on low sampling frequencies and the complexities of multi-modal traffic scenarios.
We developed a high-frequency sampling Proximal Policy Optimization (PPO) approach for TSC at a four-legged intersection, integrating both vehicle and pedestrian traffic in a multi-modal setting. By employing Invalid Action Masking (IAM), we effectively handle signal timing constraints across these modalities. The framework was evaluated through traffic volume sensitivity analyses, assessments of generalization capabilities, disturbance rejection tests, and comparisons of methods for handling invalid actions.
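The core idea behind Invalid Action Masking is to zero out the probability of actions that would violate signal timing constraints (e.g., a minimum-green time) before the policy samples an action. The sketch below illustrates this mechanism in a minimal, framework-agnostic form; the four-phase setup and the specific constraint are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def masked_action_probs(logits, valid_mask):
    """Apply an invalid-action mask to policy logits.

    Logits of invalid actions are set to -inf, so after the softmax
    their probability is exactly zero and they can never be sampled.
    """
    masked = np.where(valid_mask, logits, -np.inf)
    # Numerically stable softmax over the masked logits
    exp = np.exp(masked - masked[valid_mask].max())
    return exp / exp.sum()

# Illustrative example: 4 signal phases; phase 2 is currently invalid
# (e.g., switching to it would violate a minimum-green constraint).
logits = np.array([0.5, 1.2, 2.0, -0.3])   # raw policy outputs
mask = np.array([True, True, False, True])  # validity of each phase

probs = masked_action_probs(logits, mask)   # probs[2] == 0.0
```

In a PPO implementation, the same mask is applied both when sampling actions during rollout and when computing the log-probabilities used in the policy-gradient update, so the gradient never pushes probability mass toward constrained actions.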
The results indicate that short sampling intervals, such as 1 second, do not improve performance in terms of time-loss; intervals of 4 to 6 seconds were identified as the optimal range for PPO-based TSC at a four-legged intersection. The findings also demonstrate that IAM can be incorporated without compromising performance. When handling sudden spikes in traffic volume, PPO outperformed baseline methods such as max-pressure and fixed-time strategies in terms of both overshoot and settling time. Finally, the results show that PPO can effectively prioritize either the vehicle or the pedestrian modality, yielding a clear proportional decrease in time-loss for the prioritized modality.