A Novel Framework Combining MPC and Deep Reinforcement Learning With Application to Freeway Traffic Control

Journal Article (2024)

Author(s)

Dingshan Sun (TU Delft - Transport and Planning)

Anahita Jamshidnejad (TU Delft - Control & Simulation)

Bart De Schutter (TU Delft - Delft Center for Systems and Control)

Research Group

Team Bart De Schutter

DOI related publication

https://doi.org/10.1109/TITS.2023.3342651

To reference this document use:

https://resolver.tudelft.nl/uuid:0b707e53-5b62-4be1-8eb8-eaefa30fd164

Publication Year

2024

Language

English

Issue number

7

Volume number

25

Pages (from-to)

6756-6769

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Model predictive control (MPC) and deep reinforcement learning (DRL) have been developed extensively as two independent techniques for traffic management. Although the strengths of MPC and DRL complement each other well, few studies have considered combining the two methods for freeway traffic control. This paper proposes a novel framework for integrating MPC and DRL for freeway traffic control that differs from existing MPC-(D)RL methods. Specifically, the proposed framework adopts a hierarchical structure in which an efficient high-level MPC component operates at a low frequency to provide a baseline control input, while the DRL component operates at a high frequency to modify the MPC output online. The control framework therefore needs only limited online computational resources and, after proper learning with sufficient training data, is able to handle uncertainties and external disturbances. The proposed framework is implemented on a benchmark freeway network to coordinate ramp metering and variable speed limits, and its performance is compared with standard MPC and DRL approaches. The simulation results show that the proposed framework outperforms standalone MPC and DRL methods in terms of total time spent (TTS) and constraint satisfaction, despite model uncertainties and external disturbances.
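
The two-rate structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `run_hierarchical_control`, its parameters, the additive form of the DRL modification, and the clipping to actuator bounds are all assumptions made for illustration. The abstract states only that MPC supplies a baseline at a low frequency and that DRL modifies it at a high frequency.

```python
from typing import Callable, List

Vector = List[float]


def run_hierarchical_control(
    x0: Vector,
    n_steps: int,
    mpc_period: int,
    mpc_baseline: Callable[[Vector], Vector],
    drl_correction: Callable[[Vector, Vector], Vector],
    plant_step: Callable[[Vector, Vector], Vector],
    u_min: float,
    u_max: float,
) -> List[Vector]:
    """Simulate a two-rate loop and return the control input applied at each step.

    All callables are hypothetical stand-ins:
      mpc_baseline(x)       -> baseline input, recomputed every `mpc_period` steps
      drl_correction(x, u)  -> correction from a trained policy, computed every step
      plant_step(x, u)      -> next state of the traffic model
    """
    x = list(x0)
    u_base: Vector = []
    applied: List[Vector] = []
    for k in range(n_steps):
        # Low-frequency layer: refresh the MPC baseline only every mpc_period steps.
        if k % mpc_period == 0:
            u_base = mpc_baseline(x)
        # High-frequency layer: the DRL policy adjusts the held baseline each step.
        delta = drl_correction(x, u_base)
        # Assumed combination rule: additive correction, clipped to actuator bounds.
        u = [min(max(b + d, u_min), u_max) for b, d in zip(u_base, delta)]
        applied.append(u)
        x = plant_step(x, u)
    return applied
```

In this sketch the expensive optimization runs once per `mpc_period` steps, while the per-step work is a single policy evaluation, which is consistent with the limited online computation noted in the abstract.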
