Combining MPC and reinforcement learning in a model-reference framework for urban traffic signal control

Master Thesis (2022)

Author(s)

W.J. Remmerswaal (TU Delft - Mechanical Engineering)

Contributor(s)

B.H.K. de Schutter – Graduation committee member (TU Delft - Team Bart De Schutter)

Dingshan Sun – Mentor (TU Delft - Team Bart De Schutter)

Faculty

Mechanical Engineering

Keywords

Reinforcement Learning, Model Predictive Control, Model-reference adaptive control, Urban Traffic Control

To reference this document use:

https://resolver.tudelft.nl/uuid:791a060f-77b9-4110-a8c4-dffd6fb5d51e

Publication Year

2022

Language

English

Graduation Date

26-01-2022

Awarding Institution

Delft University of Technology

Programme

Mechanical Engineering | Systems and Control

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Both model predictive control (MPC) and reinforcement learning (RL) have shown promising results in the control of traffic signals in urban traffic networks. Both approaches, however, have drawbacks. MPC controllers are not adaptive and therefore perform suboptimally in the presence of the uncertainties that always occur in urban traffic systems. Although very advanced prediction models for urban traffic signal control exist, they come at a price: the computational complexity of an MPC controller increases with the accuracy of its model. RL techniques involve time-consuming and data-dependent offline computation, as the agent must first go through a training process. This training process is also the main reason why RL techniques have not been employed in real-world urban traffic systems: through exploration in the training phase, the controller may cause suboptimal and potentially unacceptably poor system performance. Moreover, most RL techniques offer no stability or feasibility guarantees. To mitigate these drawbacks, the model-reference RL adaptive control framework is introduced, in which RL is used to obtain an adaptive law that adjusts a stable baseline controller so that the system follows a set reference. This thesis focuses on the design and analysis of this scheme where MPC is used to obtain the baseline control input. The computed baseline control input, combined with the traffic model used, determines the reference state to be followed. In a case study, the training characteristics of the framework are compared to those of a conventional RL-based controller. In addition, the system performance of the framework is compared to that of a fixed-time controller, a conventional MPC controller, and a conventional RL-based controller. The simulation shows that the framework outperforms the RL-based controller in terms of performance during training, and outperforms the MPC controller in terms of general simulation performance.
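
As a rough illustration of the scheme described in the abstract, the Python sketch below shows one control step: a baseline input is computed, the traffic model turns it into a reference state, and an adaptive correction adjusts the applied input. The two-approach queue model, all numerical constants, the grid search standing in for MPC, and the linear placeholder for the RL-learned adaptive law are assumptions made for illustration only; none of them are taken from the thesis.

```python
import numpy as np

# All constants below are hypothetical values chosen for illustration.
SATURATION = np.array([12.0, 12.0])       # vehicles served per step at full green
NOMINAL_ARRIVALS = np.array([5.0, 4.0])   # arrivals per step assumed by the model
CANDIDATES = np.linspace(0.1, 0.9, 17)    # admissible green shares for approach 0


def traffic_model(x, u, arrivals=NOMINAL_ARRIVALS):
    """Toy queue model: u is the green share of approach 0, 1 - u of approach 1."""
    green = np.array([u, 1.0 - u])
    return np.maximum(x + arrivals - SATURATION * green, 0.0)


def baseline_input(x, horizon=3):
    """Stand-in for MPC: constant green share minimizing the predicted total queue."""
    def cost(u):
        xk, total = x, 0.0
        for _ in range(horizon):
            xk = traffic_model(xk, u)
            total += xk.sum()
        return total
    return min(CANDIDATES, key=cost)


def adaptive_correction(theta, tracking_error):
    """Placeholder for the learned adaptive law: linear in the last tracking error."""
    return float(theta @ tracking_error)


def control_step(x, theta, last_error):
    """Return the baseline input, the reference state, and the adjusted input."""
    u_b = baseline_input(x)
    x_ref = traffic_model(x, u_b)  # reference state from baseline input and model
    u = float(np.clip(u_b + adaptive_correction(theta, last_error), 0.1, 0.9))
    return u_b, x_ref, u


def tracking_reward(x_next, x_ref):
    """One possible RL reward: negative distance to the reference state."""
    return -float(np.linalg.norm(x_next - x_ref))
```

In this sketch, an RL agent would tune `theta` so that the real system, whose arrivals differ from `NOMINAL_ARRIVALS`, stays close to the reference state. With `theta` set to zero the applied input reduces to the baseline input, which is one way to picture why the baseline controller can keep performance acceptable early in training.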

Files

Thesis_report_WJR.pdf

(pdf | 16.4 Mb)

License info not available