Bus management using multi-agent reinforcement learning

Master thesis (2021)

Authors

G.R.J. Weijs Civil Engineering & Geosciences

Contributors

O. Cats Transport and Planning - (supervisor 1)

A.J. Pel Transport and Planning - (supervisor 2)

M.T.J. Spaan Algorithmics - (supervisor 2)

Peter Nijhuis (supervisor 2)

Faculty

Civil Engineering & Geosciences

Reinforcement Learning, Multi-agent reinforcement learning, Bus bunching, Bus management

More Info

To reference this document use:

http://resolver.tudelft.nl/uuid:6e6b280e-86a1-42c0-b0cf-fc38c12aec76

Published Date

28-07-2021

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Bus bunching is a problem that occurs in many high-frequency bus systems. It can be averted by several countermeasures, of which holding control is the most popular in practice. Holding control strategies are often implemented using predefined rules. In this study, multi-agent reinforcement learning (MARL) is used to develop an effective holding strategy, since it is more flexible and can adjust its strategy to the conditions of the system. The approach is tested on five bus systems with different characteristics. The first system is idealised: driving times and dwell times are deterministic, all route sections have equal length, and all stops have the same demand. In the other systems, several complexities are added: stochastic driving times, stochastic dwell times, route sections of varying length, varying demand at the stops, and traffic lights on the route. Additionally, in two systems the traffic lights are modelled as reinforcement learning agents. The results show that the MARL approach can outperform the rule-based benchmark in four out of five systems. Furthermore, the MARL approach handles the stochastic processes better than the rule-based benchmark; this difference is not observed for the varying route characteristics. Lastly, the MARL approach is able to learn effective cooperation between the bus agents and the traffic light agents, which indicates the potential of system-wide cooperation and optimisation.

Files

ThesisV2.pdf

(.pdf | 4.47 Mb)