Bus bunching is a problem that occurs in many high-frequency bus systems. It can be averted by several countermeasures, of which holding control is the most popular in practice. Holding control strategies are often implemented using predefined rules. In this study, multi-agent reinforcement learning (MARL) is used to develop an effective holding strategy, since it is more flexible and can adapt its strategy to the conditions of the system. The approach is tested on five bus systems with different characteristics. The first system is idealised, with deterministic driving times, deterministic dwell times, route sections of equal length and identical demand at all stops. In the other systems several complexities are added: stochastic driving times, stochastic dwell times, route sections of varying length, varying demand at the stops and traffic lights on the route. Additionally, in two systems the traffic lights are themselves modelled as reinforcement learning agents. The results show that the MARL approach outperforms the rule-based benchmark in four out of five systems. Furthermore, the MARL approach handles the stochastic processes better than the rule-based benchmark, whereas no such difference is observed for the varying route characteristics. Lastly, the MARL approach learns effective cooperation between the bus agents and the traffic light agents, which indicates the potential of system-wide cooperation and optimisation.