Routing Optimization for the Train Unit Shunting Problem using a Multi-Agent Deep Reinforcement Learning Framework

Master thesis (2021)

Authors

J. Trepat Borecka (Civil Engineering & Geosciences)

Contributors

R.M.P. Goverde (coach)

Nikola Bešinović (mentor)

M.Y. Maknoon, Transport and Logistics (graduation committee member)

W.J. Lee, Nederlandse Spoorwegen (mentor)

Faculty

Civil Engineering & Geosciences

To reference this document use:

http://resolver.tudelft.nl/uuid:b08fc160-934f-4000-8b09-3308adce3b6d

Published Date

26-03-2021

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In busy passenger railway networks, a large number of trains have to be moved off the mainline into shunting yards every night, where they are cleaned, maintained, sorted and parked. This problem is known as the Train Unit Shunting Problem (TUSP), a hard combinatorial optimization problem faced by railway operators. The TUSP is currently solved using human heuristics, which is difficult and time-consuming. Reinforcement learning approaches have been developed in the last few years to solve this problem more efficiently. In this research we develop, within a multi-agent deep reinforcement learning framework, a heuristic for random exploration that searches the state-action space efficiently, and two heuristics for train routing strategies that aim to improve the performance and quality of the produced route plans. On the one hand, we develop the Type-Based Routing Strategy, based on the idea of parking trains of the same rolling stock type on the same tracks. On the other hand, we develop the In-Residence Time Routing Strategy, based on parking trains ordered according to their departure time. Both routing strategies consist of four components: (1) standard parking rules, (2) combination and split rules, (3) conflict resolution rules and (4) unnecessary movement rules. The goal is to incorporate information from the logistics side of the problem into the framework. We demonstrate the performance of the developed heuristics on two real-life cases in the Dutch railway network. The experimental results demonstrate the potential of the resulting framework to produce more efficient route plans and to adapt to different problem designs, such as different matching problem types and different shunting yards.
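The two parking ideas named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the `Train` and `Track` classes, the functions `type_based_choice` and `residence_time_choice`, the single-open-end (last in, first out) track model and the tie-breaking rules are all assumptions made for illustration only. The combination and split, conflict resolution and unnecessary movement rules are not covered.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Train:
    name: str
    stock_type: str   # rolling stock type (hypothetical labels)
    departure: int    # departure time, minutes after midnight
    length: int = 1   # length in abstract units


@dataclass
class Track:
    name: str
    capacity: int
    parked: List[Train] = field(default_factory=list)

    def free(self) -> int:
        return self.capacity - sum(t.length for t in self.parked)


def type_based_choice(train: Train, tracks: List[Track]) -> Optional[Track]:
    """Assumed rule: prefer a track that holds only trains of the same type,
    then an empty track, then any track with enough free length."""
    fits = [tr for tr in tracks if tr.free() >= train.length]
    same = [tr for tr in fits
            if tr.parked and all(t.stock_type == train.stock_type for t in tr.parked)]
    if same:
        return same[0]
    empty = [tr for tr in fits if not tr.parked]
    if empty:
        return empty[0]
    return fits[0] if fits else None


def residence_time_choice(train: Train, tracks: List[Track]) -> Optional[Track]:
    """Assumed rule for tracks with one open end (last parked leaves first):
    prefer a track whose last-parked train departs no earlier than this one,
    picking the smallest departure gap; empty tracks rank last among those."""
    fits = [tr for tr in tracks if tr.free() >= train.length]
    ordered = [tr for tr in fits
               if not tr.parked or tr.parked[-1].departure >= train.departure]
    if not ordered:
        return fits[0] if fits else None  # fallback: would block a departure
    return min(
        ordered,
        key=lambda tr: (tr.parked[-1].departure - train.departure)
        if tr.parked else float("inf"),
    )


def park_all(trains: List[Train], tracks: List[Track], choose) -> dict:
    """Park trains in arrival order with the given rule; return train -> track name."""
    plan = {}
    for train in trains:
        track = choose(train, tracks)
        if track is not None:
            track.parked.append(train)
            plan[train.name] = track.name
    return plan
```

With three trains arriving in the order `t1` (typeX, departs at 360), `t2` (typeY, departs at 420) and `t3` (typeX, departs at 300) and two empty tracks, both rules in this sketch place `t3` behind `t1`: the first because the types match, the second because `t3` leaves before `t1`.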

Files

Master_Thesis_JacobTrepat.pdf

(.pdf | 9.31 Mb)