Routing Optimization for the Train Unit Shunting Problem using a Multi-Agent Deep Reinforcement Learning Framework

Abstract

In busy passenger railway networks, a large number of trains must be moved off the mainline into shunting yards every night, where they are cleaned, maintained, sorted and parked. Planning these operations is known as the Train Unit Shunting Problem (TUSP), a hard combinatorial optimization problem faced by railway operators. The TUSP is currently solved using human heuristics, a difficult and time-consuming process. In the last few years, reinforcement learning approaches have been developed to tackle this problem more efficiently. In this research we develop, within a multi-agent deep reinforcement learning framework, a heuristic for random exploration that searches the state-action space efficiently, and two heuristics for train routing strategies that aim to improve the performance and quality of the produced route plans. The first is the Type-Based Routing Strategy, based on the idea of parking trains of the same rolling stock type on the same tracks. The second is the In-Residence Time Routing Strategy, based on parking trains ordered according to their departure time. Both routing strategies consist of four components: (1) standard parking rules, (2) combination and split rules, (3) conflict resolution rules and (4) unnecessary movement rules. The goal is to incorporate information from the logistics side of the problem into the framework. We demonstrate the performance of the developed heuristics in two real-life cases from the Dutch railway network. The experimental results demonstrate the potential of the resulting framework to produce more efficient route plans and to adapt to different problem designs, such as different matching problem types and different shunting yards.
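The two parking ideas behind the routing strategies can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the `Train` and `Track` classes, the function names, the dead-end (last-in, first-out) track model, and the tie-breaking order. It covers only a simplified form of the "standard parking rules" component and none of the combination, split, conflict resolution or unnecessary movement rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Train:
    name: str
    stock_type: str   # rolling stock type label (hypothetical values)
    departure: int    # departure time, e.g. minutes after midnight


@dataclass
class Track:
    name: str
    capacity: int     # number of trains that fit (simplification of track length)
    # Assumed dead-end track: index 0 is deepest, the last element is nearest the exit.
    parked: List[Train] = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.parked) < self.capacity


def type_based_choice(train: Train, tracks: List[Track]) -> Optional[Track]:
    """Prefer a track holding only trains of the same type, then an empty track."""
    open_tracks = [t for t in tracks if t.has_room()]
    for t in open_tracks:
        if t.parked and all(p.stock_type == train.stock_type for p in t.parked):
            return t
    for t in open_tracks:
        if not t.parked:
            return t
    # Fall back to any track with room, or None if the yard is full.
    return open_tracks[0] if open_tracks else None


def departure_ordered_choice(train: Train, tracks: List[Track]) -> Optional[Track]:
    """Prefer a track where the new train leaves no later than the train behind it."""
    open_tracks = [t for t in tracks if t.has_room()]
    # Parking here does not block anyone: the train nearest the exit departs later or at the same time.
    non_blocking = [
        t for t in open_tracks
        if t.parked and t.parked[-1].departure >= train.departure
    ]
    if non_blocking:
        # Tightest fit: keep tracks with late-departing trains free for later arrivals.
        return min(non_blocking, key=lambda t: t.parked[-1].departure)
    empty = [t for t in open_tracks if not t.parked]
    if empty:
        return empty[0]
    return open_tracks[0] if open_tracks else None
```

With one train of a given type already parked on a track and a second track empty, an arriving train of a different type that departs earlier would be sent to the empty track by `type_based_choice` and to the occupied track by `departure_ordered_choice`, which shows how the two ideas can lead to different route plans for the same arrival.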