Reinforcement learning for train timetable rescheduling under perturbation
A general value-based approach
Pu Zhang (Beijing Jiaotong University)
Lingyun Meng (Beijing Jiaotong University)
Yongqiu Zhu (TU Delft - Transport, Mobility and Logistics)
Jianrui Miao (Beijing Jiaotong University)
Xiaojie Luan (Beijing Jiaotong University)
Zhengwen Liao (Beijing Jiaotong University)
Abstract
This paper proposes a value-based deep reinforcement learning approach capable of handling train timetable rescheduling under both disturbed and disrupted situations. A railway environment is constructed to simulate the problem as a Markov decision process, where the optimization objective is integrated into the reward module and the various constraints are incorporated into the conflict detection and avoidance module. To address the challenges of sparse rewards and a large action space with few legal actions, a value-based algorithm framework is proposed to select actions efficiently and evaluate them effectively. Through the designed simulation and training procedures, the proposed approach is tested on several disturbance and disruption cases based on a real-world instance, a Chinese high-speed railway corridor. Experimental results show that the proposed method obtains high-quality solutions within reasonable computing time and outperforms handcrafted rules in terms of solution optimality. Furthermore, the method exhibits promising generalization to homogeneous perturbation scenarios, i.e., disturbance or disruption scenarios that share the same affected location and either the same start time or the same disruption duration.
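The abstract describes a value-based framework in which a conflict detection and avoidance module restricts the agent to a small set of legal rescheduling actions within a large action space. A common way to realize this in value-based reinforcement learning is to mask illegal actions before the greedy (or epsilon-greedy) selection step. The sketch below illustrates that idea only; the function name, the mask representation, and the use of NumPy are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def select_action(q_values, legal_mask, epsilon=0.0, rng=None):
    """Pick an action epsilon-greedily among legal actions only.

    q_values:   1-D array of estimated action values for one state.
    legal_mask: boolean array; True marks actions permitted by the
                (hypothetical) conflict detection and avoidance module.
    """
    rng = rng or np.random.default_rng()
    legal_idx = np.flatnonzero(legal_mask)
    if legal_idx.size == 0:
        raise ValueError("no legal action available in this state")
    if rng.random() < epsilon:
        # Explore, but only over conflict-free actions.
        return int(rng.choice(legal_idx))
    # Exploit: set illegal actions to -inf so argmax ignores them.
    masked = np.where(legal_mask, q_values, -np.inf)
    return int(np.argmax(masked))

# Example: 5 candidate dispatching actions, of which only 3 are conflict-free.
q = np.array([0.2, 1.5, -0.3, 0.9, 2.1])
mask = np.array([True, False, True, True, False])
print(select_action(q, mask))  # greedy choice among actions 0, 2, 3 -> 3
```

Masking at selection time keeps the value network's output layer fixed over the full action space while guaranteeing that only feasible timetable adjustments are ever executed, which matches the division of labor between reward module and constraint module described in the abstract.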
Files
File under embargo until 02-08-2026