Y. Zhu


27 records found

A graph neural network based deep reinforcement learning method for the train platforming and rescheduling problem

Journal article (2026) - Hongxiang Zhang, Andrea D’Ariano, Yongqiu Zhu, Yaoxin Wu, Liuyang Hu, Gongyuan Lu
The train platforming schedule is the crucial plan for guiding trains to travel through a railway station without spatial and temporal conflicts. When trains are delayed in arriving at the station due to disturbances or disruptions, it raises the Train Platforming and Rescheduling Problem (TPRP), one of the hot topics in railway traffic management. It focuses on allocating platforms and time slots for trains to reduce delays and ensure operational efficiency in a station. This paper introduces a novel graph neural network based deep reinforcement learning method to address this problem, named Learning to Reschedule Platforms (L2RP). We formulate the solving process of TPRP as a customized Markov decision process. Meanwhile, we integrate a microscopic discrete-event train operation simulation model to serve as the agent exploration environment, which provides states, executes actions, and completes transitions. Then, we design a hybrid graph neural network based policy network to derive high-quality actions under each graph encoded state. The policy network is trained with the reward function designed to minimize total train knock-on delays and platform changes. The experiments on real-world instances show that the proposed L2RP method can produce high-quality solutions for instances of various scenarios within consistently short solving times. ...
Journal article (2026) - Pu Zhang, Lingyun Meng, Yongqiu Zhu, Jianrui Miao, Xiaojie Luan, Zhengwen Liao
This paper proposes a value-based deep reinforcement learning approach that is capable of handling train timetable rescheduling under both disturbed and disrupted situations. A railway environment is constructed to simulate the problem as a Markov decision process, where the optimization objective is integrated into the reward module and various constraints are incorporated into the conflict detection and avoidance module. To address the challenges of sparse rewards and large action space with limited legal actions, a value-based algorithm framework is proposed to efficiently select and effectively evaluate actions. Through the designed simulation and training procedures, the proposed approach is tested on several disturbance and disruption cases based on a real-world instance (i.e. a Chinese high-speed railway corridor). Experimental results show that the proposed method can obtain high-quality solutions within a reasonable computing time, and also outperforms handcrafted rules in terms of the optimality of solutions. Furthermore, the proposed method exhibits promising generalization capabilities in homogeneous perturbation scenarios (disturbance scenarios and disruption scenarios that share either the same affected location and start time or the same affected location and disrupted duration). ...
Recent research in Energy-Efficient Train Control (EETC) and Energy-Efficient Train Timetabling (EETT) has uncovered various strategies that can be utilized to reduce railway energy consumption without placing additional demands on the capacity or compromising the robustness of operations. Several railway undertakings have already integrated aspects of these methodologies both in daily operations by the implementation of Driver Advisory Systems (DAS) and in the timetable design process. The major passenger railway operator in the Netherlands, Nederlandse Spoorwegen (NS), utilizes a tablet-based DAS that provides coasting advice to train drivers, while also displaying the route, timetable, temporary speed restrictions and blocks occupied by preceding traffic. Despite the implementation of this system, historical trajectory data from real world operations in the Netherlands indicate variances in the extent of energy-efficient train driving application. These variations could lessen the energy-savings of EETC and increase operational costs. Hence, the main aim of this poster is to evaluate the application of the EETC strategy in real world operations under varying environmental and operational conditions based on historical timetable and train trajectory data, while identifying the causes leading to the observed differences. Subsequently, a literature review of train trajectory optimization techniques is conducted to examine the extent to which these causes are addressed. Finally, the real world applicability of these methods is discussed and future research directions are provided. ...

A Deep Reinforcement Learning Method for the Train Platforming and Rescheduling Problem

Abstract (2025) - Hongxiang Zhang, Yongqiu Zhu, Liuyang Hu, Andrea D’Ariano, Yaoxin Wu, Gongyuan Lu
This paper proposes the Learning to Platforming (L2P) method, a novel graph neural network based deep reinforcement learning method, to solve the Train Platforming and Rescheduling Problem (TPRP). We customize a Markov decision process (MDP) to formulate the solving process of TPRP, utilizing a graph structure to represent states of trains, routes, and berthing tracks from a microscopic perspective. Then, we design a hybrid graph neural network named hAI-GNN to learn informative node embeddings on the graph encoded state. These embeddings are utilized to derive an effective action from the lightweight action space of MDP, which is associated with the decision object train under the state. A discrete-event simulation model is employed to serve as the environment of MDP and implement state transition mechanism. The hAI-GNN based policy network is trained by the Proximal Policy Optimization (PPO) algorithm with the reward function designed to minimize total knock-on delay trains and platform changes. The experiments on real-world instances show that the proposed L2P method can obtain high-quality solutions for both small and large scale instances within very short solving times. ...
Railway systems suffer from disturbances in operations, such as extended section running times caused by temporary speed restrictions and prolonged dwell times at stations due to unexpected passenger volumes. These disturbances cause deviations from the original timetable and negatively impact service reliability and passenger experience. Effective and timely rescheduling measures are crucial in reducing the impact of these disturbances. Existing timetable rescheduling models that rely on optimization-based methods often struggle with computational inefficiency, especially when dealing with scenarios involving a large number of train services. To address these challenges, we propose a learning-based timetable rescheduling framework that considers scalability in its formulation to reduce the growing computational burden associated with an increasing number of train services. The proposed framework decomposes the complex rescheduling problem into multiple subtasks, facilitating a systematic approach to managing extensive railway networks with numerous stations and train services. A high-level agent, functioning as a centralized traffic controller, is responsible for decomposing the overall deviation reduction task into subtasks at a low level and assigning them to individual train services with the primary objective of minimizing the time required to restore the original timetable. Low-level agents, acting as distributed train dispatchers, are tasked with rescheduling the timetables of their assigned trains. These low-level agents employ various dispatching strategies, facilitated by inter-train communication, to search for optimal rescheduling solutions while adhering to operational constraints such as minimum headway requirements. The low-level agents utilize an actor-critic architecture to generate continuous control decisions for dwell and running times, enabling them to learn and optimize their performance. Knowledge-sharing mechanisms amongst the low-level agents enable faster and more robust learning. Furthermore, advanced exploration methods are integrated to enhance the efficiency of the agents' training process. ...

A heterogeneous multi-agent reinforcement learning framework

Journal article (2025) - Enze Liu, Shuguang Zhan, Yongqiu Zhu, Zhiyuan Lin, Dian Wang
With growing demand straining urban transit systems’ resilience in managing outburst passenger flows, existing approaches focused on offline and single-modal evacuations remain limited. This study proposes an online multi-modal evacuation framework that coordinates on-duty taxis, buses, and metros while minimizing impact on their regular services. We develop a data-driven agent-based environment to update multi-modal transit data and stranded passenger information in real time. Two coordination strategies are introduced: (1) an independent strategy using a decentralized training and distributed execution algorithm, and (2) a collaborative strategy using a hybrid centralized training and distributed execution algorithm. To dynamically assess evacuation effectiveness, we design a resilience framework with three metrics: robustness, rapidity, and resourcefulness. These metrics are transformed into demand-responsive feedback at each time step, enabling agents to proactively generate resilient evacuation plans. In a real-world case study triggered by a railway disruption, our approach outperforms genetic algorithms and multi-agent deep deterministic policy gradient algorithms in computation time and solution quality under offline conditions. Simulated new environments further validate its online applicability, demonstrating its potential for real-world deployment. ...
Metro networks face operational challenges due to increasing ridership and system growth, particularly in managing delay propagation. Epidemiology models have recently been an interesting method in transportation research for studying delays. This study, therefore, aims to investigate if the Susceptible-infectious-susceptible (SIS) model is suitable to help model delay propagation in a metro network through its ability to reproduce the vulnerability of metro stations for specific instances. Using data from the Washington Metro Network, two groups of delay propagation instances were selected and used for model training and testing using a differential evolution algorithm. The results indicate that the vulnerability values as calculated from the real-life data do not follow the expected trend. Still, our model can capture this variation with good vulnerability estimation accuracy for both groups. Also, the predicted vulnerability values for the first group are more accurate than for the second group. However, limitations such as underestimation and overestimation of station vulnerabilities, and sensitivity to training data were observed. These challenges stemmed from the dynamics between specific parameters and the lack of additional factors. ...
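As a rough illustration of the SIS dynamics mentioned in the abstract above, the sketch below integrates the textbook mean-field SIS equation di/dt = beta * i * (1 - i) - gamma * i with a simple Euler scheme, reading i as the fraction of "delayed" stations. This is not the model from the study: the function name `simulate_sis`, the parameter values, and the interpretation of beta (delay spreading rate) and gamma (recovery rate) are assumptions made for illustration only.

```python
def simulate_sis(beta, gamma, i0, dt=0.01, steps=10000):
    """Euler integration of the mean-field SIS equation.

    beta  : assumed rate at which a delayed station delays others
    gamma : assumed rate at which a delayed station recovers
    i0    : initial fraction of delayed stations (between 0 and 1)
    Returns the list of delayed fractions, one value per time step.
    """
    i = i0
    trajectory = [i]
    for _ in range(steps):
        # New delays come from contact between delayed and on-time stations;
        # recoveries are proportional to the delayed fraction.
        i += dt * (beta * i * (1.0 - i) - gamma * i)
        trajectory.append(i)
    return trajectory


if __name__ == "__main__":
    spreading = simulate_sis(beta=0.5, gamma=0.2, i0=0.01)
    fading = simulate_sis(beta=0.1, gamma=0.2, i0=0.5)
    print(f"beta > gamma: final delayed fraction {spreading[-1]:.3f}")
    print(f"beta < gamma: final delayed fraction {fading[-1]:.3f}")
```

In this mean-field form, delays persist at the level 1 - gamma/beta when beta exceeds gamma and die out otherwise; a network version would track each station separately.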

A review of the literature and future research directions

Journal article (2024) - Shuguang Zhan, Jiemin Xie, S. C. Wong, Yongqiu Zhu, Francesco Corman
External and internal factors can cause disturbances or disruptions in daily train operations, leading to deviations from official timetables and passenger delays. As a result, efficient train timetable rescheduling (TTR) methods are necessary to restore disrupted train services. Although TTR has been a popular research topic in recent years, the uncertain characteristics of railways have not been sufficiently addressed. This review first identifies the primary uncertainties of TTR and examines their impacts on both TTR and passenger routing during disturbances or disruptions. It finds that only a few uncertainties have been investigated, and the existing solution methods do not adequately meet practical requirements, such as considering the dynamic nature of disturbances or disruptions, which is crucial for real-world applications. Therefore, the review highlights problems associated with TTR uncertainties that need urgent attention and suggests promising methodologies that could effectively address these issues as future research directions. This review aims to help practitioners develop improved automatic train-dispatching systems with better train-rescheduling performance under disturbances or disruptions compared to current systems. ...
Journal article (2023) - Ping Huang, Zhongcan Li, Yongqiu Zhu, Chao Wen, Francesco Corman
Railway operations are subject to deviations from the planned schedule, i.e., delays. In those situations, high-quality traffic control actions are needed to reduce the delays. Existing studies mainly used prescriptive techniques (e.g., mathematical programming, heuristics) to identify the best control action. These methods have limitations in the firm reliance on deterministic parameters prescriptively or normatively determined beforehand, and limited understandability for practitioners. These drawbacks hinder their acceptance in practice. This study exploits instead past realization data to provide decision support for traffic control. The realized data describe the traffic control actions taken by human controllers, and their effects; the latter are more complex than a linear sum of predetermined parameters. We use decision graphs to identify which traffic control action leads to the best solution, in terms of reduction of delays, based on the past performance of the same action in similar conditions. We are also able to explain the reasons and the factors that lead to each suggested action. We focus on the relevant case of merging stations, where multiple lines merge as one line, deciding the relative order between two consecutive trains. The method determines the stochastic effects of the two possible decisions at merge points, which allows for choosing the best one. Compared within the framework of realized data, the action suggested is the best out of a series of benchmarks, including simple rules and optimization, improving on the common benchmarks (i.e., reducing delays) by approximately 11.7%. The variables with the highest impact on the utility are the length of the planned dwell time and the planned presence of an overtaking. The variables influencing the utility most are the actual delays of trains, the train type, and the order actually implemented. ...
Journal article (2023) - Jia Ning, Qiyuan Peng, Yongqiu Zhu, Xinjie Xing, Otto Anker Nielsen
When urban rail transit (URT) does not provide 24-hour services, passengers who travel at late night may not be able to reach their destinations with only URT trains. As a result, passengers have to find alternative transport means, or combine URT trains with other transport services to fulfill their journeys. This paper investigates the integrated optimization of last train timetabling and bridging service design with consideration of passenger path choices. Two bridging services are considered: taxis and buses. Based on pre-constructed path sets, a bi-objective mixed-integer nonlinear programming (MINLP) model is developed, aiming at minimizing total passenger travel time and total passenger travel cost. To reduce the model scale and improve solution efficiency, three path dominance principles are proposed to remove redundant passenger paths without loss of optimality. An adaptive iterative algorithm is designed to obtain the Pareto frontier curve. The proposed model and solution methods are demonstrated on the Chengdu URT network. Results indicate that passenger travel costs and travel times can be significantly reduced by the integrated optimization. It also provides passengers with a safer night travel environment due to the reduction in passenger travel times in taxis. ...
Journal article (2023) - Pengling Wang, Yongqiu Zhu, Wei Zhu
Virtual coupling technology was recently proposed in railways, which separates trains by a relative braking distance (or even shorter distance) and moves trains synchronously to increase capacity at bottlenecks. This study proposes a real-time cooperative train trajectory planning algorithm for coordinating train movements under virtual coupling by considering stochastic initial delays. The algorithm uses mixed-integer programming models to estimate the delay propagation among trains, detect feasible coupled-running locations, and optimize the trajectories of the two trains such that they coordinate their speeds to achieve energy-efficient, punctual movements, as well as a safe coupled-running process. A robust optimization method is proposed to capture the stochastic delays as an uncertainty set, which is reformulated to its dual problem. Case studies of planning train trajectories for the classical virtual-coupling scenario suggest that (1) the coupled-running distance is greatly affected by the coordination of train timetables, delays, and safe separation constraints at switches; (2) the coordination of train movements for a coupled-running process imposes extra energy costs; and (3) the proposed method can detect feasible coupled-running locations and produce cooperative speed profiles in short computational times. ...
Preprint (2023) - Yongqiu Zhu, Pengling Wang, Francesco Corman
Railway systems are affected by unexpected disturbances on a daily basis, causing delays to trains and passengers. Real-time traffic management is necessary and is currently handled by human traffic controllers who mainly focus on minimizing train delays. In the past two decades, extensive research has been done to improve railway traffic management by shifting the focus to passengers (generally called delay management) using optimization or heuristics. The existing optimization-based models are difficult to solve efficiently. The existing heuristics perform well in terms of computational efficiency but are based on simplified passenger behaviours. In this paper, we propose an efficient method that includes complex passenger behaviours: a Reinforcement Learning (RL) framework for delay management (at a network level for mainline railways) that aims to minimize passenger destination delays. In our method, passengers re-plan their travels when they actually miss their transfers (i.e. reactive behaviour) or when they are aware of better path choices than their current planned ones (i.e. proactive behaviour), whichever happens first. The re-plan behaviour is allowed to happen multiple times during a single journey of a passenger as long as it is beneficial. We tested the proposed RL approach on a real-world railway network and compared it to three benchmarks: the first-in-first-out (FIFO) dispatching rule that is widely adopted in practice, and the train-centric and passenger-centric optimization models (TcOM and PcOM) on timetable rescheduling. Results show the ability of our RL approach to obtain better rescheduling solutions (in terms of total passenger delays) than the FIFO and TcOM, and better computational efficiency (43 seconds on average after training) than the PcOM (4572 seconds on average). With sufficient training, 74% of the trained RL agents can solve all other new delay scenarios designed in our case study, implying good generalization performance of the proposed method. ...
Journal article (2022) - Jia Ning, Qiyuan Peng, Yongqiu Zhu, Yu Jiang, Otto Anker Nielsen
In cities where the urban rail transit (URT) systems do not provide 24-h services, passengers may not be able to reach their destinations if the last train services have closed by the time they arrive at the transfer stations. This paper aims to seek a well-coordinated last train timetable that can transport as many passengers as possible to their destinations (referred to as reachable passengers) and also transport those passengers who cannot reach their destinations (referred to as unreachable passengers) to the stations as close as possible to their destinations. A bi-objective mixed-integer linear programming (MILP) model is developed to maximize the number of reachable passengers and minimize the total remaining travel distance of all passengers. The augmented ε-constraint method is applied to generate all Pareto optimal solutions of the bi-objective MILP model. Numerical experiments were implemented in the Chengdu URT network. Results indicate that compared to the current-in-use timetable, the optimized timetable by our methods significantly increased the number of reachable passengers and meanwhile reduced the average remaining travel distance of unreachable passengers. In addition, we discussed two possible strategies to improve passengers’ destination reachability, which are encouraging passengers to arrive early at their origin stations, and optimizing the timetable of last trains and non-last trains at the same time. ...
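The ε-constraint idea named in the abstract above can be sketched on a toy problem. The code below is an assumption-laden illustration, not the paper's MILP: it enumerates a hypothetical list of candidate timetables, each scored as (reachable passengers, total remaining distance), and for each bound ε on the second objective picks the candidate that maximizes the first. Ties are broken towards the smaller distance, which plays the role that the augmentation term plays in the augmented ε-constraint method (avoiding weakly efficient solutions). The function name and the data are invented for illustration.

```python
def epsilon_constraint(candidates):
    """Enumerate Pareto-optimal points of a small bi-objective problem.

    candidates: list of (reachable, remaining_distance) tuples, where
    'reachable' is maximized and 'remaining_distance' is minimized.
    """
    pareto = []
    # Use every attained value of the second objective as a bound epsilon.
    for eps in sorted({distance for _, distance in candidates}):
        feasible = [c for c in candidates if c[1] <= eps]
        # Maximize reachable passengers; among ties prefer smaller distance.
        best = max(feasible, key=lambda c: (c[0], -c[1]))
        if best not in pareto:
            pareto.append(best)
    return pareto


if __name__ == "__main__":
    # Hypothetical timetable scores: (reachable passengers, remaining distance in km)
    timetables = [(10, 50), (12, 60), (12, 55), (8, 40), (9, 70)]
    print(epsilon_constraint(timetables))
```

In a real application each ε step would require solving an optimization model rather than scanning a list, so the number of ε values tried drives the total solving time.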
Journal article (2022) - Pengling Wang, Yongqiu Zhu, Francesco Corman
Optimizing the railway timetable to increase synchronous accelerating and braking processes can lead to an improvement in the usage of regenerative energy. However, such a synchronized timetable might result in few or unsuitable transfer connections for the passengers. This paper focuses on the optimization of railway periodic timetables, to increase usage of regenerative energy while ensuring passenger satisfaction. We work by extending the traditional Periodic Event Scheduling Problem (PESP) formulation to address the problem of synchronization of acceleration and braking phases (and re-used energy) and to include passenger-related events (and their satisfaction). Three objectives are identified in the resulting Mixed Integer Linear Programming (MILP) model: maximizing the overlapping times of accelerating and braking trains to achieve increased usage of regenerative energy, minimizing the total passengers’ generalized travel times (global passenger dissatisfaction), and minimizing the maximum increase in individual's generalized travel time (local passenger dissatisfaction). A multi-step approach solves the trade-offs among the three conflicting objectives. Results on a realistic case study show that the proposed approach can find optimized timetables which, compared to the currently-in-use timetable, can increase the usage of regenerative energy by over 1.5 times and reduce the average generalized travel time per passenger by 2 min, with only a minor increase in specific individual generalized travel time (up to 4 min). A detailed analysis of the results implies that to achieve a higher usage of regenerative energy, it is required to have a higher tolerance for the maximum increase in individual generalized travel time, while this is not necessary for the overall passenger generalized travel time, which can even be reduced when the maximum increase in individual generalized travel time becomes larger. ...
Journal article (2022) - João Paiva Fonseca, Tobias Zündorf, Evelien van der Hurk, Yongqiu Zhu, Allan Larsen
Designing a public transport timetable that maximizes passenger service, measured in weighted travel time, is an intricate problem. The weighted travel time depends on the free route choice of passengers. Passenger route choice depends on the timetable. In turn, the timetable that minimizes weighted travel time depends on the route choice of passengers—and therefore requires passenger route choice information. Consequently, a sequential approach where timetables are designed provided pre-fixed passenger assignment to routes may not find the optimal timetable. This paper aims to integrate passenger route choice and timetabling. It addresses the problem of designing maximal passenger service public transport timetables in systems with free route choice within a budget for operating costs. Operating costs are defined by the minimal cost vehicle schedule required to operate the timetable. The proposed methodology integrates a matheuristic for timetabling and vehicle scheduling with a passenger assignment model in an iterative framework, where different forms of integration are evaluated. Focus is on long- to medium-term timetabling, provided an initial timetable. The results for a realistic case study in the Greater Copenhagen area indicate that our approach consistently leads, at no additional cost, to timetables that represent a reduction in passenger weighted travel time in comparison with both an initial timetable and a non-integrated timetabling method that receives a single-passenger assignment as input. ...
Journal article (2021) - Yongqiu Zhu, Rob M.P. Goverde
Unexpected disruptions occur in the railways on a daily basis, which are typically handled manually by experienced traffic controllers with the support of predefined contingency plans. When several disruptions occur simultaneously, it is rather hard for traffic controllers to make rescheduling decisions, because (1) the predefined contingency plans corresponding to these disruptions may conflict with each other and (2) no predefined contingency plan considering the combined effects of multiple disruptions is available. This paper proposes a Mixed Integer Linear Programming (MILP) model to reschedule the timetable in case of multiple disruptions that occur at different geographic locations but have overlapping periods and are pairwise connected by at least one train line. The dispatching measures of retiming, reordering, cancelling, adding stops and flexible short-turning are formulated in the MILP model that also considers the rolling stock circulations at terminal stations and platform capacity. We develop two approaches for rescheduling the timetable in a dynamic environment: the sequential approach and the combined approach. In the sequential approach, a single-disruption rescheduling model is applied to handle each new disruption with the last solution as reference. In the combined approach, the multiple-disruption rescheduling model is applied every time an extra disruption occurs by considering all ongoing disruptions. A rolling-horizon solution method to the multiple-disruption model has been developed to handle long multiple connected disruptions in a more efficient way. The sequential and combined approaches have been tested on real-life instances on a subnetwork of the Dutch railways with 38 stations and 10 train lines operating half-hourly in each direction. In a few cases, the sequential approach did not find feasible solutions, while the combined approach obtained the solutions for all considered cases. Besides, the combined approach was able to find solutions with fewer cancelled train services and/or train delays than the sequential approach. For long disruptions, the proposed rolling-horizon method was able to generate high-quality rescheduling solutions in an acceptable time. ...
Journal article (2020) - Yongqiu Zhu, Rob Goverde
Unexpected disruptions occur frequently in the railways, during which many train services cannot run as scheduled. This paper deals with timetable rescheduling during such disruptions, particularly in the case where all tracks between two stations are blocked for hours. In practice, a disruption may become shorter or longer than predicted. To take the uncertainty of the disruption duration into account, this paper formulates the timetable rescheduling as a rolling horizon two-stage stochastic programming problem in deterministic equivalent form. The random disruption duration is assumed to have a finite number of possible realizations, called scenarios, with given probabilities. Every time a prediction about the range of the disruption end time is updated, new scenarios are defined, and a two-stage stochastic model computes the optimal rescheduling solution to all these scenarios. The stochastic method was tested on a part of the Dutch railways, and compared to a deterministic rolling-horizon method. The results showed that compared to the deterministic method, the stochastic method is more likely to generate better rescheduling solutions for uncertain disruptions, with fewer train cancellations and/or delays, while the solution robustness can be affected by the predicted range regarding the disruption end time. ...
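The two-stage structure described in the abstract above (one decision taken now, evaluated against a finite set of duration scenarios with given probabilities) can be illustrated with a deliberately tiny enumeration. Everything specific here is hypothetical: the decision is a single "planned disruption length" in minutes, the first-stage cost stands in for cancellations, and the recourse cost stands in for extra delay when the disruption outlasts the plan. The paper's model is a MILP over full timetables; this sketch only shows how the expected cost over scenarios enters the choice.

```python
def expected_cost(plan, scenarios, first_stage_cost, recourse_cost):
    """First-stage cost plus probability-weighted recourse cost."""
    return first_stage_cost(plan) + sum(
        prob * recourse_cost(plan, duration) for duration, prob in scenarios
    )


def solve_two_stage(plans, scenarios, first_stage_cost, recourse_cost):
    """Pick the first-stage plan with the lowest expected total cost."""
    return min(
        plans,
        key=lambda plan: expected_cost(plan, scenarios, first_stage_cost, recourse_cost),
    )


if __name__ == "__main__":
    # Hypothetical scenarios: (disruption duration in minutes, probability)
    scenarios = [(60, 0.3), (90, 0.5), (120, 0.2)]
    plans = [60, 90, 120]  # hypothetical planned disruption lengths

    def cancel_cost(plan):
        # Assumed: longer plans cancel more services.
        return 1.0 * plan

    def delay_cost(plan, duration):
        # Assumed: delay is penalized when the disruption outlasts the plan.
        return 3.0 * max(0, duration - plan)

    best = solve_two_stage(plans, scenarios, cancel_cost, delay_cost)
    print(best, expected_cost(best, scenarios, cancel_cost, delay_cost))
```

In a rolling-horizon setting, the scenario list would be rebuilt and the choice recomputed each time the predicted range of the disruption end time is updated.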
Journal article (2020) - Y. Zhu, R.M.P. Goverde
During railway disruptions, most passengers may not be able to find preferred alternative train services due to the current way of handling disruptions that does not take passenger responses into account. To offer better alternatives to passengers, this paper proposes a novel passenger-oriented timetable rescheduling model, which integrates timetable rescheduling and passenger reassignment into a Mixed Integer Linear Programming model with the objective of minimizing generalized travel times: in-vehicle times, waiting times at origin/transfer stations and the number of transfers. The model applies the dispatching measures of re-timing, re-ordering, cancelling, flexible stopping and flexible short-turning trains, handles rolling stock circulations at both short-turning and terminal stations of trains, and takes station capacity into account. To solve the model efficiently, an Adapted Fix-and-Optimize (AFaO) algorithm is developed. Numerical experiments were carried out on a part of the Dutch railways. The results show that the proposed passenger-oriented timetable rescheduling model is able to shorten generalized travel times significantly compared to an operator-oriented timetable rescheduling model that does not consider passenger responses. By allowing only 10 min more train delay than an optimal operator-oriented rescheduling solution, the passenger-oriented model is able to shorten the generalized travel times over all passengers by thousands of minutes in all considered disruption scenarios. With a passenger-oriented rescheduled timetable, more passengers continue their train travels after a disruption has started, compared to a rescheduled timetable from the operator-oriented model. The AFaO algorithm obtains high-quality solutions to the passenger-oriented model in up to 300 s. ...
Conference paper (2020) - Yongqiu Zhu, Hongrui Wang, Rob Goverde
Real-time railway traffic management is important for the daily operations of railway systems. It predicts and resolves operational conflicts caused by events like excessive passenger boardings/alightings. Traditional optimization methods for this problem are restricted by the size of the problem instances. Therefore, this paper proposes a reinforcement learning-based timetable rescheduling method. Our method learns how to reschedule a timetable off-line and then can be applied online to make an optimal dispatching decision immediately by sensing the current state of the railway environment. Experiments show that the rescheduling solution obtained by the proposed reinforcement learning method is affected by the state representation of the railway environment. The proposed method was tested on a part of the Dutch railways considering scenarios with single initial train delays and multiple initial train delays. In both cases, our method found high-quality rescheduling solutions within limited training episodes. ...
Doctoral thesis (2019) - Yongqiu Zhu, Rob Goverde
Railway systems are vulnerable to unexpected disruptions, which usually result in track blockages for a few hours. In practice, disruptions are handled manually and the resulting impact on passengers is rarely considered. To make disruption management more efficient, operator-friendly and passenger-friendly, this thesis develops mathematical models and solution methods for dynamic passenger assignment, timetable rescheduling, and the integrated passenger assignment with timetable rescheduling during disruptions. ...