Deep reinforcement learning for bag batch planning in a baggage handling system
J.L. Cramer (TU Delft - Mechanical Engineering)
F. Schulte – Mentor (TU Delft - Transport Engineering and Logistics)
Mark Duinkerken – Graduation committee member (TU Delft - Transport Engineering and Logistics)
P.E. Hoefkens – Graduation committee member (Vanderlande Industries B.V.)
L.M. Bosman – Graduation committee member (Vanderlande Industries B.V.)
M. Nelemans – Graduation committee member (Vanderlande Industries B.V.)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
The current operational processes in an airport baggage handling system (BHS) are not suitable for the implementation of robots for loading baggage. This study aims to contribute to the implementation of a new operational strategy, named the batch-based pull approach, in a BHS to create a more automated and efficient operation. In this work, a deep reinforcement learning (DRL) model is developed that generates a baggage loading plan in real time for a BHS in a dynamic operating environment, in order to enhance robotic loading. The loading operation was formulated as a Markov decision process, and the proximal policy optimization (PPO) algorithm was used to train the DRL agent. The DRL agent was compared with Vanderlande’s heuristic and tested on a case study of Brussels Airport. It automatically learned how to make baggage load planning decisions in simulations of a real-world BHS, generally loading more bags with a robot but using more load units (LUs), which highlights a trade-off between robot use efficiency and LU usage. This study demonstrates the potential of deep reinforcement learning for real-time load planning in dynamic baggage handling systems with loading robots. However, more work is needed for consistent performance and real-world implementation. The results obtained are strongly related to the current model formulation, necessitating additional research to gain more insight into the operating performance. This research serves as a proof of concept for future applications.
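The abstract names PPO as the training algorithm but does not state its loss. As a rough illustration only, the sketch below computes the standard clipped surrogate objective commonly associated with PPO. It assumes the textbook form with a clipping parameter of 0.2; the thesis may use a different variant or different hyperparameters, and the function name and inputs here are hypothetical, not taken from the work.

```python
from typing import Sequence


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    eps: float = 0.2,  # assumed clipping parameter, not taken from the thesis
) -> float:
    """Mean clipped surrogate objective over a batch of samples.

    ratios[i] is the probability ratio pi_new(a|s) / pi_old(a|s) and
    advantages[i] is the advantage estimate for the same sample.
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

During training this quantity would be maximized with respect to the policy parameters. The clipping removes the incentive to move the new policy far from the one that collected the data, which is one reason PPO is often chosen for simulation-based training such as the BHS environment described above.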
Files
File under embargo until 16-01-2027