Solving the Online 3D Bin Packing Problem with Graph-Based Reinforcement Learning
G. Corvi (TU Delft - Mechanical Engineering)
C. Lieu – Mentor (TU Delft - Learning & Autonomous Control)
Ronald Poelman – Graduation committee member (Fizyr B.V.)
Martijn Wisse – Graduation committee member (TU Delft - Robot Dynamics)
Abstract
The rapidly growing volume of parcel shipments is straining the transportation and logistics sectors, highlighting the need for innovative solutions that optimize packing and loading processes. The online bin packing problem (BPP), an NP-hard computational problem, has practical applications in numerous sectors, including modern packaging and intelligent logistics. This study proposes a novel reinforcement learning (RL) approach to the online 3D-BPP, with an emphasis on applicability and versatility. The key innovation is the representation of the packing scene as a graph, which enables effective encoding of task-specific high-level features. This graph-based structure serves as the foundation for an RL agent designed to learn an optimal packing strategy through dynamic interaction with the environment. Unlike approaches that discretize the packing space, the proposed approach operates in the continuous domain, which enhances generalization across diverse packing tasks. Experimental evaluations in both simulated environments and a real-world setting demonstrate that the solution achieves state-of-the-art performance across multiple complex three-dimensional packing scenarios.
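To make the idea of a graph-based packing scene concrete, the sketch below shows one possible encoding. It is not the thesis's actual formulation: the abstract does not specify node or edge features, so the `Box` class, the contact-based edge rule in `build_scene_graph`, and the `utilization` helper are all hypothetical choices made for illustration. Positions and sizes are plain floats, in keeping with the continuous-domain setting.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned box (hypothetical): min corner (x, y, z), extents (w, d, h)."""
    x: float
    y: float
    z: float
    w: float
    d: float
    h: float

    def volume(self) -> float:
        return self.w * self.d * self.h


def axis_gap(a_min: float, a_len: float, b_min: float, b_len: float) -> float:
    """Distance between two 1D intervals; 0.0 if they touch or overlap."""
    return max(0.0, max(a_min, b_min) - min(a_min + a_len, b_min + b_len))


def build_scene_graph(boxes, tol=1e-6):
    """Return (nodes, edges) for the placed boxes.

    Assumed encoding: one node per box carrying its continuous pose and size,
    and an undirected edge between two boxes that touch or overlap
    (gap <= tol along every axis).
    """
    nodes = [(b.x, b.y, b.z, b.w, b.d, b.h) for b in boxes]
    edges = []
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        gaps = (
            axis_gap(a.x, a.w, b.x, b.w),
            axis_gap(a.y, a.d, b.y, b.d),
            axis_gap(a.z, a.h, b.z, b.h),
        )
        if all(g <= tol for g in gaps):
            edges.append((i, j))
    return nodes, edges


def utilization(boxes, bin_dims):
    """Fraction of the bin volume occupied by the placed boxes."""
    width, depth, height = bin_dims
    return sum(b.volume() for b in boxes) / (width * depth * height)


# Items arrive one at a time (online setting); the graph is rebuilt after each placement.
placed = [
    Box(0.0, 0.0, 0.0, 1.0, 1.0, 1.0),  # first item in the bin corner
    Box(1.0, 0.0, 0.0, 1.0, 1.0, 1.0),  # second item, face-to-face along x
    Box(0.0, 0.0, 1.0, 0.5, 0.5, 0.5),  # third item, stacked on the first
]
nodes, edges = build_scene_graph(placed)
space_use = utilization(placed, (4.0, 4.0, 4.0))
```

In an RL setting, a graph like this would typically be passed to a graph neural network to produce the agent's state embedding. Richer edge features, such as the type of contact or the support relation between stacked boxes, are natural extensions, but they remain assumptions here rather than details taken from the thesis.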