Reinforcement Learning for the Discrete Dynamic Berth Allocation Problem

Evaluating a trained Discrete Dynamic Berth Allocation model on Berth breakdowns

Bachelor Thesis (2026)
Author(s)

T. Kuklys (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

C. March Moya – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

N. Yorke-Smith – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
More Info
expand_more
Publication Year
2026
Language
English
Graduation Date
21-06-2026
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Faculty
Electrical Engineering, Mathematics and Computer Science
Downloads counter
6
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The problem of scheduling vessels in a port as they arrive one-by-one is known as the Dynamic Berth Allocation Problem and it is NP-hard. This paper analyses the influence of berth breakdowns on the scheduling and optimality of a trained Reinforcement Learning model by March Moya et al. for such a problem. Several breakdown parameters, including frequency, severity, dura- tion, and the probability of binary versus partial breakdowns, were examined independently and in combination with one another with respect to different scheduling heuristics. The breakdowns were dynamically injected into the model’s event loop so that it did not have knowledge of upcoming breakdowns. Each experimental configuration was evaluated using ten random seeds and the sample mean and standard deviation were computed. The results showed low variance between different seeds and configurations. Breakdown frequency was the main factor limiting the model’s perfor- mance, moving the performance from a 19.6% advantage to 6.4% in the most extreme cases when compared to the baseline heuristic of WTSP. The other parameters did not produce significant model degradation.

Files

License info not available