Maritime Dock Detection using Camera

Master Thesis (2025)
Author(s)

D.J. Scherrenburg (TU Delft - Mechanical Engineering)

Contributor(s)

Holger Caesar – Mentor (TU Delft - Intelligent Vehicles)

D. Kotiadis – Mentor (Damen Shipyards)

J. Micah Prendergast – Graduation committee member (TU Delft - Human-Robot Interaction)

F. Verburg – Graduation committee member (Damen Shipyards)

Faculty
Mechanical Engineering
Publication Year
2025
Language
English
Graduation Date
05-06-2025
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering | Vehicle Engineering | Cognitive Robotics
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Autonomous vessels offer potential benefits in safety, operational efficiency, and environmental impact, but require reliable perception systems to function independently. This thesis investigates the reliability of camera-based detection of maritime docks, a class of static but visually diverse objects that are often difficult to distinguish from their surroundings. While extensive research exists for detecting ships and buoys, docks remain underexplored, with no publicly available datasets containing a significant number of labeled instances.
To address this gap, the Dordrecht Dock Dataset was developed, containing 30,761 frames recorded under real-world maritime conditions across eight distinct docks. A custom interpolation-based annotation tool enabled efficient labeling, achieving annotation speeds of up to 3,000 frames per hour. Two deep learning models—Faster R-CNN and YOLO11n—were trained and evaluated on this dataset using a consistent pipeline. Evaluation followed a leave-one-dock-out cross-validation strategy with fixed test sets and standard performance metrics, including mAP50, mAP50–95, F1 score, and inference speed.
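The leave-one-dock-out strategy described above can be illustrated with a minimal sketch: each of the eight docks is held out once as the test set while the remaining seven form the training set. The helper name and dock identifiers below are assumptions for illustration, not the thesis' actual pipeline code.

```python
def leave_one_dock_out_splits(dock_ids):
    """Yield (train_docks, held_out_dock) pairs: each dock is the test set once.

    Hypothetical helper sketching the cross-validation protocol; the real
    pipeline's fold handling and naming are not specified in the abstract.
    """
    for held_out in dock_ids:
        train = [d for d in dock_ids if d != held_out]
        yield train, held_out


# With eight distinct docks, this produces eight train/test folds.
docks = [f"dock_{i}" for i in range(1, 9)]
folds = list(leave_one_dock_out_splits(docks))
```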
Faster R-CNN outperformed YOLO11n on nearly all accuracy metrics (mAP50 of 0.85 vs. 0.69), particularly at shorter ranges. YOLO11n delivered more than twice the inference speed (37 FPS vs. 17 FPS) but generalized less well, with reduced reliability on visually dissimilar docks. Both models showed significant performance drops beyond 100–110 meters, underscoring the need for higher-resolution input or a focus on short-range detection.
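The mAP50 figures compared above count a detection as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of that matching criterion, assuming axis-aligned `(x1, y1, x2, y2)` boxes:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# Under the mAP50 criterion, a detection matches a ground-truth box
# when iou(detection, ground_truth) >= 0.5.
```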
