Maritime Dock Detection using Camera

Master Thesis (2025)
Author(s)

D.J. Scherrenburg (TU Delft - Mechanical Engineering)

Contributor(s)

Holger Caesar – Mentor (TU Delft - Intelligent Vehicles)

D. Kotiadis – Mentor (Damen Shipyards)

J. Micah Prendergast – Graduation committee member (TU Delft - Human-Robot Interaction)

F. Verburg – Graduation committee member (Damen Shipyards)

Faculty
Mechanical Engineering
Publication Year
2025
Language
English
Graduation Date
05-06-2025
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering | Vehicle Engineering | Cognitive Robotics
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Autonomous vessels offer potential benefits in safety, operational efficiency, and environmental impact, but require reliable perception systems to function independently. This thesis investigates the reliability of camera-based detection of maritime docks, a class of static but visually diverse objects that are often difficult to distinguish from their surroundings. While extensive research exists for detecting ships and buoys, docks remain underexplored, with no publicly available datasets containing a significant number of labeled instances.
To address this gap, the Dordrecht Dock Dataset was developed, containing 30,761 frames recorded under real-world maritime conditions across eight distinct docks. A custom interpolation-based annotation tool enabled efficient labeling, achieving annotation speeds of up to 3,000 frames per hour. Two deep learning models—Faster R-CNN and YOLO11n—were trained and evaluated on this dataset using a consistent pipeline. Evaluation followed a leave-one-dock-out cross-validation strategy with fixed test sets and standard performance metrics, including mAP50, mAP50–95, F1 score, and inference speed.
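The leave-one-dock-out strategy described above can be illustrated with a minimal sketch: each of the eight docks is held out once as the test set while the remaining seven form the training set. The helper name and dock identifiers below are assumptions for illustration, not the thesis' actual pipeline code.

```python
def leave_one_dock_out_splits(dock_ids):
    """Yield (train_docks, held_out_dock) pairs: each dock is the test set once.

    Hypothetical helper sketching the cross-validation protocol; the real
    pipeline's fold handling and naming are not specified in the abstract.
    """
    for held_out in dock_ids:
        train = [d for d in dock_ids if d != held_out]
        yield train, held_out


# With eight distinct docks, this produces eight train/test folds.
docks = [f"dock_{i}" for i in range(1, 9)]
folds = list(leave_one_dock_out_splits(docks))
```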
Faster R-CNN outperformed YOLO11n on nearly all accuracy metrics (mAP50 of 0.85 vs. 0.69), particularly at shorter ranges. YOLO11n delivered more than twice the inference speed (37 FPS vs. 17 FPS) but generalized less well, with reduced reliability on visually dissimilar docks. Both models showed significant performance drops beyond 100–110 meters, underscoring the need for higher-resolution input or a focus on short-range detection.
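The mAP50 figures compared above count a detection as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of that matching criterion, assuming axis-aligned `(x1, y1, x2, y2)` boxes:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# Under the mAP50 criterion, a detection matches a ground-truth box
# when iou(detection, ground_truth) >= 0.5.
```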
