Deep-learning-based computer vision for underwater environments

Doctoral Thesis (2025)
Authors

A. Ilioudi (TU Delft - Human-Robot Interaction)

Research Group
Human-Robot Interaction
Publication Year
2025
Language
English
ISBN (print)
978-94-93431-61-4
DOI
https://doi.org/10.4233/uuid:047e28e3-2b91-4d60-a851-920ca716f287
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Deep Learning (DL) has transformed computer vision, leading to significant progress in areas like autonomous vehicles and industrial automation. However, its application in underwater environments remains challenging due to factors such as light absorption, scattering, and water turbidity, which degrade image quality and hinder DL model performance. This thesis addresses these challenges by enhancing DL-based computer vision techniques for underwater scenarios, with a focus on autonomous robotic litter collection from the seabed. The work targets key limitations such as data scarcity, visual degradation, and scene variability, and proposes domain-informed approaches to enhance model generalization. The contributions cover the full DL pipeline, starting with the design of representative training data that supports object detection in shallow-water conditions, providing a benchmark for training and evaluation of detection algorithms. A multi-robot system is developed that integrates aerial, surface, and underwater vehicles to perform collaborative litter detection and collection. The thesis presents the system design, deployment, and the role of computer vision in the operational workflow. To address image degradation, an automated framework is proposed for selecting image enhancement methods based on task-specific performance metrics. Furthermore, environment-specific neural networks are introduced to handle variability in turbidity and lighting. Generalization to Out-of-Distribution (OOD) data is further addressed through a hybrid classification framework that combines a Convolutional Neural Network (CNN) with a physics-based classifier using the Moving Horizon Estimation (MHE) framework. Their outputs are fused via Dempster-Shafer theory to enable decision-making in unfamiliar scenarios. Finally, domain-informed neural networks are proposed to integrate physics-based knowledge into the DL pipeline via knowledge distillation. This method improves generalization and reduces dependence on large labeled datasets.
The proposed methods are validated through simulation and real-world deployments, demonstrating improved performance and adaptability. Together, these contributions provide an integrated framework for deploying DL-based perception systems in challenging underwater environments.
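
As a rough illustration of the Dempster-Shafer fusion step mentioned in the abstract, the sketch below applies Dempster's rule of combination to two mass functions over a two-class frame. Everything specific here is an assumption made for illustration: the class names ("litter", "other"), the numeric masses, and the function name dempster_combine are hypothetical, and the abstract does not say how the thesis derives masses from the CNN or the MHE-based classifier.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of hypotheses
    to its mass; the masses in each dict are expected to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the shared hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint hypotheses contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict and cannot be combined")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical frame of discernment and made-up masses (not from the thesis).
LITTER = frozenset({"litter"})
OTHER = frozenset({"other"})
EITHER = LITTER | OTHER  # mass on the whole frame expresses ignorance

cnn_mass = {LITTER: 0.6, OTHER: 0.1, EITHER: 0.3}
physics_mass = {LITTER: 0.5, OTHER: 0.2, EITHER: 0.3}

fused = dempster_combine(cnn_mass, physics_mass)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 3))
```

In this toy setting, mass assigned to the whole frame lets either source express uncertainty, which is one reason evidence-theoretic fusion is often considered for unfamiliar inputs; whether the thesis uses the rule in exactly this form is not stated in the abstract.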

Files

License info not available

File under embargo until 15-05-2026