Deep-learning-based computer vision for underwater environments
A. Ilioudi (TU Delft - Human-Robot Interaction)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Deep Learning (DL) has transformed computer vision, leading to significant progress in areas like autonomous vehicles and industrial automation. However, its application in underwater environments remains challenging due to factors such as light absorption, scattering, and water turbidity, which degrade image quality and hinder DL model performance. This thesis addresses these challenges by enhancing DL-based computer vision techniques for underwater scenarios, with a focus on autonomous robotic litter collection from the seabed. The work targets key limitations such as data scarcity, visual degradation, and scene variability, and proposes domain-informed approaches to improve model generalization. The contributions cover the full DL pipeline, starting with the design of representative training data that supports object detection in shallow-water conditions, providing a benchmark for training and evaluating detection algorithms. A multi-robot system is developed that integrates aerial, surface, and underwater vehicles to perform collaborative litter detection and collection. The thesis presents the system design, its deployment, and the role of computer vision in the operational workflow. To address image degradation, an automated framework is proposed for selecting image enhancement methods based on task-specific performance metrics. Furthermore, environment-specific neural networks are introduced to handle variability in turbidity and lighting. Generalization to Out-of-Distribution (OOD) data is also addressed through a hybrid classification framework that combines a Convolutional Neural Network (CNN) with a physics-based classifier using the Moving Horizon Estimation (MHE) framework. Their outputs are fused via Dempster-Shafer theory to enable decision-making in unfamiliar scenarios. Finally, domain-informed neural networks are proposed to integrate physics-based knowledge into the DL pipeline via knowledge distillation. This method improves generalization and reduces dependence on large labeled datasets.
The proposed methods are validated through simulation and real-world deployments, demonstrating improved performance and adaptability. Together, these contributions provide an integrated framework for deploying DL-based perception systems in challenging underwater environments.
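
As an illustration of the Dempster-Shafer fusion step mentioned in the abstract, the sketch below applies Dempster's rule of combination to two belief assignments. It is a minimal example under stated assumptions, not the implementation used in the thesis: the two-class frame ("litter" / "other"), the mass values, and the names `dempster_combine`, `cnn_mass`, and `physics_mass` are all hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of hypotheses
    to its mass; the masses in each dict are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the shared hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint hypotheses contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical frame of discernment and mass values (illustrative only).
LITTER = frozenset({"litter"})
OTHER = frozenset({"other"})
EITHER = LITTER | OTHER  # mass assigned here expresses ignorance

cnn_mass = {LITTER: 0.6, OTHER: 0.1, EITHER: 0.3}
physics_mass = {LITTER: 0.5, OTHER: 0.2, EITHER: 0.3}

fused = dempster_combine(cnn_mass, physics_mass)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 3))
```

Mass placed on the full set `EITHER` lets a classifier express uncertainty instead of committing to a class, which is one reason this kind of fusion can be attractive when inputs are unfamiliar.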
Files
File under embargo until 15-05-2026