Self-supervised learning for multi-label sewer defect classification

Journal Article (2026)
Author(s)

Tugba Yildizli (TU Delft - Water Systems Engineering)

Tianlong Jia (TU Delft - Water Systems Engineering, Karlsruher Institut für Technologie)

Jeroen Langeveld (TU Delft - Water Systems Engineering, Partners4UrbanWater)

Riccardo Taormina (TU Delft - Water Systems Monitoring & Modelling)

Research Group
Water Systems Engineering
DOI related publication
https://doi.org/10.1016/j.autcon.2025.106751
More Info
Publication Year
2026
Language
English
Volume number
182
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Automated sewer defect detection has advanced through deep learning, particularly supervised methods applied to CCTV images, but these methods depend on large annotated datasets. This paper proposes a self-supervised learning (SSL) approach to reduce labeling demands. The method comprises self-supervised pre-training on unlabeled images using SwAV (Swapping Assignments between multiple Views), followed by fine-tuning for multi-label classification. Experiments on the Sewer-ML dataset demonstrate that the SSL approach, trained on only 35k labeled images, achieves an F1-score of 69.11% and an F2CIW of 54.22%, surpassing the fully supervised baseline trained from scratch on 1.04 million images. Increasing the amount of unlabeled pre-training data further improves performance, and ImageNet initialization consistently outperforms training from scratch. Self-supervised learning also helps mitigate the effects of mislabeled data, which is observed even in the Sewer-ML ground truth. Overall, self-supervised learning provides an accurate, scalable, and cost-effective alternative to fully supervised approaches, particularly in data-scarce or imperfectly labeled scenarios.
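
The abstract reports F2CIW without defining it. The following is a minimal sketch, assuming F2CIW denotes a class-importance-weighted average of per-class F2 scores, as the metric is commonly described for the Sewer-ML benchmark. The function names, the toy labels, and the weights are hypothetical and are not taken from the paper.

```python
def fbeta(tp, fp, fn, beta):
    """F-beta from confusion counts; returns 0.0 when the score is undefined."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def per_class_counts(y_true, y_pred, n_classes):
    """Count (tp, fp, fn) per class for binary multi-label rows."""
    counts = [[0, 0, 0] for _ in range(n_classes)]
    for t_row, p_row in zip(y_true, y_pred):
        for c in range(n_classes):
            if p_row[c] and t_row[c]:
                counts[c][0] += 1  # true positive
            elif p_row[c]:
                counts[c][1] += 1  # false positive
            elif t_row[c]:
                counts[c][2] += 1  # false negative
    return counts


def weighted_f2(y_true, y_pred, weights):
    """Importance-weighted mean of per-class F2 (assumed form of F2CIW)."""
    counts = per_class_counts(y_true, y_pred, len(weights))
    scores = [fbeta(tp, fp, fn, 2.0) for tp, fp, fn in counts]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Toy example: 4 images, 2 hypothetical defect classes.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [1, 0]]
weights = [1.0, 0.5]  # made-up importance weights, not Sewer-ML values
score = weighted_f2(y_true, y_pred, weights)
```

With beta = 2, recall counts more heavily than precision, which suits inspection settings where a missed defect is costlier than a false alarm. The weights let severe defect classes dominate the aggregate score.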