Self-supervised learning for multi-label sewer defect classification

Journal Article (2026)

Author(s)

Tugba Yildizli (TU Delft - Water Systems Engineering)

Tianlong Jia (TU Delft - Water Systems Engineering, Karlsruhe Institut für Technologie)

Jeroen Langeveld (TU Delft - Water Systems Engineering, Partners4UrbanWater)

Riccardo Taormina (TU Delft - Water Systems Monitoring & Modelling)

Computer vision; Transfer learning; Asset management; Semi-supervised learning; Sewer defect classification

DOI related publication

https://doi.org/10.1016/j.autcon.2025.106751 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:a459e506-b730-4f05-b7d0-25eececf0abe

Publication Year

2026

Language

English

Journal title

Automation in Construction

Volume number

182

Article number

106751

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Automated sewer defect detection has advanced through deep learning, particularly supervised methods applied to CCTV images, but these methods depend on large annotated datasets. This paper proposes a semi-supervised learning (SSL) approach to reduce labeling demands. The method comprises self-supervised pre-training on unlabeled images using SwAV (Swapping Assignments between multiple Views), followed by fine-tuning for multi-label classification. Experiments on the Sewer-ML dataset demonstrate that the SSL approach, trained on only 35k labeled images, achieves an F1-score of 69.11% and an F2_CIW of 54.22%, surpassing the fully supervised baseline trained from scratch on 1.04 million images. Increasing the amount of unlabeled pre-training data further improves performance, and ImageNet initialization consistently outperforms training from scratch. Self-supervised learning also helps mitigate the effects of mislabeled data, which are observed even in the Sewer-ML ground truth. Overall, self-supervised learning provides an accurate, scalable, and cost-effective alternative to fully supervised approaches, particularly in data-scarce or imperfectly labeled scenarios.
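For readers unfamiliar with the reported metrics, the sketch below shows how an F-beta score can be computed per class for multi-label predictions and then averaged, either uniformly or with per-class importance weights. It is only an illustration under stated assumptions: the abstract does not say how the F1-score is averaged, the weighted form assumes that F2_CIW is a class-importance-weighted mean of per-class F2 scores, and the function names, toy labels, and weights are hypothetical rather than taken from the article or from Sewer-ML.

```python
from typing import List, Optional, Sequence, Tuple

Labels = Sequence[Sequence[int]]  # one row per image, one 0/1 entry per class


def f_beta(tp: int, fp: int, fn: int, beta: float) -> float:
    """F-beta from raw counts; 0.0 when a class has no positives and no predictions."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def class_counts(y_true: Labels, y_pred: Labels, c: int) -> Tuple[int, int, int]:
    """True positives, false positives, and false negatives for class index c."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
    return tp, fp, fn


def averaged_f_beta(
    y_true: Labels,
    y_pred: Labels,
    beta: float,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted mean of per-class F-beta scores (uniform weights if none are given)."""
    n_classes = len(y_true[0])
    w: List[float] = list(weights) if weights is not None else [1.0] * n_classes
    scores = [f_beta(*class_counts(y_true, y_pred, c), beta) for c in range(n_classes)]
    return sum(wi * si for wi, si in zip(w, scores)) / sum(w)


# Toy data: 4 images, 2 hypothetical defect classes.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [1, 0]]

macro_f1 = averaged_f_beta(y_true, y_pred, beta=1.0)
# Hypothetical importance weights; real values would come from the dataset definition.
weighted_f2 = averaged_f_beta(y_true, y_pred, beta=2.0, weights=[1.0, 3.0])
print(macro_f1, weighted_f2)
```

With beta = 2, recall counts more heavily than precision, which suits inspection settings where a missed defect is costlier than a false alarm.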

Files

1-s2.0-S0926580525007915-main.... (pdf)

(pdf | 2.64 Mb)