Copy-Pasting Coherent Depth Regions Improves Contrastive Learning for Urban-Scene Segmentation

Conference Paper (2022)
Author(s)

L. Zeng (Student TU Delft)

Attila Lengyel (TU Delft - Pattern Recognition and Bioinformatics)

Nergis Tömen (TU Delft - Pattern Recognition and Bioinformatics)

Jan van Gemert (TU Delft - Pattern Recognition and Bioinformatics)

Research Group
Pattern Recognition and Bioinformatics
Copyright
© 2022 L. Zeng, A. Lengyel, N. Tömen, J.C. van Gemert
Publication Year
2022
Language
English
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In this work, we leverage estimated depth to boost self-supervised contrastive learning for segmentation of urban scenes, where unlabeled videos are readily available for training self-supervised depth estimation. We argue that the semantics of a coherent group of pixels in 3D space is self-contained and invariant to the contexts in which they appear. We group semantically related pixels into coherent depth regions based on their estimated depth and use copy-paste to synthetically vary their contexts. In this way, cross-context correspondences are built in contrastive learning and a context-invariant representation is learned. For unsupervised semantic segmentation of urban scenes, our method surpasses the previous state-of-the-art baseline by +7.14% in mIoU on Cityscapes and +6.65% on KITTI. For fine-tuning on Cityscapes and KITTI segmentation, our method is competitive with existing models, yet it requires no pre-training on ImageNet or COCO and is more computationally efficient. Our code is available at https://github.com/LeungTsang/CPCDR.
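
The copy-paste idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes, purely for illustration, that regions are formed by binning depth values into quantiles, which is a crude stand-in for the paper's grouping into coherent depth regions. The function names `depth_regions` and `copy_paste_region`, the parameter `num_bins`, and the random toy inputs are all hypothetical.

```python
import numpy as np


def depth_regions(depth, num_bins=4):
    """Label each pixel with the index of its depth quantile bin.

    Assumption: quantile binning is only a simple placeholder for
    grouping pixels into coherent depth regions.
    """
    # Interior quantile edges, e.g. the 25%, 50%, 75% depths for 4 bins.
    edges = np.quantile(depth, np.linspace(0.0, 1.0, num_bins + 1)[1:-1])
    # np.digitize returns labels in the range 0 .. num_bins - 1.
    return np.digitize(depth, edges)


def copy_paste_region(src_img, src_labels, dst_img, region_id):
    """Paste one region of src_img onto a copy of dst_img at the same location."""
    mask = src_labels == region_id
    out = dst_img.copy()
    out[mask] = src_img[mask]
    return out, mask


# Toy data standing in for two frames and an estimated depth map.
rng = np.random.default_rng(0)
src = rng.random((8, 8, 3))
dst = rng.random((8, 8, 3))
depth = rng.random((8, 8))

labels = depth_regions(depth)
mixed, mask = copy_paste_region(src, labels, dst, region_id=0)
```

Under these assumptions, the pixels selected by `mask` are identical in `src` and `mixed` but sit in different surroundings, so they could serve as a cross-context positive pair for a contrastive objective.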

Files

0893.pdf
(pdf | 10.6 Mb)
License info not available