Semantic segmentation (pixel-level classification) of remotely sensed imagery has proven useful for applications such as land cover mapping, object detection, change detection and land-use analysis. Deep learning algorithms called convolutional neural networks (CNNs) have been shown to outperform traditional computer vision and machine learning approaches in tackling semantic segmentation tasks. Furthermore, the addition of height information (Z) to aerial imagery (RGB) is believed to improve segmentation results. However, open questions remain: to what extent does height information add value, what is the best way to combine RGB information with height information, and what type of height information is most useful? This study aims to answer these questions. The CNN architectures FCN-8s, SegNet, U-Net and FuseNet-SF5 are trained to semantically segment 10 cm resolution true ortho imagery of Haarlem, optionally augmented with height information. The resulting topographic maps contain the classes building, road, water and other. Experiments are conducted to compare 1) models trained on RGB and on RGB-Z, 2) models combining RGB and height information through data fusion and through data stacking, and 3) models trained using different types of absolute and relative height. Performance is compared using the (mean) intersection over union (IoU) metric and through visual assessment of the predicted maps. The results indicate that, on average, segmentation performance improves by approximately 1 percent when absolute height information is added. The building class benefits the most from the addition of height information.
Furthermore, extracting features from height information in a separate encoder and fusing these into the RGB feature maps leads to a higher overall segmentation quality than providing height information as a stacked extra band processed in the same encoder as the RGB information. Finally, models using relative height deliver a higher segmentation quality than models using absolute height, especially for large objects. The best-performing model, a FuseNet-SF5 trained on RGB imagery and pixel-level relative height, achieved a mean IoU of 0.8427 and IoUs of 0.8744, 0.7865, 0.9131 and 0.7966 for the classes building, road, water and other, respectively. This model correctly classified over 90% of the pixels of 67% of all objects present in the ground truth. Overall, this study shows that, for semantic segmentation of aerial RGB imagery, 1) height information can improve segmentation results, 2) adding height information through data fusion can result in a higher segmentation quality than data stacking, and 3) providing a network with relative rather than absolute height can improve semantic segmentation quality.
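For reference, the per-class IoU scores reported above are computed as the intersection over the union of the predicted and ground-truth masks for each class. A minimal NumPy sketch (with a hypothetical two-class toy example, not the study's evaluation code):

```python
import numpy as np

def iou_per_class(pred, truth, n_classes):
    """Per-class intersection over union for integer-labelled maps."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, truth == c).sum()
        union = np.logical_or(pred == c, truth == c).sum()
        # Undefined (NaN) when the class is absent from both maps
        ious.append(inter / union if union else float("nan"))
    return ious

# Toy 2x3 label maps (0 = other, 1 = building), for illustration only
pred = np.array([[1, 1, 0], [0, 0, 0]])
truth = np.array([[1, 0, 0], [1, 0, 0]])
print(iou_per_class(pred, truth, 2))  # [0.6, 0.3333...]
```

The mean IoU is then the average of the per-class scores, here taken over the four classes building, road, water and other.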