Building classification of VHR airborne stereo images using fully convolutional networks and free training samples

Journal Article (2018)

Author(s)

Y. Chen (Student TU Delft)

W. Gao (TU Delft - Urban Data Science)

E. Widyaningrum (TU Delft - Optical and Laser Remote Sensing)

M. Zheng (TU Delft - OLD Department of GIS Technology)

K. Zhou (TU Delft - Optical and Laser Remote Sensing)

Research Group

Urban Data Science

DOI related publication

https://doi.org/10.5194/isprs-archives-XLII-4-87-2018

Keywords

FCN, Atrous convolution, Base map, Building classification, Fine tuning, Free training samples, Mislabels, VHR airborne stereo images

To reference this document use:

https://resolver.tudelft.nl/uuid:70995006-c9b3-4dac-a199-9eb3f9263232

Publication Year

2018

Language

English

Issue number

4

Volume number

42

Pages (from-to)

155-160

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Semantic segmentation of very high resolution (VHR) airborne images, especially of buildings, is an important task in urban mapping applications. Deep learning has improved significantly in recent years and is now widely applied in computer vision. Fully Convolutional Networks (FCNs) are among the most popular methods because of their good performance and high computational efficiency. However, the state-of-the-art results of deep networks depend on training with large-scale benchmark datasets. Unfortunately, benchmarks of VHR images are limited and generalize poorly to other areas of interest. Because high-precision base maps are readily available and objects in urban areas do not change dramatically, map information can be used to label images and so produce training samples. Apart from object changes between maps and images caused by time differences, the maps often do not match the images perfectly. In this study, the main sources of mislabeling, such as relief displacement, differences in representation between the base map and the image, and occluded areas in the image, are considered and addressed by utilizing stereo images. These free training samples are then fed to a pre-trained FCN. To obtain better results, we applied fine-tuning with different learning rates and with different layers frozen. We further improved the results by introducing atrous convolution. Using free training samples, we achieve a promising building classification with 85.6% overall accuracy and an 83.77% F1 score, whereas the ISPRS benchmark result obtained with manual labels has 92.02% overall accuracy and an 84.06% F1 score; the gap is attributed to the complexity of the buildings in our study area.
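The abstract names atrous convolution without describing it. As a minimal illustration, and not the authors' implementation, the sketch below applies a small kernel with a dilation rate so that the same number of weights covers a wider area of the image. The function name `atrous_conv2d`, the toy 5x5 image, and the box kernel are hypothetical and chosen only for this example; a real FCN would use a deep learning framework's dilated convolution layer instead.

```python
def atrous_conv2d(image, kernel, rate=1):
    """Valid-mode 2D atrous (dilated) cross-correlation on nested lists.

    rate=1 is an ordinary convolution; rate=r leaves r-1 pixels between
    kernel taps, so a k x k kernel spans (k-1)*r + 1 pixels per side.
    """
    kh, kw = len(kernel), len(kernel[0])
    span_h = (kh - 1) * rate + 1  # effective kernel height after dilation
    span_w = (kw - 1) * rate + 1  # effective kernel width after dilation
    out_h = len(image) - span_h + 1
    out_w = len(image[0]) - span_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel")
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    # Sample the input every `rate` pixels.
                    acc += image[i + a * rate][j + b * rate] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out


# Toy 5x5 input (assumption for illustration): pixel value = 5*row + col.
image = [[float(5 * r + c) for c in range(5)] for r in range(5)]
kernel = [[1.0] * 3 for _ in range(3)]  # 3x3 box filter, 9 weights

dense = atrous_conv2d(image, kernel, rate=1)    # each output sees 3x3 pixels
dilated = atrous_conv2d(image, kernel, rate=2)  # each output sees 5x5 pixels
```

With `rate=2` the nine weights sample every second pixel, so the receptive field grows from 3x3 to 5x5 without adding parameters, which is the property that makes atrous convolution attractive for dense labeling of large objects such as buildings.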

Files

Isprs_archives_XLII_4_87_2018.... (pdf)

(pdf | 1.39 Mb)