Floor count from street view imagery using learning-based façade parsing

Master thesis (2023)

Authors

D.J. Dobson Architecture and the Built Environment

Contributors

G.A.K. Arroyo Ohori Urban Data Science - Architecture and the Built Environment (supervisor 1)

N. Ibrahimli Urban Data Science - Architecture and the Built Environment (supervisor 2)

Faculty

Architecture and the Built Environment

Keywords

Deep Learning, Kernel Density Estimation, Facade, Automatic Floor count, Street View Imagery, Façade parsing, Floor counting, Number of storeys, Image rectification

To reference this document use:

http://resolver.tudelft.nl/uuid:857658d4-9134-4935-9c28-fef907d9ace1

Published Date

20-01-2023

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Street view imagery (SVI) is one of the largest and fastest-growing resources in urban analytics. It offers a global close-up of the urban environment that is rich in untapped information, such as floor count. Floor count is useful in many applications, from improving energy consumption calculations to creating 3D city models without elevation data. So far, efforts to extract floor count from SVI have mainly approached it as a classification problem using convolutional neural networks (CNNs). Limitations of this approach include the need for large, manually annotated datasets and uncertainty about how these models learn to count storeys. Therefore, we aim to develop a method that can be trained on available datasets and that determines floor count in a more explainable manner. To make the floor count determination more transparent, we mimic the row-wise counting of storeys as humans do: by vertically parsing a column of windows (and the occasional door). Façade parsing is a common computer vision task that can be solved with deep learning. In this work, we employ the Mask R-CNN framework, trained on publicly available datasets, for the detection and segmentation of windows and doors. The vertical distribution of the detected or segmented windows and doors is then estimated by computing a kernel density estimate (KDE). The floor count is extracted by counting the maxima of this function, since each maximum corresponds to a horizontal band that is densely populated with windows and doors (i.e. a storey). To improve the results, automatic image rectification is added as a pre-processing step that enforces the regularity and repetitive occurrence of windows and doors. The full pipeline thus consists of three stages: 1) automatic image rectification, 2) window and door detection/segmentation with Mask R-CNN, 3) floor count estimation via maxima finding on the KDE function.
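The third stage described above (maxima finding on the KDE) can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes that each detected window or door is reduced to the vertical centre of its bounding box, uses a plain Gaussian kernel, and picks a bandwidth of 4% of the image height purely for illustration. The function names `gaussian_kde_1d` and `count_floors` are hypothetical.

```python
import math


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
        for g in grid
    ]


def count_floors(y_centres, image_height, bandwidth=None, grid_size=512):
    """Estimate the floor count as the number of local maxima of the KDE.

    y_centres: vertical centres (in pixels) of detected windows and doors.
    bandwidth: kernel width in pixels; the default of 4% of the image
               height is an assumption made for this sketch only.
    """
    if not y_centres:
        return 0
    if bandwidth is None:
        bandwidth = 0.04 * image_height

    # Sample the density on a regular grid spanning the image height.
    step = image_height / (grid_size - 1)
    grid = [i * step for i in range(grid_size)]
    density = gaussian_kde_1d(y_centres, grid, bandwidth)

    # A grid point is a local maximum if it rises above its upper
    # neighbour and does not fall below its lower neighbour.
    peaks = 0
    for i in range(1, grid_size - 1):
        if density[i] > density[i - 1] and density[i] >= density[i + 1]:
            peaks += 1
    return peaks


# Toy façade: three rows of openings in a 600 px tall rectified image.
toy_centres = [98, 102, 100, 101, 299, 303, 300, 297, 500, 502, 498, 501]
floors = count_floors(toy_centres, image_height=600)
```

In this sketch the bandwidth controls how readily neighbouring rows merge into one peak, which is one reason rectification matters: in a rectified image the openings of one storey share nearly the same vertical position, so their density forms a single sharp maximum.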
In addition, a small "wild" dataset was created with higher variability in floor count, image quality and architectural styles, which better reflects real-world SVI than existing façade datasets. The floor count performance of the full pipeline was evaluated on the Amsterdam Facade (subset), ECP, TRIMS and "wild SVI" datasets. Since floor count annotations were missing, these were added manually. For detection-based data, the best results are an accuracy of 83% and a mean absolute error (MAE) of 0.17. For normalised segmentation-based data, the best results are an accuracy of 80% and an MAE of 0.20. Considering that the method is still in its infancy, the results are promising. With further improvements to the pipeline and the addition of automatic façade acquisition, the approach can contribute to large-scale extraction of floor count information from SVI. To encourage further development, the pipeline prototype, dataset and floor count annotations are open source and will be released on https://github.com/Dobberzoon/Facade2Floorcount.
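For readers unfamiliar with the two reported metrics, the following sketch shows how accuracy and MAE could be computed from predicted and manually annotated floor counts. The function name `floor_count_metrics` is hypothetical, and the numbers in the example are invented toy values, not results from the thesis.

```python
def floor_count_metrics(predicted, annotated):
    """Return (accuracy, mean absolute error) for paired floor counts."""
    if not predicted or len(predicted) != len(annotated):
        raise ValueError("expected two non-empty lists of equal length")
    n = len(predicted)
    # Accuracy: share of façades whose predicted count matches exactly.
    accuracy = sum(p == a for p, a in zip(predicted, annotated)) / n
    # MAE: average size of the error, measured in storeys.
    mae = sum(abs(p - a) for p, a in zip(predicted, annotated)) / n
    return accuracy, mae


# Invented toy values: four façades, one of them off by a single storey.
accuracy, mae = floor_count_metrics([2, 3, 4, 3], [2, 3, 5, 3])
```

Accuracy only rewards exact matches, while MAE also reflects how far off a wrong prediction is, so the two figures are best read together.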

Files

KEYNOTE_P5_5152739.pdf

(.pdf | 66 Mb)

P5_THESIS_DOBSON_5152739.pdf

(.pdf | 63.8 Mb)