D.J. Dobson
2 records found
Street view imagery (SVI) is one of the largest, and still growing, resources in urban analytics. It offers a global close-up of the urban environment that is rich in largely untapped information, such as floor count. Floor count is useful in many applications, from improving energy consumption calculations to creating 3D city models without elevation data. So far, efforts to extract floor count from SVI have mainly been approached as a classification problem using convolutional neural networks (CNNs). Limitations of this approach include the need for large, manually annotated datasets and uncertainty about how these models learn to count storeys. Therefore, we aim to develop a method that can be trained on available datasets and determines floor count in a more explainable manner.
To make the floor count determination more transparent, we mimic the row-wise counting of storeys as humans do it: by vertically parsing a column of windows (and the occasional door). Façade parsing is a common computer vision task that can be solved with deep learning. In this work, we employ the Mask R-CNN framework, trained on publicly available datasets, for the detection and segmentation of windows and doors. The vertical distribution of the detected or segmented windows and doors is then estimated by computing a kernel density estimate (KDE). The floor count is extracted by counting the maxima of this function, as each maximum represents a dense horizontal band of windows and doors (i.e. a storey). To improve the results, automatic image rectification is added as a pre-processing step that enforces the regularity and repetitive occurrence of windows and doors. The full pipeline thus consists of three stages: 1) automatic image rectification, 2) window and door detection/segmentation with Mask R-CNN, 3) floor count estimation via maxima finding on the KDE function. In addition, a small "wild" dataset was created that contains a higher variability in floor count, image quality and architectural styles, and therefore reflects real-world SVI better than existing façade datasets.
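The third stage can be illustrated with a minimal sketch. It is not the released pipeline: it assumes the detector output has already been reduced to a list of vertical centre coordinates (in pixels), and the function names, the Gaussian kernel, the bandwidth and the relative peak threshold are all illustrative assumptions.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a 1-D Gaussian KDE of `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(y):
        return norm * sum(
            math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in samples
        )

    return density


def count_floors(y_centres, image_height, bandwidth=20.0, min_rel_height=0.1):
    """Count local maxima of the KDE over the vertical positions of detections.

    y_centres      : vertical centre (pixels) of each detected window or door
    image_height   : height of the rectified facade image in pixels
    bandwidth      : kernel width in pixels (assumed value, needs tuning)
    min_rel_height : ignore maxima below this fraction of the highest peak
    """
    if not y_centres:
        return 0
    density = gaussian_kde(y_centres, bandwidth)
    # Sample the density once per pixel row.
    values = [density(float(y)) for y in range(int(image_height) + 1)]
    threshold = min_rel_height * max(values)
    peaks = 0
    for i in range(1, len(values) - 1):
        is_peak = values[i] > values[i - 1] and values[i] >= values[i + 1]
        if is_peak and values[i] >= threshold:
            peaks += 1
    return peaks


# Hypothetical detections: three rows of windows in a 500-pixel-high image.
centres = [100, 102, 98, 250, 255, 248, 400, 405, 398]
print(count_floors(centres, image_height=500))
```

The bandwidth controls how far apart two rows of windows must be before they register as separate storeys, so in practice it would probably need to scale with image height or window size.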
The floor count performance of the full pipeline was evaluated on the Amsterdam Facade (subset), ECP, TRIMS and "wild SVI" datasets. Since floor count annotations were missing, they were added manually. For detection-based data, the best results are an accuracy of 83% and a mean absolute error (MAE) of 0.17. For normalised segmentation-based data, the best results are an accuracy of 80% and an MAE of 0.20. Considering that the method is still in its infancy, the results are promising. With further improvements to the pipeline and the addition of automatic façade acquisition, the approach can contribute to large-scale extraction of floor count information from SVI. To encourage further development, the pipeline prototype, dataset and floor count annotations are open source and will be released on https://github.com/Dobberzoon/Facade2Floorcount.
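For reference, the two reported metrics can be computed as in the sketch below. The function name and the sample values are hypothetical and are not taken from the evaluation data.

```python
def floor_count_metrics(predicted, actual):
    """Return (accuracy, mean absolute error) for per-image floor counts."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    # Accuracy: fraction of images whose floor count is exactly right.
    accuracy = sum(p == a for p, a in zip(predicted, actual)) / n
    # MAE: average size of the error, in storeys.
    mae = sum(abs(p - a) for p, a in zip(predicted, actual)) / n
    return accuracy, mae


# Hypothetical example: one of four facades is off by a single storey.
accuracy, mae = floor_count_metrics([3, 4, 2, 5], [3, 4, 3, 5])
print(accuracy, mae)  # 0.75 0.25
```

When every wrong prediction is off by exactly one storey, the MAE equals one minus the accuracy. The reported pairs (83% with 0.17, 80% with 0.20) are consistent with that pattern, although the abstract does not state it.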
Student report
(2021)
-
D.J. Dobson, H. Dong, N. van der Horst, L.M. Langhorst, J.A.J. van der Vaart, Z. Wu, L. Nan, S. Du, Dirk Voets
Storing accurate models of complex geometries in a compact way has become an increasingly challenging issue, especially when dealing with large datasets. One such dataset is Cobra-Groeninzicht's database of all trees in the Netherlands. In the gaming industry, a new technique is being used to generate tree models: the L-system. An L-system stores a string representation of the structural model of a tree, with the added possibility of recursive modelling using growing rules. This format is a promising alternative to more traditional methods of storing complex geometries. However, it remains unclear whether it can be an accurate enough representation for modelling and analysing real-life trees.
In this research project, the AdTree algorithm is used to reconstruct a skeleton from a point cloud of a single tree. This skeleton is then transformed into an L-system string format, as well as a CityJSON format (both structured as JSON). The L-system format has the advantage that it allows for several methods of increasing its compactness further (growing, generalisation). The overall size of these files also indicates that less storage space is needed to store the tree geometry. The quality of the L-system skeleton is nearly equal to that of the input skeleton. Assuming it can be read and drawn using a Turtle program, the L-system thus allows for storing the same geometric information more compactly than traditional storage formats, with sufficient accuracy, and with the added possibilities of growing or generalising the model.
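To illustrate how an L-system string can encode branching geometry and be drawn by a turtle-style reader, the sketch below expands a growing rule and converts the result into 2D line segments. The alphabet (`F`, `+`, `-`, `[`, `]`), the fixed step length and the fixed angle are textbook conventions assumed here for illustration; the project's actual symbol set, 3D rotations and JSON layout are not given in this abstract.

```python
import math


def expand(axiom, rules, iterations):
    """Apply the rewriting (growing) rules to the axiom a number of times."""
    current = axiom
    for _ in range(iterations):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


def interpret(lsystem, step=1.0, angle_deg=25.0):
    """Turn an L-system string into a list of 2D segments ((x0, y0), (x1, y1)).

    F : move forward by `step`, drawing a segment
    + : turn left by `angle_deg`;  - : turn right by `angle_deg`
    [ : push the turtle state (start of a branch);  ] : pop it (end of branch)
    """
    x, y, heading = 0.0, 0.0, 90.0  # start at the origin, pointing up
    stack, segments = [], []
    for symbol in lsystem:
        if symbol == "F":
            rad = math.radians(heading)
            nx, ny = x + step * math.cos(rad), y + step * math.sin(rad)
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif symbol == "+":
            heading += angle_deg
        elif symbol == "-":
            heading -= angle_deg
        elif symbol == "[":
            stack.append((x, y, heading))
        elif symbol == "]":
            x, y, heading = stack.pop()
    return segments


# A trunk that forks into two branches, grown for two iterations.
rules = {"F": "F[+F][-F]"}
tree = expand("F", rules, 2)
print(len(tree), "symbols encode", len(interpret(tree)), "segments")
```

The compactness argument is visible even at this scale: the axiom and a single rule are enough to regenerate every segment, whereas an explicit geometry format would store each segment's coordinates.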