Structure Guided Roof Heightmap Completion via Diffusion Model

Master Thesis (2025)
Author(s)

X. Zhao (TU Delft - Architecture and the Built Environment)

Contributor(s)

H. Ledoux – Mentor (TU Delft - Urban Data Science)

W. Gao – Mentor (TU Delft - Urban Data Science)

Azarakhsh Rafiee – Graduation committee member (TU Delft - Digital Technologies)

R.Y. Peters – Graduation committee member (TU Delft - Urban Data Science)

Faculty
Architecture and the Built Environment
Publication Year
2025
Language
English
Graduation Date
19-06-2025
Awarding Institution
Delft University of Technology
Programme
Architecture, Urbanism and Building Sciences
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Urban digital twins rely on accurate rooftop geometry, yet airborne lidar point clouds are frequently sparse and incomplete, leading to substantial information loss in building reconstruction. This thesis investigates diffusion-based learning as a remedy for high-fidelity roof recovery under severe data corruption.

This thesis proposes a two-stage framework that operates on 2.5D height-map representations. Stage I introduces a dual-task diffusion model that jointly performs roof height-map completion and roof-line prediction. A novel Bidirectional Control Module enables reciprocal conditioning between the two tasks, enforcing geometric consistency during the denoising process. Stage II employs a patch-based diffusion upsampler equipped with positional embeddings and a domain-specific global context encoder to synthesise high-resolution height maps while remaining computationally tractable for large and variably sized buildings. A rigorous preprocessing pipeline further yields two challenging benchmarks, S80_i30 and S80_i80, derived from 160k real-world building samples.
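To make the idea of patch-based processing concrete, the following is a minimal sketch of how a variably sized height map could be tiled into fixed-size patches, each tagged with a normalised position that a positional embedding could consume. It is an illustration only: the function name `extract_patches`, the `patch_size` and `stride` parameters, and the normalisation scheme are assumptions made here, not details taken from the thesis.

```python
def extract_patches(height_map, patch_size, stride):
    """Tile a 2D height map (list of rows) into square patches.

    Returns a list of ((row_pos, col_pos), patch) pairs, where the position
    is the patch's top-left corner normalised to [0, 1]. This is a
    hypothetical scheme; the thesis may tile and embed positions differently.
    """
    rows, cols = len(height_map), len(height_map[0])

    def starts(n):
        # Start offsets along one axis; maps smaller than a patch yield a
        # single (undersized) patch, since no padding is applied here.
        if n <= patch_size:
            return [0]
        offsets = list(range(0, n - patch_size + 1, stride))
        if offsets[-1] != n - patch_size:
            offsets.append(n - patch_size)  # make sure the far edge is covered
        return offsets

    patches = []
    for r0 in starts(rows):
        for c0 in starts(cols):
            patch = [row[c0:c0 + patch_size]
                     for row in height_map[r0:r0 + patch_size]]
            position = (r0 / max(rows - 1, 1), c0 / max(cols - 1, 1))
            patches.append((position, patch))
    return patches


# Toy 5 x 6 height map whose cell value encodes its (row, column) index.
toy_map = [[r * 10 + c for c in range(6)] for r in range(5)]
toy_patches = extract_patches(toy_map, patch_size=4, stride=2)
```

Because each patch has a fixed size, the cost of one upsampling pass in such a scheme would depend on the patch size rather than on the building footprint, which is the usual motivation for patch-based designs.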

Extensive experiments conducted on these datasets demonstrate the effectiveness of the proposed approach. Under moderate corruption (S80_i30), the completion model attains an RMSE of 0.89 m and a Chamfer distance of 0.06, improving upon the state-of-the-art RoofDiffusion baseline by 13.2% and 17.3%, respectively. In the severe setting (S80_i80), the method sustains a 13.5% RMSE reduction. The upsampling stage delivers a further 10% RMSE improvement over the best classical interpolator, and the end-to-end pipeline achieves RMSE values of 0.91 m (moderate) and 1.42 m (severe).
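For readers unfamiliar with the reported metric, the sketch below shows one common way to compute RMSE between a predicted and a reference height map, and how a relative reduction against a baseline is obtained. The optional footprint mask, the function names, and the toy values are assumptions for illustration; the thesis's exact evaluation protocol is not described in this abstract.

```python
import math


def rmse(pred, truth, mask=None):
    """Root-mean-square error between two height maps given as lists of rows.

    If `mask` is provided, only cells where the mask is truthy are evaluated
    (for example, cells inside the building footprint).
    """
    squared_sum = 0.0
    count = 0
    for r, (pred_row, truth_row) in enumerate(zip(pred, truth)):
        for c, (p, t) in enumerate(zip(pred_row, truth_row)):
            if mask is not None and not mask[r][c]:
                continue
            squared_sum += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("no valid cells to evaluate")
    return math.sqrt(squared_sum / count)


def relative_reduction(baseline_error, model_error):
    """Fractional error reduction of a model relative to a baseline."""
    return (baseline_error - model_error) / baseline_error


# Toy heights in metres; these are not values from the thesis.
truth = [[3.0, 3.0], [4.0, 4.0]]
pred = [[3.0, 4.0], [4.0, 2.0]]
footprint = [[True, True], [True, False]]

full_error = rmse(pred, truth)               # all four cells
masked_error = rmse(pred, truth, footprint)  # last cell excluded
```

Under this definition, a reported reduction of 13.2% means the model's error is 0.868 times the baseline's error on the same data.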

The thesis contributes: (i) a structurally aware diffusion framework for roof completion, (ii) a scalable patch-based upsampler, and (iii) public benchmarks that reflect real lidar degradation. Collectively, these advances narrow the gap between theoretical research and the practical generation of LOD2.2 building models, facilitating more reliable urban analytics and planning applications.

Files

Thesis.pdf
(pdf | 15.9 Mb)
License info not available