Urban digital twins rely on accurate rooftop geometry, yet airborne lidar point clouds are frequently sparse and incomplete, leading to substantial information loss in building reconstruction. This thesis investigates diffusion-based learning as a means of recovering high-fidelity roofs under severe data corruption.
It proposes a two-stage framework that operates on 2.5D height-map representations. Stage~I introduces a dual-task diffusion model that jointly performs roof height-map completion and roof-line prediction. A novel Bidirectional Control Module enables reciprocal conditioning between the two tasks, enforcing geometric consistency during the denoising process. Stage~II employs a patch-based diffusion upsampler equipped with positional embeddings and a domain-specific global context encoder to synthesise high-resolution height maps while remaining computationally tractable for large and variably sized buildings. A rigorous preprocessing pipeline further yields two challenging benchmarks, \textsc{S80\_i30} and \textsc{S80\_i80}, derived from 160k real-world building samples.
Experiments on both benchmarks demonstrate the effectiveness of the proposed approach. Under moderate corruption (\textsc{S80\_i30}), the completion model attains an \textit{RMSE} of \textbf{0.89}~m and a Chamfer distance of \textbf{0.06}, improving upon the state-of-the-art RoofDiffusion baseline by 13.2\% and 17.3\%, respectively. In the severe setting (\textsc{S80\_i80}), the method sustains a 13.5\% \textit{RMSE} reduction. The upsampling stage delivers an additional 10\% \textit{RMSE} gain over the best classical interpolator, and the end-to-end pipeline achieves \textit{RMSE} values of 0.91~m (moderate) and 1.42~m (severe).
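For reference, the \textit{RMSE} figures can be read against the standard definition sketched below. The notation is an assumption introduced here, not taken from the thesis: $\hat{H}$ and $H$ denote the predicted and ground-truth height maps, $\Omega$ the set of evaluated roof pixels, and $\Delta$ the relative reduction with respect to a baseline error $\mathrm{RMSE}_{\mathrm{base}}$. Reading the reported percentages as relative reductions of this form is likewise an assumption.
\[
  \mathrm{RMSE} = \sqrt{\frac{1}{|\Omega|} \sum_{p \in \Omega} \bigl( \hat{H}(p) - H(p) \bigr)^{2}}
\]
\[
  \Delta = \frac{\mathrm{RMSE}_{\mathrm{base}} - \mathrm{RMSE}}{\mathrm{RMSE}_{\mathrm{base}}}
\]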
The thesis contributes: (i) a structurally aware diffusion framework for roof completion, (ii) a scalable patch-based upsampler, and (iii) public benchmarks that reflect real lidar degradation. Collectively, these advances close a critical gap between theoretical research and the practical generation of LOD2.2 building models, facilitating more reliable urban analytics and planning applications.