Reconstruction and Rendering of Buildings as Radiance Fields for View Synthesis


In inspection and display scenarios, reconstructing and rendering the entire surface of a building is a critical step in presenting its overall condition. Most building-reconstruction work is based on point clouds because they are widely available. In recent years, neural radiance fields (NeRF) have become a common approach to novel view synthesis. Compared with traditional 3D graphics methods, NeRF-based models can produce photorealistic images with rich detail that point-cloud-based methods cannot offer. We therefore investigate the performance of this technique on architectural scenes and look for ways to improve it for larger scenes.

This thesis explores the ability of NeRF-based models to reconstruct large-scale scenes. NeRF uses a fully-connected network to predict the volume density and view-dependent emitted radiance at each spatial location; these values are then projected into an image through classic volume rendering techniques. Because of near-field ambiguity and the difficulty of parameterizing unbounded scenes, the original NeRF does not perform well on 360° input views, especially when the inputs are sparse. To address this limitation, an inverted sphere parameterization that facilitates free view synthesis is introduced, so that the foreground and the background can be trained separately. We also compare the performance of two casting geometries: rays and cones. To reconstruct the scene accurately, the raw RGB images must first be pre-processed to estimate the corresponding camera parameters. Finally, customized camera paths are prepared to generate the final rendered video.
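The two building blocks named above, classic volume rendering and the inverted sphere parameterization, can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function names `composite` and `inverted_sphere` are hypothetical, the densities and colours are passed in directly rather than predicted by a network, and the background mapping assumes the scene has already been normalised so that the foreground lies inside the unit sphere.

```python
import numpy as np

def composite(sigmas, colors, deltas):
    """Volume-rendering quadrature along a single ray.

    sigmas: (N,) volume densities at the N samples
    colors: (N, 3) emitted RGB radiance at the samples
    deltas: (N,) distances between adjacent samples
    Returns the composited RGB value and the per-sample weights.
    """
    # Opacity contributed by each ray segment.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance T_i = prod_{j < i} (1 - alpha_j): light surviving up to sample i.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = trans * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights

def inverted_sphere(points):
    """Map points outside the unit sphere to (x/r, y/r, z/r, 1/r).

    Assumption: every input point has norm r > 1 (background region).
    The first three outputs lie on the unit sphere and 1/r lies in (0, 1),
    so arbitrarily distant points get bounded coordinates.
    """
    r = np.linalg.norm(points, axis=-1, keepdims=True)
    return np.concatenate([points / r, 1.0 / r], axis=-1)
```

In this sketch a foreground model would be queried with ordinary Euclidean coordinates inside the unit sphere and a background model with the four-dimensional output of `inverted_sphere`, with both sets of samples composited along the same ray.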

Our experiments show that training the foreground and background separately is a promising way to solve practical large-scale scene reconstruction problems. A complete wrap-around view of the target building can be obtained by adjusting the camera path parameters. Furthermore, introducing conical frustum casting into the original model provides an alternative way to perform the reconstruction. We name this method mip-NeRF++; it improves the final results to some extent.
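As an illustration of what a wrap-around camera path could look like, the following sketch generates camera-to-world poses on a circle around a target point. It is an assumption-laden example, not the path generator used in the thesis: `look_at` and `orbit_path` are hypothetical names, the world up direction is taken to be +z, and cameras follow the convention of looking along their own -z axis with +y up.

```python
import numpy as np

def look_at(eye, target, up):
    """Build a 4x4 camera-to-world pose for a camera at `eye` facing `target`.

    Assumed convention: the camera looks along its -z axis, with +x right and +y up.
    Degenerate if the viewing direction is parallel to `up`.
    """
    forward = target - eye
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    pose = np.eye(4)
    pose[:3, 0] = right
    pose[:3, 1] = true_up
    pose[:3, 2] = -forward
    pose[:3, 3] = eye
    return pose

def orbit_path(center, radius, height, n_frames):
    """Return (n_frames, 4, 4) poses evenly spaced on a circle around `center`.

    radius: horizontal distance from the center (must be > 0)
    height: camera elevation above the center along the assumed +z up axis
    """
    center = np.asarray(center, dtype=float)
    up = np.array([0.0, 0.0, 1.0])
    poses = []
    for theta in np.linspace(0.0, 2.0 * np.pi, n_frames, endpoint=False):
        offset = np.array([radius * np.cos(theta), radius * np.sin(theta), height])
        poses.append(look_at(center + offset, center, up))
    return np.stack(poses)
```

Rendering one frame per pose and concatenating the frames would give a full 360° video; `radius`, `height`, and `n_frames` stand in for the kind of path parameters that would be adjusted per building.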