Bringing a Personal Point of View: Evaluating Dynamic 3D Gaussian Splatting for Egocentric Scene Reconstruction

Master Thesis (2025)
Author(s)

J. Warchocki (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.C. van Gemert – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Michael Weinmann – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
18-06-2025
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Egocentric video provides a unique view into human perception and interaction, with growing relevance for augmented reality, robotics, and assistive technologies. However, rapid camera motion and complex scene dynamics pose major challenges for 3D reconstruction from this perspective. While 3D Gaussian Splatting (3DGS) has become a state-of-the-art method for efficient, high-quality novel view synthesis, variants that focus on reconstructing dynamic scenes from monocular video are rarely evaluated on egocentric video. It remains unclear whether existing models generalize to this setting or whether egocentric-specific solutions are needed. In this work, we evaluate dynamic monocular 3DGS models on egocentric and exocentric video using paired ego-exo recordings from the EgoExo4D dataset. We find that reconstruction quality is consistently lower in egocentric views. Analysis reveals that the difference in reconstruction quality, measured in peak signal-to-noise ratio, stems from the reconstruction of static, not dynamic, content. Our findings underscore current limitations and motivate the development of egocentric-specific approaches, while also highlighting the value of separately evaluating static and dynamic regions of a video.
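The static/dynamic split mentioned in the abstract can be illustrated with a short sketch: PSNR is computed only over the pixels selected by a per-pixel mask, once for the static background and once for the dynamic foreground. The snippet below is a minimal, hypothetical example and not the thesis's actual evaluation code; the function name, mask source, and value range are assumptions made for illustration.

```python
import numpy as np

def masked_psnr(pred, target, mask, max_val=1.0):
    """PSNR over only the pixels selected by a boolean mask.

    pred, target: float arrays in [0, max_val], shape (H, W, 3).
    mask: boolean array of shape (H, W); True marks pixels to evaluate
          (e.g. static background or dynamic foreground).
    """
    # Broadcast the (H, W) mask across the colour channels.
    mask3 = np.repeat(mask[..., None], pred.shape[-1], axis=-1)
    mse = np.mean((pred[mask3] - target[mask3]) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# Hypothetical usage with a per-pixel dynamic-object mask for one frame:
# psnr_static  = masked_psnr(rendered, ground_truth, ~dynamic_mask)
# psnr_dynamic = masked_psnr(rendered, ground_truth, dynamic_mask)
```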

Files

Master_Thesis_Warchocki.pdf
(pdf | 28.7 MB)
License info not available