Semantic Supervision and Representation Design in 3D Gaussian Splatting for Urban Scene Understanding

Master Thesis (2026)

Author(s)

H.E. Chassagnette (TU Delft - Mechanical Engineering)

Contributor(s)

Holger Caesar – Mentor (TU Delft - Mechanical Engineering)

M. Weinmann – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Mechanical Engineering

Autonomous Vehicles Robotics Semantic 3D Gaussian Splatting

To reference this document use

https://resolver.tudelft.nl/uuid:29cbbd42-f383-4b67-893d-d6650e14b352

More Info

Publication Year

2026

Language

English

Graduation Date

02-04-2026

Awarding Institution

Delft University of Technology

Programme

Mechanical Engineering, Vehicle Engineering, Cognitive Robotics

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

3D Gaussian Splatting (3DGS) has recently emerged as an efficient method for high-fidelity scene reconstruction in autonomous driving environments. While semantic information has been incorporated into Gaussian-based representations for scene understanding tasks, its broader potential for influencing the training process remains unexplored.
This thesis investigates how semantic supervision can be integrated into 3DGS training through several semantic-aware strategies, including alternative semantic loss functions, weighting schemes, and semantic-guided densification mechanisms. In addition, we explore different ways of organising RGB and semantic information within the representation. Since RGB appearance and semantic information differ in complexity, we compare a joint Gaussian representation, where RGB and semantic supervision act on the same primitives, with a separated Gaussian representation, where semantic information is modelled by an independent Gaussian set.
Experimental results show that the choice of semantic classification loss is the dominant factor influencing semantic performance, while auxiliary strategies do not provide significant improvements. Furthermore, we observe a clear trade-off between representation designs: the joint representation achieves stronger semantic performance at the cost of degraded RGB reconstruction quality, whereas the separated representation preserves RGB fidelity with minimal degradation while still achieving good semantic performance. These findings highlight the trade-offs between representations and motivate the exploration of hybrid organisations that better balance RGB reconstruction quality and semantic performance.
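
To make the idea of semantic supervision with a weighting scheme concrete, the following is a minimal sketch, not the thesis implementation. It assumes that a joint representation renders both an RGB value and a vector of semantic logits per pixel, and that the two are supervised together. The function names, the plain L1 photometric term, the class-weighted cross-entropy, and the balancing factor `lambda_sem` are all illustrative assumptions; the abstract does not specify which losses or weights were used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(pixel_logits, labels, class_weights):
    """Class-weighted mean cross-entropy over rendered semantic logits.

    pixel_logits: list of per-pixel logit lists (one value per class).
    labels: list of ground-truth class indices, one per pixel.
    class_weights: hypothetical per-class weights (e.g. to upweight rare classes).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, label in zip(pixel_logits, labels):
        prob_true = softmax(logits)[label]
        w = class_weights[label]
        weighted_sum += -w * math.log(prob_true)
        weight_total += w
    return weighted_sum / weight_total


def l1_loss(rendered_rgb, target_rgb):
    """Mean absolute error over all pixels and channels (simplified photometric term)."""
    diffs = [
        abs(r - t)
        for pixel_r, pixel_t in zip(rendered_rgb, target_rgb)
        for r, t in zip(pixel_r, pixel_t)
    ]
    return sum(diffs) / len(diffs)


def joint_loss(rendered_rgb, target_rgb, pixel_logits, labels,
               class_weights, lambda_sem=0.1):
    """Assumed joint objective: RGB term plus a scaled semantic term.

    lambda_sem is an illustrative balancing factor, not a value from the thesis.
    """
    rgb_term = l1_loss(rendered_rgb, target_rgb)
    sem_term = weighted_cross_entropy(pixel_logits, labels, class_weights)
    return rgb_term + lambda_sem * sem_term
```

In a joint representation, gradients from both terms would update the same Gaussians, which is one plausible reading of why semantic supervision can affect RGB quality. In a separated representation, the semantic term would instead act only on an independent Gaussian set.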

Files

ThesisReport_3DGS_HugoChassagn... (pdf)

(pdf | 2.5 Mb)

License info not available