SpatiaLLM

Bridging the Gap Between Natural Language and 3D Scans

Student Report (2025)
Author(s)

M.J. van der Meer (TU Delft - Architecture and the Built Environment)

H. Ye (TU Delft - Architecture and the Built Environment)

S.T. ter Braak (TU Delft - Architecture and the Built Environment)

J. Pille (TU Delft - Architecture and the Built Environment)

N. Singh (TU Delft - Architecture and the Built Environment)

Contributor(s)

L. Nan – Mentor (TU Delft - Urban Data Science)

Faculty
Architecture and the Built Environment
Publication Year
2025
Language
English
Graduation Date
13-11-2025
Awarding Institution
Delft University of Technology
Project
Synthesis Project 2025
Programme
Geomatics
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Recent advances in large language models (LLMs) have expanded natural language reasoning and multimodal understanding, but these models remain limited in grounding their reasoning in 3D spatial environments. This project addresses that gap by developing a system that enables natural language interaction with indoor spatial data derived from light detection and ranging (LiDAR) point clouds and panoramic imagery provided by the client, ScanPlan. The system processes the spatial data through a pipeline that includes room segmentation, geometric analysis, and object clustering. The resulting structured information is stored in an SQLite database, which an AI agent queries using a reasoning framework that translates natural language into actionable commands. The system supports multimodal input: users can interact via text or by selecting objects in 2D panoramas, and these selections are mapped to the 3D point clouds using Segment Anything Model 2 (SAM2). The interface combines a chat function with 2D and 3D viewers, making spatial data accessible to non-experts. While the prototype successfully answers a range of spatial and semantic queries, challenges remain in scaling room segmentation and in handling complex multi-room relationships. The project demonstrates a step towards making rich 3D building data queryable through intuitive, language-based interaction.
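
As a rough illustration of the database step described in the abstract, the sketch below shows how room and object information might be stored in SQLite and queried with a parameterized statement of the kind an agent could issue for a question such as "How many chairs are in the office?". The schema, table names, column names, sample rows, and the helpers `build_demo_db` and `count_objects` are all hypothetical and are not taken from the report.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the schema used in the report.
SCHEMA = """
CREATE TABLE rooms (
    room_id       INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    floor_area_m2 REAL
);
CREATE TABLE objects (
    object_id INTEGER PRIMARY KEY,
    room_id   INTEGER NOT NULL REFERENCES rooms(room_id),
    label     TEXT NOT NULL,
    cx REAL, cy REAL, cz REAL  -- centroid of the object's point cluster
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO rooms (room_id, name, floor_area_m2) VALUES (?, ?, ?)",
        [(1, "kitchen", 12.5), (2, "office", 18.0)],
    )
    conn.executemany(
        "INSERT INTO objects (room_id, label, cx, cy, cz) VALUES (?, ?, ?, ?, ?)",
        [
            (1, "chair", 0.5, 1.0, 0.4),
            (2, "chair", 3.2, 1.1, 0.4),
            (2, "chair", 4.0, 2.3, 0.4),
            (2, "desk", 3.6, 1.8, 0.7),
        ],
    )
    conn.commit()
    return conn


def count_objects(conn, label, room_name):
    """Count objects with a given label in a named room.

    The values are bound as parameters rather than pasted into the SQL
    string, so text extracted from a user's question cannot alter the query.
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM objects AS o "
        "JOIN rooms AS r ON r.room_id = o.room_id "
        "WHERE o.label = ? AND r.name = ?",
        (label, room_name),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    demo = build_demo_db()
    print(count_objects(demo, "chair", "office"))
```

In a setup like this, the language model would only need to extract the object label and room name from the user's question; the query itself stays fixed and auditable.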

Files

License info not available