X. Zhang | TU Delft Repository

Dynamic Patch Focus for Transformer-based Gaze Estimation

Master thesis (2025) - D. Sochirca (author), X. Zhang (mentor), N. Tömen (mentor), H. Wang (graduation committee member)

Appearance-based 3D gaze estimation must accommodate two conflicting needs: fine ocular detail and global facial context. Vanilla Vision Transformers (ViTs) struggle with both needs due to their fixed 16 × 16 patch grid that (i) fragments critical features like the eyes into mult ...

SpeechCAT: Cross-Attentive Transformer for Audio to Motion Generation

Master thesis (2025) - S. Deaconu (author), X. Zhang (mentor), J.C. van Gemert (mentor), H. Wang (graduation committee member)

Audio-to-motion generation is an important task with applications in virtual avatar creation for XR systems and intelligent robot control in daily life scenarios. Most current motion generation methods depend on a single encoder-decoder architecture to simultaneously model a ...

Towards Automatic Cerebral 3D-2D CTA-DSA Registration

Master thesis (2025) - C.A. Downs (author), Theo van Walsum (mentor), Ruisheng Su (mentor), P. Matthijs van der Sluijs (mentor), X. Zhang (mentor), T. Höllt (graduation committee member), M.J.T. Reinders (graduation committee member)

Stroke remains a leading cause of morbidity and mortality worldwide, despite advances in treatment modalities. Endovascular thrombectomy (EVT), a revolutionary intervention for ischemic stroke, is limited by its reliance on 2D fluoroscopic imaging, which lacks depth and compr ...

Gaze-Guided 3D Hand Motion Prediction for Detecting Intent in Egocentric Grasping Tasks

Master thesis (2024) - Y. He (author), A.H.A. Stienen (mentor), X. Zhang (mentor)

Human intention detection with hand motion prediction is critical for driving upper-extremity assistive robots. However, traditional methods relying on physiological signal measurement are restrictive and often lack environmental context. We propose a novel approach that int ...

Utilising 3D Gaussian Splatting for PointNet object classification

Exploring the potential of volume rendering techniques without using meshes

Bachelor thesis (2024) - D.J.A. van Dale (author), X. Zhang (mentor)

The demand for high-quality 3D visualizations has surged across various professional fields, prompting significant advancements in computer graphics. One such advancement is 3D Gaussian Splatting, a technique evolving from Lee Westover’s splatting concept introduced in 1990. This ...

Reducing Overfitting in 3D Gaussian Splatting using Depth Supervision

Bachelor thesis (2024) - T.H.B. Spanhoff (author), X. Zhang (mentor), M. Weinmann (graduation committee member)

3D Gaussian Splatting (3DGS) is a method for representing 3D scenes, but is prone to overfitting when trained with limited viewpoint diversity, often resulting in artifacts like floating Gaussians at incorrect depths. This paper addresses this issue by introducing 3D Gaussian S ...

Bridging the world of 2D and 3D Computer Vision

Self-Supervised Cross-modality Feature Learning through 3D Gaussian Splatting

Bachelor thesis (2024) - A. Simionescu (author), X. Zhang (mentor), M. Weinmann (graduation committee member)

Current robotic perception systems utilize a variety of sensors to estimate and understand a robot's surroundings. This paper focuses on a novel data representation technique that makes use of a recent scene reconstruction algorithm, known as 3D Gaussian Splatting, to explicitly ...

Semantic 3D segmentation of 3D Gaussian Splats

Assessing existing point cloud segmentation techniques on semantic segmentation of synthetic 3D Gaussian Splats scenes

Bachelor thesis (2024) - K.K. Jurski (author), X. Zhang (mentor), M. Weinmann (graduation committee member)

3D Gaussian Splatting (3DGS) is a promising 3D reconstruction and novel-view synthesis technique. However, the field of semantic 3D segmentation of 3D Gaussian Splats scenes remains largely unexplored. This paper discusses the challenges of performing 3D segmentation directly on ...

Application of Photogrammetry to Gaussian Splatting for mesh and texture reconstruction

Bachelor thesis (2024) - K.J. Kiisa (author), X. Zhang (mentor), M. Weinmann (graduation committee member)

Gaussian Splatting is a successful recent method for generating novel views of a scene based on photographs taken of that scene [1]. It uses rasterization to render the scenes it generates, which consist of 3D Gaussians. However, modern hardware and tools are designed ...

How can we reduce the effect of noise on 3D Gaussian Splats?

A Study on Deblurring and Recoloring Techniques to Enhance 3D Reconstructions

Bachelor thesis (2024) - T.G. Meijer (author), X. Zhang (mentor), M. Weinmann (mentor)

Multi-view image recognition is crucial for numerous applications such as autonomous vehicles and robotics, where accurate 3D reconstructions from 2D images are essential. However, the presence of various noise factors like motion blur, variable lighting, and changes in the field ...

From Points to Faces: An automotive lidar-based face recognition system

Master thesis (2023) - M.A.R.M. Humblet Vertongen (author), Holger Caesar (mentor), L. Peternel (coach), X. Zhang (coach)

Face recognition using lidar presents challenges arising from high dimensionality and data sparsity, especially at longer distances. This paper proposes a novel approach for face recognition via automotive lidar. The approach leverages a combination of deep learning and point clo ...

3D Kinematics Estimation with Biomechanics Model

Master thesis (2023) - Z.Y. Lin (author), J.C. van Gemert (mentor), X. Zhang (mentor), P. Kellnhofer (graduation committee member)

Human 3D kinematics estimation involves measuring joint angles and body segment scales to quantify and analyze the mechanics of human movements. It has applications in areas such as injury prevention, disease identification, and sports science. Conventional marker-based motion ca ...

Investigating Effects of Participant Variation on Performance of Visual Stimuli Reconstruction From fMRI Signals Using Machine Learning

Bachelor thesis (2023) - Q. Zheng (author), X. Zhang (mentor), N. Tömen (graduation committee member)

Image reconstruction from neural activation data is a field that has been growing in popularity with developments such as Neuralink in the brain-machine interface space. To make better decisions when collecting data for this purpose, it is important to know what qualities to opti ...

A study of the impact of CNN architecture variation on predicting brain activity using feature-weighted receptive fields

Bachelor thesis (2023) - V. Murgoci (author), X. Zhang (mentor), N. Tömen (graduation committee member)

This study investigates the relationship between deep learning models and the human brain, specifically focusing on the prediction of brain activity in response to static visual stimuli using functional magnetic resonance imaging (fMRI). By leveraging intermediate outputs of pre- ...

Noise Attacks as a First Layer of Privacy Protection in Semantic Data Extraction From Brain Activity

Bachelor thesis (2023) - T.C. Walter (author), X. Zhang (mentor)

This paper explores using synthetic noise superimposed on fMRI data to selectively impact the performance of the Generic Object Decoding (GOD) model developed at Kamitani Lab. The GOD model predicts image categories that subjects viewed, based on their recorded fMRI brain activit ...

Denoising task fMRI data for image reconstructions

Denoising of Functional Magnetic Resonance Imaging (fMRI) Data for Improved Visual Stimulus Reconstruction using Machine Learning

Bachelor thesis (2023) - N. Smolin (author), X. Zhang (mentor), N. Tömen (graduation committee member)

This study aims to investigate the impact of various denoising algorithms on the quality of visual stimulus reconstructions based on functional magnetic resonance imaging (fMRI) data. While fMRI provides a valuable, noninvasive method for assessing brain activity, the reliability ...

Identification of subjects from reconstructed images

Identification of individual subjects based on image reconstructions generated from fMRI brain scans

Bachelor thesis (2023) - A.G. Mercier (author), X. Zhang (mentor), N. Tömen (graduation committee member)

Reconstructing seen images from functional magnetic resonance imaging (fMRI) brain scans has been a growing topic of interest in the field of neuroscience, fostered by innovation in machine learning and AI. This paper investigates the possible presence of personal features allowi ...

Automatic Camera Extrinsics Estimation in the Catheterization Laboratory

Master thesis (2022) - J. Zeng (author), J.H.G. Dauwels (mentor), X. Zhang (mentor)

Surgical workflow analysis has gained importance in operating rooms, where it can help safeguard working conditions, the safety of both patients and surgical personnel, and working efficiency. Focusing on the optimization of the workflow, a set of camer ...

Automatic Detection of Mind-Wandering Based on Body and Hand Movements from “Mementos” Dataset

Bachelor thesis (2022) - A. Kārkliņš (author), B.J.W. Dudzik (mentor), X. Zhang (mentor), H.S. Hung (mentor), P.K. Murukannaiah (graduation committee member)

The aim of this research is to determine whether it is feasible to detect mind-wandering in individuals from their hand and body movements in video recordings. The basis for this research is the “Mementos” [9] dataset, containing over 2000 recordings of people watching ...

Automatic Detection of Mind-Wandering using Facial Expressions

Bachelor thesis (2022) - R. Kargul (author), B.J.W. Dudzik (mentor), X. Zhang (mentor), H.S. Hung (mentor), P.K. Murukannaiah (graduation committee member)

Spending time in front of screens has become an inescapable activity, which might be interrupted by unrelated external causes. While automatic approaches to identify mind-wandering (MW) have already been investigated, past research was done with self-reports or physiological data ...