X. Zhang

24 records found

Appearance-based 3D gaze estimation must accommodate two conflicting needs: fine ocular detail and global facial context. Vanilla Vision Transformers (ViTs) struggle with both needs due to their fixed 16 × 16 patch grid that (i) fragments critical features like the eyes into mult ...
Audio-to-motion generation is an important task with applications in virtual avatar creation for XR systems and intelligent robot control in daily life scenarios. Most current motion generation methods depend on a single encoder-decoder architecture to simultaneously model a ...
Stroke remains a leading cause of morbidity and mortality worldwide, despite advances in treatment modalities. Endovascular thrombectomy (EVT), a revolutionary intervention for ischemic stroke, is limited by its reliance on 2D fluoroscopic imaging, which lacks depth and compr ...
Human intention detection with hand motion prediction is critical to drive the upper-extremity assistive robots. However, the traditional methods relying on physiological signal measurement are restrictive and often lack environmental context. We propose a novel approach that int ...

Utilising 3D Gaussian Splatting for PointNet object classification

Exploring the potential of volume rendering techniques without using meshes

The demand for high-quality 3D visualizations has surged across various professional fields, prompting significant advancements in computer graphics. One such advancement is 3D Gaussian Splatting, a technique evolving from Lee Westover’s splatting concept introduced in 1990. This ...

Bridging the world of 2D and 3D Computer Vision

Self-Supervised Cross-modality Feature Learning through 3D Gaussian Splatting

Current robotic perception systems utilize a variety of sensors to estimate and understand a robot's surroundings. This paper focuses on a novel data representation technique that makes use of a recent scene reconstruction algorithm, known as 3D Gaussian Splatting, to explicitly ...
Gaussian Splatting is a successful recent method for generating novel views of a scene based on photographs taken from that scene [1]. It uses rasterization in order to render the scenes it generates, which consist of 3D Gaussians. However, modern hardware and tools are designed ...

Semantic 3D segmentation of 3D Gaussian Splats

Assessing existing point cloud segmentation techniques on semantic segmentation of synthetic 3D Gaussian Splats scenes

3D Gaussian Splatting (3DGS) is a promising 3D reconstruction and novel-view synthesis technique. However, the field of semantic 3D segmentation of 3D Gaussian Splats scenes remains largely unexplored. This paper discusses the challenges of performing 3D segmentation directly on ...
3D Gaussian Splatting (3DGS) is a method for representing 3D scenes, but is prone to overfitting when trained with limited viewpoint diversity, often resulting in artifacts like floating Gaussians at incorrect depths. This paper addresses this issue by introducing 3D Gaussian S ...

How can we reduce the effect of noise on 3D Gaussian Splats?

A Study on Deblurring and Recoloring Techniques to Enhance 3D Reconstructions

Multi-view image recognition is crucial for numerous applications such as autonomous vehicles and robotics, where accurate 3D reconstructions from 2D images are essential. However, the presence of various noise factors like motion blur, variable lighting, and changes in the field ...
Face recognition using lidar presents challenges arising from high dimensionality and data sparsity, especially at longer distances. This paper proposes a novel approach for face recognition via automotive lidar. The approach leverages a combination of deep learning and point clo ...
Human 3D kinematics estimation involves measuring joint angles and body segment scales to quantify and analyze the mechanics of human movements. It has applications in areas such as injury prevention, disease identification, and sports science. Conventional marker-based motion ca ...
Image reconstruction from neural activation data is a field that has been growing in popularity with developments such as Neuralink in the brain-machine interface space. To make better decisions when collecting data for this purpose, it is important to know what qualities to opti ...
This study investigates the relationship between deep learning models and the human brain, specifically focusing on the prediction of brain activity in response to static visual stimuli using functional magnetic resonance imaging (fMRI). By leveraging intermediate outputs of pre- ...

Denoising task fMRI data for image reconstructions

Denoising of Functional Magnetic Resonance Imaging (fMRI) Data for Improved Visual Stimulus Reconstruction using Machine Learning

This study aims to investigate the impact of various denoising algorithms on the quality of visual stimulus reconstructions based on functional magnetic resonance imaging (fMRI) data. While fMRI provides a valuable, noninvasive method for assessing brain activity, the reliability ...
This paper explores using synthetic noise superimposed on fMRI data to selectively impact the performance of the Generic Object Decoding (GOD) model developed at Kamitani Lab. The GOD model predicts image categories that subjects viewed, based on their recorded fMRI brain activit ...

Identification of subjects from reconstructed images

Identification of individual subjects based on image reconstructions generated from fMRI brain scans

Reconstructing seen images from functional magnetic resonance imaging (fMRI) brain scans has been a growing topic of interest in the field of neuroscience, fostered by innovation in machine learning and AI. This paper investigates the possible presence of personal features allowi ...
Surgical workflow analysis has gained more importance in operating rooms, which could take responsibility for the working condition, the safety of both patients and surgical personnel, as well as the working efficiency. Focusing on the optimization of the workflow, a set of camer ...
Mind wandering occurs when a person’s attention unintentionally shifts away from their current thought or task. Being able to automatically detect cases of mind wandering can assist applications with attention retention, and help people with maintaining focus. Many methods have b ...
Spending time in front of screens has become an inescapable activity, which might be interrupted by unrelated external causes. While automatic approaches to identify mind-wandering (MW) have already been investigated, past research was done with self-reports or physiological data ...