Gaze-Guided 3D Hand Motion Prediction for Detecting Intent in Egocentric Grasping Tasks

Conference Paper (2025)
Author(s)

Yufei He (Student TU Delft)

Xucong Zhang (TU Delft - Pattern Recognition and Bioinformatics)

Arno H.A. Stienen (TU Delft - Biomechatronics & Human-Machine Control)

Research Group
Pattern Recognition and Bioinformatics
DOI (related publication)
https://doi.org/10.1109/IROS60139.2025.11246573 (final published version)
Publication Year
2025
Language
English
Pages (from-to)
14580-14586
Publisher
IEEE
ISBN (print)
979-8-3315-4394-5
ISBN (electronic)
979-8-3315-4393-8
Event
2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025 (2025-10-19 - 2025-10-25), Hangzhou, China
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Human intention detection via hand motion prediction is critical for driving upper-extremity assistive robots in neurorehabilitation applications. However, traditional methods that rely on physiological signal measurements are restrictive and often lack environmental context. We propose a novel approach that predicts future sequences of both hand poses and joint positions. The method integrates gaze information, historical hand motion sequences, and environmental object data, adapting dynamically to the assistive needs of the patient without prior knowledge of the intended object of grasping. Specifically, we use a vector-quantized variational autoencoder (VQ-VAE) for robust hand pose encoding, combined with an autoregressive generative transformer for effective hand motion sequence prediction. We demonstrate the usability of these techniques in a pilot study with healthy subjects. To train and evaluate the proposed method, we collected a dataset of various grasp actions on different objects performed by multiple subjects. Through extensive experiments, we show that the proposed method successfully predicts sequential hand movements. In particular, gaze information significantly enhances prediction, especially when fewer input frames are available, highlighting the method's potential for real-world applications.
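The core idea of the pipeline (quantizing continuous hand poses into discrete tokens that an autoregressive transformer can predict) can be illustrated with a minimal vector-quantization sketch. This is not the paper's implementation; the pose dimensionality, codebook size, and random codebook below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (assumptions, not from the paper):
# 21 hand joints x 3D coordinates = 63-dim pose, 256 codebook entries.
POSE_DIM, CODEBOOK_SIZE = 63, 256
codebook = rng.normal(size=(CODEBOOK_SIZE, POSE_DIM))  # stands in for learned embeddings

def vq_encode(pose):
    """Quantize a continuous pose to the index of its nearest codebook vector."""
    dists = np.sum((codebook - pose) ** 2, axis=1)
    return int(np.argmin(dists))

def vq_decode(idx):
    """Map a discrete token back to its codebook embedding."""
    return codebook[idx]

# A hand motion sequence becomes a sequence of discrete tokens, which an
# autoregressive transformer (conditioned on gaze and object context in the
# paper) could then extend one step at a time.
poses = rng.normal(size=(10, POSE_DIM))   # 10 frames of hand poses
tokens = [vq_encode(p) for p in poses]

# Round trip: decoding a token and re-encoding it recovers the same token.
assert all(vq_encode(vq_decode(t)) == t for t in tokens)
```

The discrete token view is what makes next-step prediction tractable: instead of regressing a 63-dimensional pose directly, the model predicts one of a finite set of codebook indices, as in language modeling.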

Files

Taverne

File under embargo until 27-05-2026