FACET: Fast and Accurate Event-Based Eye Tracking Using Ellipse Modeling for Extended Reality

Conference Paper (2025)
Author(s)

Junyuan Ding (Beihang University)

Ziteng Wang (DVSense (Beijing) Technology Co., Ltd)

Chang Gao (TU Delft - Electronics)

Min Liu (DVSense (Beijing) Technology Co., Ltd)

Qinyu Chen (Universiteit Leiden)

Research Group
Electronics
DOI
https://doi.org/10.1109/ICRA55743.2025.11127327
Publication Year
2025
Language
English
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository as part of the Taverne amendment. More information about this copyright law amendment can be found at https://www.openaccess.nl. Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses Dutch legislation to make this work public.
Pages (from-to)
10347-10354
ISBN (print)
979-8-3315-4140-8
ISBN (electronic)
979-8-3315-4139-2
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Eye tracking is a key technology for gaze-based interaction in Extended Reality (XR), but traditional frame-based systems struggle to meet XR's demands for high accuracy, low latency, and power efficiency. Event cameras offer a promising alternative thanks to their high temporal resolution and low power consumption. In this paper, we present FACET (Fast and Accurate Event-based Eye Tracking), an end-to-end neural network that outputs pupil ellipse parameters directly from event data, optimized for real-time XR applications. The ellipse output can be fed directly into subsequent ellipse-based pupil trackers. To train the model, we enhance the EV-Eye dataset by expanding the annotated data and converting the original mask labels into ellipse-based annotations. In addition, we adopt a novel trigonometric loss to address angle discontinuities and propose a fast causal event volume representation. On the enhanced EV-Eye test set, FACET achieves an average pupil center error of 0.20 pixels and an inference time of 0.53 ms, reducing pixel error and inference time by 1.6× and 1.8× compared to the prior state of the art, EV-Eye, with 4.4× fewer parameters and 11.7× fewer arithmetic operations. The code is available at https://github.com/DeanJY/FACET.

Files

License info not available

File under embargo until 02-03-2026