Multi-representation Emotion Recognition in Immersive Environments

Master Thesis (2024)
Author(s)

Tony Yang (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Guohao Lan – Mentor (TU Delft - Embedded Systems)

X. Zhang – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

K.G. Langendoen – Graduation committee member (TU Delft - Embedded Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
22-10-2024
Awarding Institution
Delft University of Technology
Programme
Electrical Engineering | Embedded Systems
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This study addresses the lack of fine-grained emotion recognition in immersive environments that relies solely on data from on-board sensors. Two data representations of users' eyes are used: periocular recordings and eye-movement signals (gaze estimates and pupil measurements). A novel multi-representation method is proposed that integrates a feature extractor for each representation with an effective feature-fusion technique. The method significantly outperforms baselines that use only a single representation or that incorporate the content stimuli. It achieves an F1-score of 0.85 with 10% of the data (approximately 40 seconds covering all emotions) used for personal adaptation, recognizing emotions while users watch unseen parts of the stimuli used for adaptation. In a more practical scenario, the method achieves an F1-score of 0.71 with five seconds of personal adaptation data per emotion, recognizing emotions while users watch completely unseen stimuli. Under the same scenario but a more extreme condition, where only one second of adaptation data is available, the proposed method achieves an F1-score of 0.68.
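As a rough illustration of such a two-branch design, the sketch below encodes periocular frames with a small CNN and eye-movement signals with a GRU, then fuses the two feature vectors by concatenation before classification. This is a minimal sketch under assumed input shapes, layer sizes, emotion-class count, and PyTorch as the framework; it is not the thesis's actual architecture.

import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    """Minimal two-branch model: image encoder + signal encoder + fused classifier."""

    def __init__(self, num_emotions: int = 4, feat_dim: int = 128):
        super().__init__()
        # Branch 1: periocular frames (3x64x64 crops are an assumption).
        self.periocular = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # Branch 2: eye-movement time series; 3 channels assumed
        # (gaze x, gaze y, pupil diameter).
        self.eye_movement = nn.GRU(input_size=3, hidden_size=feat_dim,
                                   batch_first=True)
        # Fusion by concatenation, then a linear classification head.
        self.classifier = nn.Linear(2 * feat_dim, num_emotions)

    def forward(self, frames, signals):
        f_img = self.periocular(frames)                   # (B, feat_dim)
        _, h_n = self.eye_movement(signals)               # h_n: (1, B, feat_dim)
        fused = torch.cat([f_img, h_n.squeeze(0)], dim=1) # (B, 2 * feat_dim)
        return self.classifier(fused)                     # (B, num_emotions)

# Dummy forward pass: a batch of 8 clips.
model = TwoBranchFusion()
frames = torch.randn(8, 3, 64, 64)    # periocular crops
signals = torch.randn(8, 40, 3)       # 40 time steps of gaze + pupil
logits = model(frames, signals)       # shape: (8, 4)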
Furthermore, the study demonstrates that estimated labels can substitute for user-provided labels without sacrificing recognition performance, eliminating the need for users to manually label emotion-elicitation segments.
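One generic way to exploit estimated labels for personal adaptation is self-training: let the pretrained model label a short adaptation clip, then fine-tune on its own predictions. The sketch below reuses the TwoBranchFusion model from the previous block; the optimizer, learning rate, and step count are assumptions, and the thesis may estimate labels differently.

import torch
import torch.nn.functional as F

def adapt_with_estimated_labels(model, frames, signals, steps=10, lr=1e-4):
    """Fine-tune on a short unlabeled clip using the model's own
    predictions as targets (one plausible form of label estimation)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.eval()
    with torch.no_grad():
        pseudo = model(frames, signals).argmax(dim=1)  # estimated labels
    model.train()
    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.cross_entropy(model(frames, signals), pseudo)
        loss.backward()
        optimizer.step()
    return model

# Example: adapt on a few seconds of unlabeled per-user data.
adapted = adapt_with_estimated_labels(model, frames, signals)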
Future work will focus on improving performance through additional computational resources and architectural modifications, on deeper investigation of the decision-making process, and on developing real-time recognition systems for in-the-wild experiments.
The results suggest that this approach can enable more engaging, adaptive, and personalized experiences in immersive environments.

Files

License info not available

File under embargo until 22-10-2025