CorrNet

Fine-grained emotion recognition for video watching using wearable physiological sensors

Journal Article (2020)
Author(s)

Tianyi Zhang (TU Delft - Electrical Engineering, Mathematics and Computer Science, Centrum Wiskunde & Informatica (CWI))

Abdallah El Ali (Centrum Wiskunde & Informatica (CWI))

Chen Wang (Xinhua News Agency, Beijing)

Alan Hanjalic (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Pablo Cesar (TU Delft - Electrical Engineering, Mathematics and Computer Science, Centrum Wiskunde & Informatica (CWI))

Research Group
Multimedia Computing
DOI related publication
https://doi.org/10.3390/s21010052 Final published version
Publication Year
2020
Language
English
Issue number
1
Volume number
21
Article number
52
Pages (from-to)
1-25
Downloads counter
282
Collections
Institutional Repository
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Recognizing user emotions while they watch short-form videos anytime and anywhere is essential for facilitating video content customization and personalization. However, most works either classify a single emotion per video stimulus, or are restricted to static, desktop environments. To address this, we propose a correlation-based emotion recognition algorithm (CorrNet) to recognize the valence and arousal (V-A) of each instance (fine-grained segment of signals) using only wearable physiological signals (e.g., electrodermal activity, heart rate). CorrNet takes advantage of features both inside each instance (intra-modality features) and between different instances for the same video stimulus (correlation-based features). We first test our approach on an indoor-desktop affect dataset (CASE), and thereafter on an outdoor-mobile affect dataset (MERCA), which we collected using a smart wristband and a wearable eye tracker. Results show that for subject-independent binary classification (high-low), CorrNet yields promising recognition accuracies: 76.37% and 74.03% for V-A on CASE, and 70.29% and 68.15% for V-A on MERCA. Our findings show that: (1) instance segment lengths between 1–4 s result in the highest recognition accuracies; (2) accuracies between laboratory-grade and wearable sensors are comparable, even under low sampling rates (≤64 Hz); and (3) large amounts of neutral V-A labels, an artifact of continuous affect annotation, result in varied recognition performance.
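
To make the notion of an "instance" and of binary high-low labels more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes non-overlapping fixed-length instances, synthetic data, and a hypothetical threshold of 5.0 on the continuous annotation; the function names segment_into_instances and binarize_labels are invented for this example, and CorrNet's intra-modality and correlation-based feature extraction is not shown.

```python
import numpy as np


def segment_into_instances(signal, fs, instance_len_s):
    """Split a 1-D signal into non-overlapping instances.

    Assumption: instances do not overlap and trailing samples that do not
    fill a whole instance are dropped. The paper may segment differently.
    """
    signal = np.asarray(signal)
    samples_per_instance = int(round(fs * instance_len_s))
    n_instances = len(signal) // samples_per_instance
    trimmed = signal[: n_instances * samples_per_instance]
    return trimmed.reshape(n_instances, samples_per_instance)


def binarize_labels(annotation_instances, threshold):
    """Map the mean continuous annotation of each instance to 0 (low) or 1 (high).

    Assumption: the threshold is a hypothetical midpoint chosen for illustration.
    """
    return (annotation_instances.mean(axis=1) > threshold).astype(int)


fs = 64                      # Hz, matching the low sampling rate mentioned above
instance_len_s = 2.0         # within the 1-4 s range reported as working best
rng = np.random.default_rng(0)

# Synthetic stand-ins for a 10 s physiological signal and its valence annotation.
eda = rng.normal(size=fs * 10)
valence = np.repeat([2.0, 3.0, 6.0, 7.0, 8.0], int(fs * instance_len_s))

eda_instances = segment_into_instances(eda, fs, instance_len_s)
valence_instances = segment_into_instances(valence, fs, instance_len_s)
labels = binarize_labels(valence_instances, threshold=5.0)

print(eda_instances.shape)   # one row per instance
print(labels)                # one high/low label per instance
```

Each row of eda_instances would then be the unit for which a V-A label is predicted, which is what distinguishes this fine-grained setting from assigning a single emotion to a whole video.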