Exploring the impact of noise, language familiarity, and experimental settings on emotion recognition

Journal Article (2025)

Author(s)

Terry Amorese (Università degli Studi della Campania “Luigi Vanvitelli”)

Marialucia Cuciniello (Università degli Studi della Campania “Luigi Vanvitelli”)

Anna Alterio (Università degli Studi della Campania “Luigi Vanvitelli”)

Daniele Pepe (Università degli Studi della Campania “Luigi Vanvitelli”)

Odette Scharenborg (TU Delft - Multimedia Computing)

Gennaro Cordasco (Università degli Studi della Campania “Luigi Vanvitelli”)

Anna Esposito (Università degli Studi della Campania “Luigi Vanvitelli”)

Noise; Speech recognition; Language proficiency; Language understanding; Vocal emotion recognition

DOI related publication

https://doi.org/10.3389/fpsyg.2025.1548975 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:183e2fad-ecd0-44dc-b8da-abb7e74a05b0

Publication Year

2025

Language

English

Journal title

Frontiers in Psychology

Volume number

16

Article number

1548975

Downloads counter

106

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Introduction: This work aims to understand the contextual factors affecting speech emotion recognition (SER). More specifically, it investigates whether the identification of vocal emotional expressions of anger, fear, sadness, joy, and neutrality is affected by three factors: (a) the experimental setting, exploring vocal emotion recognition in both a controlled, soundproof laboratory and a more natural listening environment; (b) background noise in the stimuli: sentences were presented at three noise levels to gradually increase difficulty, namely one clear (no noise) condition and two noise conditions; (c) language familiarity, since the stimuli comprised Italian sentences and the participants were native Italian speakers and Dutch speakers who did not know Italian.

Method: Dutch and Italian participants took part in a vocal emotion recognition task carried out in two different experimental settings (realistic vs. laboratory). The stimuli were vocal utterances from the Italian EMOVO dataset, conveying anger, fear, sadness, joy, and neutrality, and were presented in three different noise conditions.

Results: Concerning the effect of the experimental setting, even under higher levels of background noise, individuals retained a remarkable ability to discern emotional nuances conveyed through voice. Regarding familiarity with the language, differences in emotion recognition performance between the Italian and Dutch listeners were observed, but the error magnitude was contingent on the emotional categories. Higher noise levels reduced accuracy, but people could still discern emotions, especially prosody.

Conclusion: The study highlighted that emotion recognition is influenced by variables such as listening context, background noise, and language familiarity. These results could be useful for developing robust SER systems and improving human-computer interaction.

Files

Fpsyg-1-1548975.pdf

(pdf | 1.27 Mb)