An Exploratory Examination of Objective Intelligibility Metrics Under Reverberant Conditions

Bachelor Thesis (2025)
Author(s)

M. Jin (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Jorge Martinez – Mentor (TU Delft - Multimedia Computing)

Dimme de Groot – Mentor (TU Delft - Multimedia Computing)

P. Przemysław – Graduation committee member (TU Delft - Embedded Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
31-01-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Clear communication over public address systems is essential, especially in environments where safety or clarity of information is critical. Speech intelligibility is often assessed with objective intelligibility metrics (OIMs), which predict intelligibility through mathematical models. These metrics perform well under mild reverberation but face challenges in highly reverberant environments and with non-European languages such as Mandarin. This study examines three intrusive OIMs (ESTOI, HASPI, and SIIB\textsubscript{Gauss}) in two respects: (1) how the metrics perform under different reverberation conditions for English, using STIPA as a reference, and (2) how robust the metrics are across languages, assessed by comparing the variances of their scores for Mandarin and English. The results show that the variances of the scores predicted by the tested metrics are equal for Mandarin and English. For English, HASPI, ESTOI, and SIIB\textsubscript{Gauss} demonstrate similar performance across a broad range of reverberation conditions (T60 from 0.05 s to 7 s), contradicting the theory that most intrusive intelligibility metrics struggle under severe reverberation \cite{galbrun_speech_2016}. The findings highlight the need for further research into potential biases in OIMs and their performance across languages. Incorporating listening tests could provide a more solid evaluation of these metrics under diverse conditions and for different languages.

Files

Final_paper_MIngyi.pdf
(pdf | 0.409 Mb)
License info not available