Evaluating selection criteria for functions mapping objective speech intelligibility predictions to subjective scores

Bachelor Thesis (2025)
Author(s)

B. Tekin (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Jorge Martinez – Mentor (TU Delft - Multimedia Computing)

Dimme de Groot – Mentor (TU Delft - Multimedia Computing)

P. Przemysław – Graduation committee member (TU Delft - Embedded Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
31-01-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Objective speech intelligibility metrics (OIMs) are widely used in various fields, including public service announcements. These metrics do not directly predict the intelligibility of speech (defined as the ratio of understandable words in an audio sample), but produce values that tend to increase monotonically with intelligibility. Mapping functions, typically logistic models, are applied to the raw objective scores to produce accurate intelligibility predictions. However, there is no standard methodology for choosing the best mapping curve, so researchers tend to reuse curves originally meant for other datasets and OIMs. This research evaluates existing candidate models, as well as new candidates based on simple heuristics, using the Akaike Information Criterion (AIC), a method developed specifically for model selection. The criterion affirmed the logistic mapping functions chosen for the objective intelligibility metrics STOI and MIKNN, and highlighted alternative models for SIIB and SIIBgauss. However, because the dataset contains too few listening conditions, strong inferences could not easily be drawn from the data.
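To make the selection idea in the abstract concrete, the following minimal sketch compares two candidate mapping functions by AIC. It is not taken from the thesis. The data points are synthetic and purely illustrative. The two-parameter logistic form, the straight-line alternative, the coarse grid search, and the least-squares version of AIC (n ln(RSS/n) + 2k, with k counting the fitted parameters plus the noise variance) are all assumptions made for the example.

```python
import math

def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit with Gaussian errors: n*ln(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k

def logistic(d, a, b):
    """Assumed two-parameter logistic mapping from objective score d to [0, 1]."""
    return 1.0 / (1.0 + math.exp(a * d + b))

def rss(model, xs, ys):
    """Residual sum of squares of a model over the data."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))

def fit_linear(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_logistic_grid(xs, ys):
    """Coarse grid search over (a, b); a real study would use a proper optimizer."""
    best = None
    for i in range(-300, 1):          # a in [-30.0, 0.0], step 0.1
        a = i / 10
        for j in range(0, 201):       # b in [0.0, 20.0], step 0.1
            b = j / 10
            err = rss(lambda x: logistic(x, a, b), xs, ys)
            if best is None or err < best[0]:
                best = (err, a, b)
    return best[1], best[2]

# Synthetic, illustrative data only: objective scores and word-correct ratios.
noise = [0.02, -0.02, 0.01, -0.01, 0.02, -0.02, 0.01, -0.01, 0.02]
xs = [0.1 * i for i in range(1, 10)]
ys = [min(1.0, max(0.0, logistic(x, -12.0, 6.0) + e)) for x, e in zip(xs, noise)]
n = len(xs)

a, b = fit_logistic_grid(xs, ys)
slope, intercept = fit_linear(xs, ys)

# Each model has 2 fitted parameters plus the noise variance, so k = 3.
aic_logistic = aic_least_squares(rss(lambda x: logistic(x, a, b), xs, ys), n, 3)
aic_linear = aic_least_squares(rss(lambda x: slope * x + intercept, xs, ys), n, 3)

print("AIC logistic:", aic_logistic)
print("AIC linear:  ", aic_linear)
print("Preferred:", "logistic" if aic_logistic < aic_linear else "linear")
```

The model with the lower AIC is preferred. Because both candidates here have the same number of parameters, the comparison reduces to goodness of fit; the penalty term 2k only matters when candidates differ in complexity.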

Files

Final_5452023.pdf
(pdf | 0.531 Mb)
License info not available