Evaluating selection criteria for functions mapping objective speech intelligibility predictions to subjective scores
B. Tekin (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Jorge Martinez – Mentor (TU Delft - Multimedia Computing)
Dimme de Groot – Mentor (TU Delft - Multimedia Computing)
P. Przemysław – Graduation committee member (TU Delft - Embedded Systems)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Objective speech intelligibility metrics (OIMs) are widely used in various fields, including public service announcements. These metrics do not directly predict speech intelligibility (defined as the ratio of understandable words in an audio sample), but produce values that tend to increase monotonically with it. Mapping functions, typically logistic models, are therefore applied to the raw objective scores to produce accurate predictions. However, there is no standard methodology for choosing the best mapping curve, so researchers tend to reuse curves originally meant for other datasets and OIMs. This research applies the Akaike Information Criterion (AIC), a method developed specifically for model selection, to evaluate existing candidate models as well as new ones based on simple heuristics. The criterion affirmed the logistic mapping functions chosen for the objective intelligibility metrics STOI and MIKNN, and highlighted alternative models for SIIB and SIIBgauss. However, because the dataset contains too few listening conditions, strong inferences could not easily be drawn from the data.
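To illustrate the kind of comparison the abstract describes, the following is a minimal sketch of AIC-based selection between two candidate mapping functions. It is not the thesis's implementation: the data points are invented for illustration, the function names are hypothetical, the least-squares form of AIC assumes independent Gaussian errors, and a coarse grid search stands in for a proper optimizer.

```python
import math

# Hypothetical (objective score, fraction of words understood) pairs.
# These values are invented for illustration and are not from the thesis.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
intelligibility = [0.03, 0.04, 0.13, 0.25, 0.52, 0.72, 0.89, 0.94, 0.99]


def logistic(d, a, b):
    """Two-parameter logistic mapping from an objective score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(a * d + b))


def linear(d, slope, intercept):
    """Straight-line mapping, used here as a simple competing candidate."""
    return slope * d + intercept


def rss(model, params, xs, ys):
    """Residual sum of squares of a model on the data."""
    return sum((y - model(x, *params)) ** 2 for x, y in zip(xs, ys))


def aic(rss_value, n, k):
    """Least-squares AIC under i.i.d. Gaussian errors (constants dropped)."""
    return n * math.log(rss_value / n) + 2 * k


def fit_linear(xs, ys):
    """Closed-form ordinary least squares for slope and intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_logistic_grid(xs, ys):
    """Coarse grid search over (a, b); a stand-in for a real optimizer."""
    candidates = (
        (-20.0 + 0.5 * i, 0.25 * j) for i in range(41) for j in range(41)
    )
    return min(candidates, key=lambda p: rss(logistic, p, xs, ys))


n = len(scores)
linear_params = fit_linear(scores, intelligibility)
logistic_params = fit_logistic_grid(scores, intelligibility)

# k counts the two curve parameters plus the estimated error variance.
aic_linear = aic(rss(linear, linear_params, scores, intelligibility), n, k=3)
aic_logistic = aic(rss(logistic, logistic_params, scores, intelligibility), n, k=3)

best = "logistic" if aic_logistic < aic_linear else "linear"
print(f"AIC linear = {aic_linear:.1f}, AIC logistic = {aic_logistic:.1f}, preferred: {best}")
```

The candidate with the lower AIC is preferred. Both candidates here have the same number of parameters, so the comparison reduces to goodness of fit; the 2k penalty only changes the ranking when the candidates differ in complexity.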