Estimating Reverberation Time by a Function of Intrusive Speech Intelligibility Measures

Bachelor Thesis (2024)
Author(s)

M.R. de Groot (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Jorge Martinez – Graduation committee member (TU Delft - Multimedia Computing)

Dimme de Groot – Mentor (TU Delft - Multimedia Computing)

Maria Soledad Pera – Graduation committee member (TU Delft - Web Information Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
27-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A Room Impulse Response (RIR) is a mathematical model of sound propagation in a room. Estimating RIR parameters such as the reverberation time (T60) allows Automatic Speech Recognition (ASR) systems to adapt their behavior to the reverberation present in their input signals. Currently, machine learning techniques provide the most accurate T60 estimates. We propose a novel methodology that uses intrusive Speech Intelligibility Measures (SIMs) beyond their traditional application. In this study we use SIIB, SIIB^Gauss, STOI and ESTOI as SIMs. For each SIM we find a best-fit curve with respect to T60 using a statistical approach. The statistical analysis is applied to simulated RIRs generated with the Image Source Method. The estimator based on SIIB^Gauss achieves the lowest Mean Squared Error, 0.353, on simulated data. Although this does not outperform state-of-the-art models, we offer recommendations for possible improvements. Preliminary experiments suggest that improving noise robustness is crucial and that the estimators could generalize to real-world scenarios. However, further research is necessary to confirm this.
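
To make the quantity being estimated concrete, the following minimal sketch shows what T60 means for a given RIR: the time it takes the sound energy to decay by 60 dB. It is not the method of the thesis, which estimates T60 from SIM values. It uses Schroeder backward integration, a standard technique swapped in here purely for illustration. The toy RIR (exponentially decaying noise) is also an assumption and does not come from the Image Source Method. The names synthetic_rir and estimate_t60 and all parameter values are hypothetical.

```python
import math
import random


def synthetic_rir(t60, fs=16000, duration=1.0, seed=0):
    """Toy RIR (assumption): Gaussian noise under an exponential envelope.

    The amplitude envelope 10^(-3 t / t60) drops by 60 dB at t = t60.
    """
    rng = random.Random(seed)
    n = int(duration * fs)
    return [rng.gauss(0.0, 1.0) * 10.0 ** (-3.0 * (i / fs) / t60) for i in range(n)]


def estimate_t60(rir, fs, db_start=-5.0, db_end=-25.0):
    """Estimate T60 from an RIR with Schroeder backward integration.

    A line is fitted to the energy decay curve between db_start and db_end
    and extrapolated to a 60 dB decay.
    """
    # Energy decay curve: energy remaining from each sample to the end.
    edc = []
    total = 0.0
    for x in reversed(rir):
        total += x * x
        edc.append(total)
    edc.reverse()

    # Collect (time, level in dB) points inside the fitting range.
    points = []
    for i, e in enumerate(edc):
        if e <= 0.0:
            continue
        level = 10.0 * math.log10(e / edc[0])
        if db_end <= level <= db_start:
            points.append((i / fs, level))

    # Ordinary least-squares slope of level versus time, in dB per second.
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_l = sum(l for _, l in points) / n
    num = sum((t - mean_t) * (l - mean_l) for t, l in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    slope = num / den

    return -60.0 / slope


if __name__ == "__main__":
    fs = 16000
    rir = synthetic_rir(t60=0.5, fs=fs)
    print(f"estimated T60: {estimate_t60(rir, fs):.3f} s")
```

Fitting over the -5 dB to -25 dB range and extrapolating to 60 dB is a common choice because the late part of a measured decay is often masked by noise. A SIM-based estimator like the one described in the abstract would instead map a measured SIM value to T60 through a fitted curve.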

License info not available