Using quality and intelligibility measures to create an estimator for reverberation time in a shoebox-shaped room with a multilayer perceptron model

Bachelor Thesis (2024)
Author(s)

A.L. Mol (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Jorge Martinez – Mentor (TU Delft - Multimedia Computing)

Dimme de Groot – Mentor (TU Delft - Multimedia Computing)

Maria Soledad Pera – Graduation committee member (TU Delft - Web Information Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
08-07-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Reverberation is a key consideration when designing building interiors and must be weighed against the function of the room. It is quantified by the reverberation time (RT), which is known to strongly influence the intelligibility and quality of audio in enclosed spaces.
In this work, we investigate the relationship between the RT and objective measures of audio quality and intelligibility, and explore the feasibility of using multilayer perceptron (MLP) networks to estimate the RT from the values of these measures as input features. We investigate five measures in particular: the Perceptual Evaluation of Speech Quality (PESQ), the Virtual Speech Quality Objective Listener (ViSQOL) and its audio-focused extension (ViSQOLAudio), and the Short-Time Objective Intelligibility measure (STOI) and its extension, ESTOI.
We create a three-layer MLP network that estimates the RT with a mean absolute error of 0.144 on our simulated room impulse response (RIR) test sets and 0.196 on our real RIR test set.
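
To make the estimator idea concrete, the following is a minimal sketch of an MLP forward pass that maps the five measure values to a single RT estimate, together with the mean absolute error used to report accuracy. It is not the thesis implementation: the layer sizes, the ReLU activation, the feature order, the random untrained weights, and the sample values are all assumptions chosen for illustration.

```python
import random

# Assumed feature order; the five measures are those named in the abstract.
FEATURES = ["PESQ", "ViSQOL", "ViSQOLAudio", "STOI", "ESTOI"]


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, relu):
    """Apply one dense layer to input vector x, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


def mlp_predict(x, layers):
    """Forward pass: ReLU on hidden layers, linear output (one RT estimate)."""
    for i, layer in enumerate(layers):
        x = dense(x, layer, relu=(i < len(layers) - 1))
    return x[0]


def mean_absolute_error(y_true, y_pred):
    """Average of |target - prediction| over all samples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


rng = random.Random(0)
# Hypothetical layer sizes: 5 inputs -> 16 -> 16 -> 1 output (three weight layers).
sizes = [len(FEATURES), 16, 16, 1]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

# Made-up feature vectors and RT targets, for illustration only.
samples = [[2.1, 3.4, 3.0, 0.80, 0.65], [1.4, 2.2, 2.1, 0.55, 0.40]]
targets = [0.4, 1.1]
predictions = [mlp_predict(s, layers) for s in samples]
print(mean_absolute_error(targets, predictions))
```

Because the weights here are untrained, the printed error is meaningless as a result. In practice the weights would be fitted on simulated RIR data before evaluating the error on held-out test sets.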

Files

FINAL_PAPER.pdf (PDF, 1.24 MB)
License info not available