Exploring the Relationship Between Bias and Speech Acoustics in Automatic Speech Recognition Systems
An Experimental Investigation Using Acoustic Embeddings and Bias Metrics on a Dataset of Spoken Dutch
P.P. Cichoń (TU Delft - Electrical Engineering, Mathematics and Computer Science)
O.E. Scharenborg – Mentor (TU Delft - Multimedia Computing)
Jorge Martinez – Mentor (TU Delft - Multimedia Computing)
N.M. Gürel – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
Abstract
Automatic Speech Recognition (ASR) systems have become an integral part of daily life. Despite their widespread use, these systems can exhibit biases that manifest as differences in accuracy and performance across demographic groups, and methods for quantifying these biases have been developed. This paper investigates the relationship between such bias and the acoustic characteristics of speakers. By examining various acoustic embeddings, derived from models such as wav2vec 2.0 and XLSR, we aim to identify which embeddings correlate most strongly with bias. The findings offer insights into improving the fairness of ASR systems by showing how acoustic features influence bias. Future research directions include studying isolated speech properties and extending the work to more diverse linguistic contexts.
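The core analysis described in the abstract — relating per-speaker acoustic embeddings to a per-speaker bias metric — can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the embeddings here are random placeholders standing in for mean-pooled wav2vec 2.0 or XLSR representations, the WER values are synthetic, and the bias metric (each speaker's WER gap relative to the lowest observed WER) is one common choice assumed for the example.

```python
import numpy as np

# Placeholder data: in the study, embeddings would be mean-pooled outputs
# of models such as wav2vec 2.0 or XLSR, and WERs would come from an ASR
# system evaluated on Dutch speech. Here both are synthetic.
rng = np.random.default_rng(0)
n_speakers, dim = 40, 8
embeddings = rng.normal(size=(n_speakers, dim))  # per-speaker embeddings
wer = rng.uniform(0.05, 0.30, size=n_speakers)   # per-speaker word error rate

# One possible bias metric (an assumption for this sketch): each speaker's
# WER gap relative to the best-performing speaker's WER.
bias = wer - wer.min()

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))

# Correlate each embedding dimension with the bias metric to see which
# acoustic directions co-vary most strongly with ASR performance gaps.
correlations = [pearson(embeddings[:, d], bias) for d in range(dim)]
strongest = int(np.argmax(np.abs(correlations)))
print(f"dimension {strongest} correlates most strongly with bias")
```

In practice one would correlate whole embedding spaces (e.g. via distances or regression) rather than single dimensions, but the per-dimension Pearson correlation above conveys the basic idea of linking acoustic representations to a bias measure.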