Integrating Base Performance and Performance Differences in Automatic Speech Recognition Metrics

Bachelor Thesis (2024)
Author(s)

B.V. van Vliet (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Odette Scharenborg – Mentor (TU Delft - Multimedia Computing)

Jorge Martinez – Mentor (TU Delft - Multimedia Computing)

N.M. Gürel – Coach (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
More Info
Publication Year
2024
Language
English
Graduation Date
25-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Automatic Speech Recognition (ASR) systems are increasingly widely used. However, because of biases inherent in these systems, their performance differs between demographic groups. Bias metrics quantify this disparity, but within ASR they remain a niche area that has not yet been thoroughly explored. The few bias metrics that exist in the literature mainly centre on the performance differences between speaker groups. This paper proposes two new bias metrics that take into account not only performance differences but also base performance: Weighted Performance Bias (WPB) and Intergroup Weighted Performance Bias (IWPB). Although the lack of ground truth makes the results harder to interpret, the new metrics show trends similar to those of the metrics defined in the literature: bias is greatest for non-native Dutch speech.

Files

Research_Project.pdf (pdf | 0.424 Mb)
License info not available