State-of-the-art Automatic Speech Recognition Systems on Dutch Regional Dialects

Exploring Bias in Dutch-trained Automatic Speech Recognition Systems

Bachelor Thesis (2024)
Author(s)

S.A. Kasdorp (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Y. Zhang – Mentor (TU Delft - Multimedia Computing)

O.E. Scharenborg – Mentor (TU Delft - Multimedia Computing)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
25-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Automatic Speech Recognition (ASR) has developed rapidly in recent years. To keep these systems objective and reliable, it is crucial that they remain unbiased and treat all speakers equally. This paper explores the bias of two state-of-the-art ASR systems, Microsoft's Azure AI Speech Services and Google Chirp, towards regional dialects in Dutch and Flemish speech. It analyses the performance of both systems on the JASMIN-CGN language corpus. The results show that speech from West-Dutch regions is recognized correctly significantly more often than speech from other Dutch regions, and that speech from Brabant is recognized correctly significantly more often than speech from other Flemish regions.
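
The abstract does not name the metric behind "recognized correctly". As an illustration only, the sketch below assumes word error rate (WER) computed per dialect region, which is a common way to compare ASR performance across speaker groups. The function names, the region labels and the example transcripts are hypothetical and are not taken from the thesis or from JASMIN-CGN.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_region(samples):
    """Aggregate WER per region.

    `samples` is an iterable of (region, reference, hypothesis) tuples.
    Errors and reference words are summed per region before dividing,
    so long utterances weigh more than short ones.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for region, reference, hypothesis in samples:
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
        errors[region] += edit_distance(ref, hyp)
        words[region] += len(ref)
    return {r: errors[r] / words[r] for r in errors if words[r] > 0}


# Made-up example data; "region_a" and "region_b" are placeholder labels.
samples = [
    ("region_a", "de kat zit", "de kat zit"),
    ("region_b", "de hond loopt weg", "de hond weg"),
]
result = wer_by_region(samples)
```

A lower WER for one region than for another would only suggest a performance gap; whether that gap is significant needs a separate statistical test, which this sketch does not cover.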

Files

RP_paper_final.pdf
(pdf | 0.269 Mb)
License info not available