Improving ASR performance on Jasmin Flemish Dutch data by performing frequency perturbation

Bachelor Thesis (2022)
Author(s)

N. Sweijen (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

O.E. Scharenborg – Mentor (TU Delft - Multimedia Computing)

T.B. Patel – Mentor (TU Delft - Multimedia Computing)

Joana P. Gonçalves – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Neal Sweijen
Publication Year
2022
Language
English
Graduation Date
22-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Automatic speech recognition (ASR) systems are widely used today. However, for a technology so common in daily life, ASR shows considerable bias: it does not work equally well for everyone, and recognition results differ with a speaker's gender, age, and dialect. The goal of this paper is to reduce this bias for one dialect, Flemish Dutch, by improving recognition performance on it. Because collecting new speech data is expensive, a data augmentation technique, frequency perturbation, was used to enlarge the training data and lower the word error rate for this dialect. Frequency perturbation amplifies or attenuates the amplitude of certain frequency bands. We managed to improve performance on Flemish Dutch slightly. Although recognition of this dialect remains considerably worse than for other Dutch dialects, it did improve.
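
To illustrate what amplifying or reducing the amplitude of certain frequency bands can look like in practice, here is a minimal sketch in Python with NumPy. It is not the implementation used in the thesis, which the abstract does not describe. The function names `scale_band` and `perturb_frequencies`, the number of bands, the band width, and the gain range are all assumptions chosen for illustration.

```python
import numpy as np


def scale_band(signal, sample_rate, low_hz, high_hz, gain_db):
    """Scale the amplitude of the band [low_hz, high_hz) by gain_db decibels."""
    # Move to the frequency domain (real-valued input, so rfft suffices).
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # Convert decibels to a linear amplitude factor.
    gain = 10.0 ** (gain_db / 20.0)
    # Amplify (gain_db > 0) or attenuate (gain_db < 0) only the chosen band.
    band = (freqs >= low_hz) & (freqs < high_hz)
    spectrum[band] *= gain
    # Back to the time domain, keeping the original length.
    return np.fft.irfft(spectrum, n=len(signal))


def perturb_frequencies(signal, sample_rate, num_bands=3,
                        band_width_hz=500.0, max_gain_db=6.0, rng=None):
    """Apply random gains to a few randomly placed frequency bands.

    All default values are hypothetical, not taken from the thesis.
    """
    rng = np.random.default_rng() if rng is None else rng
    nyquist = sample_rate / 2.0
    out = np.asarray(signal, dtype=float)
    for _ in range(num_bands):
        # Pick a random band below the Nyquist frequency and a random gain.
        low_hz = rng.uniform(0.0, nyquist - band_width_hz)
        gain_db = rng.uniform(-max_gain_db, max_gain_db)
        out = scale_band(out, sample_rate, low_hz, low_hz + band_width_hz, gain_db)
    return out
```

Under these assumptions, each call to `perturb_frequencies` on a training utterance would yield a slightly different copy with the same length and transcript, which is how such a technique can enlarge the training data without collecting new recordings.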
