Improving ASR performance on Jasmin Flemish Dutch data by performing frequency perturbation

Abstract

Automatic speech recognition (ASR) systems are now widely used. However, for a technology so common in daily life, ASR is still considerably biased: it does not work equally well for everyone, and recognition results differ with a speaker's gender, age, and dialect. The goal of this paper is to reduce this bias for one dialect, Flemish Dutch, by improving recognition performance on it. Because collecting new speech data is expensive, we use a data augmentation technique to enlarge the training data and lower the word error rate for this dialect. The technique is frequency perturbation, which amplifies or attenuates the amplitude of selected frequency bands. With this approach we obtained a slight improvement for Flemish Dutch, although performance on this dialect remains considerably worse than on other Dutch dialects.
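
To make the idea of frequency perturbation concrete, the following is a minimal sketch in Python with NumPy. It is not the implementation used in this work: the function name `perturb_frequency_bands`, the split into equal-width bands, the number of bands, and the gain range in decibels are all assumptions chosen for illustration. The sketch only shows the general principle of scaling the amplitude of frequency bands by random gains to create an additional training example from an existing recording.

```python
import numpy as np


def perturb_frequency_bands(signal, sample_rate, n_bands=8, max_gain_db=6.0, rng=None):
    """Scale the amplitude of equal-width frequency bands by random gains.

    Hypothetical helper: band layout and gain range are illustrative
    assumptions, not the settings used in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Move to the frequency domain (one-sided spectrum of a real signal).
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # Equal-width bands between 0 Hz and the Nyquist frequency.
    edges = np.linspace(0.0, sample_rate / 2.0, n_bands + 1)

    # One random gain per band, drawn uniformly in [-max_gain_db, +max_gain_db].
    gains_db = rng.uniform(-max_gain_db, max_gain_db, size=n_bands)

    # Assign every FFT bin to a band, then convert dB gains to linear factors.
    band_idx = np.clip(np.searchsorted(edges, freqs, side="right") - 1, 0, n_bands - 1)
    gains = 10.0 ** (gains_db[band_idx] / 20.0)

    # Apply the gains and return to the time domain at the original length.
    perturbed = np.fft.irfft(spectrum * gains, n=len(signal))
    return perturbed, gains_db
```

Under these assumptions, each call with a different random generator state yields a differently coloured copy of the same utterance, which can be added to the training set with the original transcript. A positive gain amplifies a band and a negative gain attenuates it, while the duration and the timing of the speech are left unchanged.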