Title: Improving Northern Regional Dutch Speech Recognition by Adapting Perturbation-based Data Augmentation
Author: Zhlebinkov, Nikolay (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Scharenborg, O.E. (mentor); Patel, T.B. (mentor); P. Gonçalves, Joana (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2022-06-22
Abstract: Automatic speech recognition (ASR) does not perform equally well for every speaker. It is biased with respect to many speaker attributes, including accent. For training Dutch ASR, there is the CGN (Corpus Gesproken Nederlands) and, as an extension, the JASMIN corpus with annotated accented data. This paper focuses on improving ASR performance for NRAD (Northern regional accented Dutch) speech, training on speakers from the region of Overijssel. To achieve this improvement, the corpus data is augmented using Vocal Tract Length Perturbation (VTLP), which entails randomly warping the frequency axis of each recording by a factor in the range [0.9, 1.1]. The baseline and augmented ASR systems are trained using trigram GMM-HMM (Gaussian mixture model hidden Markov models) through the Kaldi toolkit on the DelftBlue supercomputer. This leads to improvements in word error rate (WER) for all speaker groups and styles, with an overall relative improvement of 14.64% and the biggest improvement observed for male speakers, from 25.15% WER to 19.68% WER. The impact of this augmentation on other accents and non-accented speech is not explored. This experiment can serve as a stepping stone for developing more robust and less biased Dutch ASR overall.
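The VTLP step mentioned in the abstract can be illustrated with a minimal sketch. The abstract only states that a random warp factor from [0.9, 1.1] is applied per recording; it does not say which warping function was used. The piecewise-linear warp below is one commonly used formulation and is an assumption here, as are the 16 kHz sample rate, the 4800 Hz boundary frequency `f_hi`, and the function names `sample_warp_factor` and `warp_frequency`, which are hypothetical.

```python
import random


def sample_warp_factor(rng, low=0.9, high=1.1):
    """Draw one warp factor per recording from the range given in the abstract."""
    return rng.uniform(low, high)


def warp_frequency(freq_hz, alpha, sample_rate=16000, f_hi=4800.0):
    """Map a frequency to its warped position with a piecewise-linear warp.

    Assumption: frequencies below a boundary are scaled by alpha; above it,
    a second linear segment brings the mapping back to the Nyquist frequency,
    so the warped axis still spans [0, sample_rate / 2].
    """
    nyquist = sample_rate / 2.0
    boundary = f_hi * min(alpha, 1.0) / alpha
    if freq_hz <= boundary:
        return freq_hz * alpha
    # Slope of the upper segment, chosen so both segments meet at the boundary
    # and the Nyquist frequency maps onto itself.
    slope = (nyquist - f_hi * min(alpha, 1.0)) / (nyquist - boundary)
    return nyquist - slope * (nyquist - freq_hz)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the sketch repeatable
    alpha = sample_warp_factor(rng)
    for f in (500.0, 2000.0, 6000.0, 8000.0):
        print(f, "->", round(warp_frequency(f, alpha), 1))
```

In a feature-extraction pipeline, a warp of this kind would typically be applied to the centre frequencies of the filterbank rather than to the waveform itself; whether the thesis does so is not stated in this record.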
Subject: Speech recognition; vocal tract length perturbation; Data Augmentation
To reference this document use: http://resolver.tudelft.nl/uuid:081e1dc0-6bb3-454c-95cf-b0ac50d7d554
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2022 Nikolay Zhlebinkov
Files: PDF RP_Paper_NZhlebinkov_v3.5.pdf (138.99 KB)