Mitigating Regional Accent Bias in ASR Systems
Abstract
End-to-end (E2E) Automatic Speech Recognition (ASR) systems have improved drastically in recent years and perform extremely well on many large datasets. However, research shows that these models fail to capture the variability in speech production and are biased against regionally accented speech. Moreover, ASR research on regional accents has been conducted primarily in widely spoken languages such as English and Arabic, and the effect of regionally accented speech on E2E ASR systems in less widely spoken languages is still unknown. Understanding this effect helps researchers build inclusive E2E ASR systems. In this project, I aim to mitigate the bias against regionally accented speech. I select standard and regionally accented speech from CommonVoice's French and German datasets. I combine the state-of-the-art Conformer Recurrent Neural Network Transducer model with Multi-Domain Adversarial Training (MDAT) to improve performance on regionally accented speech without hurting performance on standard speech. Because regionally accented speech is typically low-resourced, I also study the amount of data required for effective MDAT, as well as the effect of different domain classifiers on its performance. Experimental results show that MDAT can mitigate the bias against regionally accented speech in both French and German: the best French model reduces the bias by around 12% and the best German model by around 7%. Additionally, MDAT is data-efficient: using only a small amount (e.g. 30 minutes) of untranscribed regionally accented speech, it achieves performance similar to that of the MDAT model trained with the full dataset. Finally, different domain classifier architectures have similar effects on the results of MDAT, so no significant differences among the domain classifiers were found in this project.
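The abstract does not say how the adversarial part of MDAT is implemented. A common formulation of domain-adversarial training uses a gradient reversal step between the shared encoder and the domain classifier, and the toy sketch below assumes that formulation. The scalar weights `w` and `v`, the function `adversarial_step`, and the parameters `lam` and `lr` are hypothetical names for illustration only; the ASR (transducer) loss that a real system would optimize alongside the domain loss is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def adversarial_step(w, v, x, domain, lam=0.5, lr=0.1):
    """One update of a toy scalar model with gradient reversal.

    w      : encoder weight (stand-in for the shared ASR encoder; assumption)
    v      : domain classifier weight
    x      : input feature
    domain : 0 = standard speech, 1 = accented speech
    lam    : scale applied to the reversed gradient
    """
    h = w * x                     # encoder output
    p = sigmoid(v * h)            # predicted probability of "accented"
    loss = -(domain * math.log(p) + (1 - domain) * math.log(1 - p))

    dlogit = p - domain           # dL/d(v*h) for binary cross-entropy
    grad_v = dlogit * h           # gradient for the domain classifier
    grad_h = dlogit * v           # gradient arriving at the encoder output
    grad_w = (-lam * grad_h) * x  # reversal: sign flipped and scaled by lam

    v_new = v - lr * grad_v       # classifier descends on the domain loss
    w_new = w - lr * grad_w       # encoder moves to increase the domain loss
    return w_new, v_new, loss
```

In this sketch the classifier learns to tell the domains apart while the encoder is pushed toward features that make them harder to separate. Setting `lam=0` leaves the encoder untouched by the domain loss.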