Google Chirp vs. Whisper: Evaluating ASR performance on Dutch Native vs. Non-Native Teenager Speech

Abstract

Automatic Speech Recognition (ASR) systems have become increasingly important for society, yet their performance varies significantly across speaker groups. With a significant non-native population in the Netherlands, it is crucial that ASR systems accurately recognize diverse speech. The performance of commercial state-of-the-art ASR systems on diverse Dutch speech remains under-explored. This study evaluates two recently developed and affordable ASR systems, Google Chirp and OpenAI's Whisper, on speech from native and non-native Dutch teenagers, measuring their recognition accuracy and identifying common transcription errors. The results show slightly worse performance than reported in previous research on non-native speech, with Whisper generally performing better than Google Chirp across the speaker groups.