YZ

Authored

2 records found

Automatic speech recognition (ASR) should serve every speaker, not only the majority “standard” speakers of a language. In order to build inclusive ASR, mitigating the bias against speaker groups who speak in a “non-standard” or “diverse” way is crucial. We aim to mitigate the bi ...
Automatic speech recognition (ASR) systems have seen substantial improvements in the past decade; however, not for all speaker groups. Recent research shows that bias exists against different types of speech, including non-native accents, in state-of-the-art (SOTA) ASR systems. T ...

Contributed

10 records found

How Does OpenAI’s Whisper Interpret Dysarthric Speech?

An Analysis of Acoustic Feature Probing and Representation Layers for Dysarthric Speech

This paper investigates how OpenAI’s Whisper model processes dysarthric speech by probing its internal acoustic feature representations. Utilizing the TORGO database, we analyzed Whisper’s capability to encode significant acoustic features specific to dysarthric speech across its ...

Improving State-of-the-Art ASR Systems for Speakers with Dysarthria

Applying Low-Rank Adaptation Transfer Learning to Whisper

Dysarthria is a speech disorder that limits an individual’s ability to clearly articulate, due to the weakening of the muscles involved in speech. Despite recent advances in Automatic Speech Recognition (ASR), the recognition of dysarthric speech remains a significant challenge b ...

Reducing Bias in State-of-the-Art ASR Systems for Child Speech

Addressing Age and Gender Disparities through Transfer Learning Strategies

Automatic Speech Recognition (ASR) systems have transformed human-machine interaction, yet they often struggle with child speech due to its unique vocal characteristics. This thesis investigates age and gender biases, focusing on enhancing the performance of state-of-the-art ASR ...

How Good Are State-of-the-Art Automatic Speech Recognition Systems in Recognizing Dutch Diverse Speech?

An Evaluation of Meta MMS and OpenAI Whisper on Native and Non-Native Dutch Speech

Automatic speech recognition (ASR) is increasingly used in daily applications, such as voice-activated virtual assistants like Siri and Alexa, real-time transcription for meetings and lectures, and voice commands for smart home devices. However, studies show that even state-of-th ...

State-of-the-art Automatic Speech Recognition Systems on Dutch Regional Dialects

Exploring Bias in Dutch-trained Automatic Speech Recognition Systems

Automatic Speech Recognition is a field that has developed rapidly in recent years. In order to ensure objectivity and reliability in these systems, it is crucial they remain unbiased and treat speakers equally. This paper explores the bias of two state-of-th ...

Automatic Dysarthria Severity Assessment using Whisper-extracted Features

Evaluating ML architectures for dysarthria severity assessment on TORGO and MSDM

Dysarthria is a speech disorder commonly caused by neurological disorders such as strokes, cerebral palsy and Amyotrophic Lateral Sclerosis (ALS). The severity level of dysarthria greatly influences the appropriate treatment for a patient. However, assessing the severity of dysar ...

Comparing performance of ASR systems on native Dutch children and teenagers: Google vs. Microsoft

Evaluating Speech Recognition Accuracy of state-of-the-art ASR models on Dutch child and teenager speech

Automatic Speech Recognition (ASR) technology is becoming increasingly useful in everyday life and therefore requires higher accuracy across all user demographics. This study compares the performance of Google's and Microsoft's ASR systems on native Dutch child and t ...

Evaluating Alternative Metrics for Dysarthric Speech Recognition

Assessing the Effectiveness of Different Evaluation Metrics in Dysarthric Speech Recognition Systems Across Various Severities

Dysarthria is a motor speech disorder resulting in slurred or slow speech that can be difficult to understand. This research paper evaluates the effectiveness of various metrics for automatic speech recognition (ASR), such as character error rate (CER), Jaro-Winkler distance, a ...
Automatic Speech Recognition (ASR) systems have become increasingly important for society, yet their performance varies significantly across diverse speaker groups. With a significant non-native population in the Netherlands, it is crucial that ASR systems accurately re ...