Automatic Dysarthria Severity Assessment using Whisper-extracted Features

Evaluating ML architectures for dysarthria severity assessment on TORGO and MSDM

Abstract

Dysarthria is a speech disorder commonly caused by neurological conditions such as stroke, cerebral palsy, and Amyotrophic Lateral Sclerosis (ALS). The severity of dysarthria strongly influences the appropriate treatment for a patient. However, assessing that severity is a time-consuming process that requires a trained speech therapist. This work therefore explores a variety of classifier architectures for automatic dysarthria severity assessment using Whisper encodings. The datasets used were MSDM and TORGO, and the classifier architectures implemented included Convolutional Neural Network and Recurrent Neural Network variants. On both datasets, the Gated Recurrent Unit (GRU) network achieved the best performance, with 97.21% accuracy on MSDM and 97.47% on TORGO.

Files