Automatic Dysarthria Severity Assessment using Whisper-extracted Features

Evaluating ML architectures for dysarthria severity assessment on TORGO and MSDM

Bachelor Thesis (2024)

Author(s)

C. Charlesworth (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Z. Yue – Mentor (TU Delft - Multimedia Computing)

Y. Zhang – Mentor (TU Delft - Multimedia Computing)

Thomas Durieux – Graduation committee member (TU Delft - Software Engineering)

Faculty

Electrical Engineering, Mathematics and Computer Science

Dysarthria, Dysarthria Detection, TORGO, Dysarthria severity assessment

To reference this document use:

https://resolver.tudelft.nl/uuid:0cf97029-a030-43ea-b6be-00beb5c6f3d1

More Info

Publication Year

2024

Language

English

Graduation Date

27-06-2024

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Dysarthria is a speech disorder commonly caused by neurological conditions such as stroke, cerebral palsy, and amyotrophic lateral sclerosis (ALS). The severity of dysarthria greatly influences the appropriate treatment for a patient. However, assessing that severity is a time-consuming process that requires a trained speech therapist. This work therefore explores a variety of classifier architectures for automatic dysarthria severity assessment using Whisper encodings. The datasets used were MSDM and TORGO, and the classifier architectures implemented included a Convolutional Neural Network and Recurrent Neural Network variants. On both datasets, the Gated Recurrent Unit (GRU) network achieved the best performance, with 97.21% accuracy on MSDM and 97.47% on TORGO.
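To make the general idea of a GRU classifier over a sequence of encoder frames concrete, the following is a minimal standard-library sketch. It is not the thesis implementation: the class name `TinyGRUClassifier`, the dimensions, the choice of four output classes, and the random frames standing in for Whisper encodings are all assumptions made for illustration, the weights are untrained, and bias terms are omitted for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these by training.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyGRUClassifier:
    """Hypothetical sketch: frames -> final GRU hidden state -> class probabilities."""

    def __init__(self, input_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One input and one recurrent matrix per gate:
        # update (z), reset (r), candidate (n). Biases are omitted.
        self.W = {g: init_matrix(hidden_dim, input_dim, rng) for g in "zrn"}
        self.U = {g: init_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}
        self.out = init_matrix(num_classes, hidden_dim, rng)

    def step(self, x, h):
        # Update and reset gates decide how much of the old state to keep.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        # The candidate state sees the previous state scaled by the reset gate.
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Blend the old state and the candidate using the update gate.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def predict_proba(self, frames):
        h = [0.0] * self.hidden_dim
        for x in frames:
            h = self.step(x, h)
        # Linear layer plus numerically stable softmax over the final state.
        logits = matvec(self.out, h)
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Placeholder input: 20 frames of 8 features each, standing in for
# encoder output frames (real encodings would be much wider).
data_rng = random.Random(1)
frames = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(20)]

model = TinyGRUClassifier(input_dim=8, hidden_dim=6, num_classes=4)
probs = model.predict_proba(frames)
print(probs)
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how a variable-length frame sequence is reduced to a single hidden state and then mapped to one probability per severity class.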

Files

RP_Final2.pdf

(pdf | 1.15 Mb)

License info not available