Study of the performance of automatic speech recognition systems in speakers with Parkinson’s Disease

Conference Paper (2019)
Author(s)

Laureano Moro Velázquez (Johns Hopkins University)

JaeJin Cho (Johns Hopkins University)

Shinji Watanabe (Johns Hopkins University)

Mark Hasegawa-Johnson (University of Illinois at Urbana-Champaign)

O.E. Scharenborg (TU Delft - Multimedia Computing)

Heejin Kim (University of Illinois at Urbana-Champaign)

Najim Dehak (Johns Hopkins University)

Research Group
Multimedia Computing
Copyright
© 2019 Laureano Moro-Velazquez, JaeJin Cho, Shinji Watanabe, Mark A. Hasegawa-Johnson, O.E. Scharenborg, Heejin Kim, Najim Dehak
DOI related publication
https://doi.org/10.21437/Interspeech.2019-2993
Publication Year
2019
Language
English
Volume number
2019-September
Pages (from-to)
3875-3879
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Parkinson’s Disease (PD) affects the motor capabilities of patients, who in some cases need human-computer assistive technologies to regain independence. The objective of this work is to study in detail the differences in error patterns from state-of-the-art Automatic Speech Recognition (ASR) systems on speech from people with and without PD. Two speech recognizers (an attention-based end-to-end system and a Deep Neural Network-Hidden Markov Model hybrid system) were trained on a Spanish-language corpus and subsequently tested on speech from 43 speakers with PD and 46 without PD. The differences between the two groups in error rates and in substitutions, insertions, and deletions of characters and phonetic units were analyzed. The word error rate was 27% higher in speakers with PD than in control speakers, with a moderate correlation between that rate and the developmental stage of the disease. The errors were related to all manner classes and were more pronounced for the vowel /u/. This study is the first to evaluate ASR systems’ responses to speech from patients at different stages of PD in Spanish. The analyses showed general trends, but individual speech deficits must be studied in the future when designing new ASR systems for this population.
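
The error measures named in the abstract (substitutions, insertions, deletions, and the word error rate) are conventionally derived from a minimum-edit alignment between a reference transcript and the recognizer output. The following is a minimal sketch of that standard computation, not the authors' scoring pipeline; the function names `edit_counts` and `error_rate` and the example sentences are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def edit_counts(ref: List[str], hyp: List[str]) -> Tuple[int, int, int]:
    """Return (substitutions, insertions, deletions) for one minimum-edit
    alignment of `hyp` against `ref` (standard Levenshtein alignment)."""
    n, m = len(ref), len(hyp)
    # cell[i][j] = (total_edits, subs, ins, dels) for ref[:i] versus hyp[:j]
    cell = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cell[i][0] = (i, 0, 0, i)  # every reference token deleted
    for j in range(1, m + 1):
        cell[0][j] = (j, 0, j, 0)  # every hypothesis token inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if ref[i - 1] == hyp[j - 1]:
                cell[i][j] = cell[i - 1][j - 1]  # match, no edit
                continue
            t, s, a, d = cell[i - 1][j - 1]
            substitution = (t + 1, s + 1, a, d)
            t, s, a, d = cell[i][j - 1]
            insertion = (t + 1, s, a + 1, d)
            t, s, a, d = cell[i - 1][j]
            deletion = (t + 1, s, a, d + 1)
            # Tuples compare on total edits first, so this keeps a minimum.
            cell[i][j] = min(substitution, insertion, deletion)
    _, subs, ins, dels = cell[n][m]
    return subs, ins, dels


def error_rate(ref: List[str], hyp: List[str]) -> float:
    """(S + D + I) / N, where N is the number of reference tokens."""
    subs, ins, dels = edit_counts(ref, hyp)
    return (subs + ins + dels) / len(ref)


# Hypothetical example sentences, not taken from the paper's corpus.
reference = "el gato come pescado".split()
hypothesis = "el pato come".split()
print(edit_counts(reference, hypothesis))  # (1, 0, 1)
print(error_rate(reference, hypothesis))   # 0.5
```

Passing lists of characters or phonetic units instead of words (for example `list("casa")`) yields the corresponding character-level or unit-level counts with the same functions.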

Files

2993.pdf (PDF, 2.16 MB)
License info not available