Investigating model performance in language identification
beyond simple error statistics
Suzy J. Styles (Nanyang Technological University)
Yi Han Victoria Chua (Nanyang Technological University)
Fei Ting Woon (Nanyang Technological University)
Hexin Liu (Nanyang Technological University)
Leibny Paola Garcia Perera (Johns Hopkins University)
Sanjeev Khudanpur (Johns Hopkins University)
Andy W.H. Khong (Nanyang Technological University)
J.H.G. Dauwels (TU Delft - Signal Processing Systems)
Abstract
Language development experts need tools that can automatically identify languages from fluent, conversational speech and provide reliable estimates of usage rates at the level of an individual recording. However, language identification (LID) systems are typically evaluated on metrics such as equal error rate and balanced accuracy, applied at the level of an entire speech corpus. These overview metrics provide no information about model performance at the level of individual speakers, recordings, or units of speech with different linguistic characteristics. Overview statistics may therefore mask systematic errors for some subsets of the data, so a model can perform worse on data derived from some groups of human speakers, creating a kind of algorithmic bias. Here, we investigate how well a number of LID systems perform on individual recordings and on speech units with different linguistic properties in the MERLIon CCS Challenge, which features accented, code-switched, child-directed speech.
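To illustrate the distinction the abstract draws between corpus-level and per-recording evaluation, the following minimal sketch (not from the paper; the recording IDs, labels, and predictions are invented for illustration, and scikit-learn is assumed to be available) contrasts a single balanced-accuracy score over an entire evaluation set with the same metric computed separately for each recording, where errors concentrated in one recording become visible.

```python
# Hypothetical example: corpus-level vs. per-recording balanced accuracy.
# All data below are toy values for illustration only.
from collections import defaultdict
from sklearn.metrics import balanced_accuracy_score

# Toy segment-level predictions: (recording_id, true_language, predicted_language)
results = [
    ("rec01", "English",  "English"),
    ("rec01", "Mandarin", "Mandarin"),
    ("rec02", "Mandarin", "English"),   # errors concentrated in one recording
    ("rec02", "Mandarin", "English"),
    ("rec02", "English",  "English"),
]

# Corpus-level score: one number for the whole evaluation set.
y_true = [true for _, true, _ in results]
y_pred = [pred for _, _, pred in results]
print("corpus balanced accuracy:", balanced_accuracy_score(y_true, y_pred))

# Per-recording scores: the same predictions, grouped by recording.
by_recording = defaultdict(list)
for rec_id, true, pred in results:
    by_recording[rec_id].append((true, pred))

for rec_id, pairs in sorted(by_recording.items()):
    t, p = zip(*pairs)
    print(rec_id, "balanced accuracy:", balanced_accuracy_score(t, p))
```

In this toy example the corpus-level score looks moderate, while the per-recording breakdown shows one recording scored perfectly and the other poorly, which is the kind of speaker- or recording-level variation that overview statistics can hide.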