MERLIon CCS Challenge

A English-Mandarin code-switching child-directed speech corpus for language identification and diarization

Journal Article (2023)

Author(s)

Victoria Y.H. Chua (Nanyang Technological University)

Hexin Liu (Nanyang Technological University)

Leibny Paola Garcia Perera (Johns Hopkins University)

Fei Ting Woon (Nanyang Technological University)

Jinyi Wong (Nanyang Technological University)

Xiangyu Zhang (Johns Hopkins University, TU Delft - Mechanical Engineering)

Sanjeev Khudanpur (Johns Hopkins University)

Andy W.H. Khong (Nanyang Technological University)

Justin Dauwels (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Suzy J. Styles (Nanyang Technological University)

Research Group

Signal Processing Systems

Keywords

Child-directed speech, Code-switching, Language diarization, Language identification

DOI related publication

https://doi.org/10.21437/Interspeech.2023-1446 (final published version)

To reference this document use

https://resolver.tudelft.nl/uuid:73006d56-c470-4390-8c12-f2725d5ecbf9

Publication Year

2023

Language

English

Volume number

2023-August

Pages (from-to)

4109-4113

Event

24th Annual Conference of the International Speech Communication Association, Interspeech 2023 (2023-08-20 - 2023-08-24), Dublin, Ireland

Collections

Institutional Repository

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

To enhance the reliability and robustness of language identification (LID) and language diarization (LD) systems for heterogeneous populations and scenarios, speech processing models need to be trained on datasets that feature diverse language registers and speech patterns. We present the MERLIon CCS challenge, featuring a first-of-its-kind Zoom video call dataset of parent-child shared book reading, comprising over 30 hours of audio across more than 300 recordings, annotated by multilingual transcribers using a high-fidelity linguistic transcription protocol. The audio corpus features spontaneous, in-the-wild English-Mandarin code-switching and child-directed speech in non-standard accents, with diverse language-mixing patterns, recorded in a variety of home environments. This report describes the corpus, as well as LID and LD results for our baseline and for several systems submitted to the MERLIon CCS challenge using the corpus.

Files

Chua23_interspeech.pdf

(pdf | 1.03 Mb)

License info not available