Chin-Hui Lee | TU Delft Repository

The Multimodal Information Based Speech Processing (MISP) 2022 Challenge

Audio-Visual Diarization and Recognition

Conference paper (2023) - Zhe Wang (author) , Shilong Wu (author) , Diyuan Liu (author) , More Authors (author) , Hang Chen (author) , Mao-Kui He (author) , Jun Du (author) , Chin Hui Lee (author) , Jingdong Chen (author) , Shinji Watanabe (author) , Sabato Marco Siniscalchi (author) , Odette Scharenborg (author)

The Multi-modal Information based Speech Processing (MISP) challenge aims to extend the application of signal processing technology in specific scenarios by promoting research into wake-up words, speaker diarization, speech recognition, and other technologies. The MISP2022 ch ...

Audio-Visual Wake Word Spotting in MISP2021 Challenge

Dataset Release and Deep Analysis

Journal article (2022) - Hengshun Zhou (author) , Jun Du (author) , Gongzhen Zou (author) , Zhaoxu Nian (author) , Chin-Hui Lee (author) , Sabato Marco Siniscalchi (author) , Shinji Watanabe (author) , O.E. Scharenborg (author) , Jingdong Chen (author) , More Authors (author)

In this paper, we describe and publicly release the audio-visual wake word spotting (WWS) database of the MISP2021 Challenge, which covers a range of scenarios of audio and video data collected by near-, mid-, and far-field microphone arrays and cameras, to create a shared and p ...

Audio-Visual Speech Recognition in MISP2021 Challenge

Dataset Release and Deep Analysis

Journal article (2022) - Hang Chen (author) , Jun Du (author) , Yusheng Dai (author) , Chin-Hui Lee (author) , Sabato Marco Siniscalchi (author) , Shinji Watanabe (author) , Odette Scharenborg (author) , Jingdong Chen (author) , Bao Cai Yin (author) , Jia Pan (author)

In this paper, we present the updated Audio-Visual Speech Recognition (AVSR) corpus of the MISP2021 challenge, a large-scale audio-visual Chinese conversational corpus consisting of 141h of audio and video data collected by far/middle/near microphones and far/middle cameras in 34 real-h ...

The First Multimodal Information Based Speech Processing (MISP) Challenge

Data, Tasks, Baselines and Results

Conference paper (2022) - Hang Chen (author) , Hengshun Zhou (author) , Jun Du (author) , Chin-Hui Lee (author) , Jingdong Chen (author) , Shinji Watanabe (author) , Sabato Marco Siniscalchi (author) , O.E. Scharenborg (author) , Di-Yuan Liu (author) , More authors (author)

In this paper we discuss the rationale of the Multi-modal Information based Speech Processing (MISP) Challenge, and provide a detailed description of the data recorded, the two evaluation tasks and the corresponding baselines, followed by a summary of submitted systems and evaluat ...