Najim Dehak


Please Note

This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.

Conference paper (7)

Journal article (1)

8 records found

Finding Spoken Identifications: Using GPT-4 Annotation For An Efficient And Fast Dataset Creation Pipeline

Conference paper (2024) - Maliha Jahan, Helin Wang, Thomas Thebaud, Yinglun Sun, Giang Le, Zsuzsanna Fagyal, Odette Scharenborg, Mark Hasegawa-Johnson, Laureano Moro-Velazquez, Najim Dehak

The growing emphasis on fairness in speech-processing tasks requires datasets with speakers from diverse subgroups that allow training and evaluating fair speech technology systems. However, creating such datasets through manual annotation can be costly. To address this challenge ...

Discovering phonetic inventories with crosslingual automatic speech recognition

Journal article (2022) - Piotr Żelasko, Siyuan Feng, Laureano Moro Velázquez, Ali Abavisani, Saurabhchand Bhati, Odette Scharenborg, Mark Hasegawa-Johnson, Najim Dehak

The high cost of data acquisition makes Automatic Speech Recognition (ASR) model training problematic for most existing languages, including languages that do not even have a written script, or for which the phone inventories remain unknown. Past works explored multilingual train ...

How phonotactics affect multilingual and zero-shot ASR performance

Conference paper (2021) - Siyuan Feng, Piotr Żelasko, Laureano Moro-Velázquez, Ali Abavisani, Mark Hasegawa-Johnson, Odette Scharenborg, Najim Dehak

The idea of combining multiple languages’ recordings to train a single automatic speech recognition (ASR) model brings the promise of the emergence of universal speech representation. Recently, a Transformer encoder-decoder model has been shown to leverage multilingual data well ...

Align or attend? Toward More Efficient and Accurate Spoken Word Discovery Using Speech-to-Image Retrieval

Conference paper (2021) - Liming Wang, Xinsheng Wang, Mark Hasegawa-Johnson, Odette Scharenborg, Najim Dehak

Multimodal word discovery (MWD) is often treated as a byproduct of the speech-to-image retrieval problem. However, our theoretical analysis shows that some kind of alignment/attention mechanism is crucial for a MWD system to learn meaningful word-level representation. We verify o ...

That Sounds Familiar: An Analysis of Phonetic Representations Transfer Across Languages

Conference paper (2020) - Piotr Żelasko, Laureano Moro-Velázquez, Mark Hasegawa-Johnson, Odette Scharenborg, Najim Dehak

Only a handful of the world’s languages are abundant with the resources that enable practical applications of speech processing technologies. One of the methods to overcome this problem is to use the resources existing in other languages to train a multilingual automatic speech r ...

Study of the performance of automatic speech recognition systems in speakers with Parkinson’s Disease

Conference paper (2019) - Laureano Moro-Velazquez, JaeJin Cho, Shinji Watanabe, Mark A. Hasegawa-Johnson, Odette Scharenborg, Heejin Kim, Najim Dehak

Parkinson’s Disease (PD) affects motor capabilities of patients, who in some cases need to use human-computer assistive technologies to regain independence. The objective of this work is to study in detail the differences in error patterns from state-of-the-art Automatic Speech R ...

Visualizing Phoneme Category Adaptation in Deep Neural Networks

Conference paper (2018) - Odette Scharenborg, Sebastian Tiesmeyer, Mark Hasegawa-Johnson, Najim Dehak

Both human listeners and machines need to adapt their sound categories whenever a new speaker is encountered. This perceptual learning is driven by lexical information. The aim of this paper is two-fold: investigate whether a deep neural network-based (DNN) ASR system can adapt t ...

Building an ASR System for Mboshi Using A Cross-language Definition of Acoustic Units Approach

Conference paper (2018) - Odette Scharenborg, Patrick Ebel, Francesco Ciannella, Mark Hasegawa-Johnson, Najim Dehak

For many languages in the world, not enough (annotated) speech data is available to train an ASR system. Recently, we proposed a cross-language method for training an ASR system using linguistic knowledge and semi-supervised training. Here, we apply this approach to the low-resou ...