Search results | TU Delft Repositories

Low Complex Accurate Multi-Source RTF Estimation

Li, C. (author), Martinez, Jorge (author), Hendriks, R.C. (author)

Many multi-microphone algorithms depend on knowing the relative acoustic transfer functions (RTFs) of the individual sound sources in the acoustic scene. However, accurate joint RTF estimation for multiple sources is a challenging problem. Existing methods to jointly estimate the RTF for multiple sources have either no satisfying performance, or...

conference paper 2022

Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis

Chen, Hang (author), Du, Jun (author), Dai, Yusheng (author), Lee, Chin Hui (author), Siniscalchi, Sabato Marco (author), Watanabe, Shinji (author), Scharenborg, O.E. (author), Chen, Jingdong (author), Yin, Bao Cai (author), Pan, Jia (author)

In this paper, we present the updated Audio-Visual Speech Recognition (AVSR) corpus of the MISP2021 challenge, a large-scale audio-visual Chinese conversational corpus consisting of 141h of audio and video data collected by far/middle/near microphones and far/middle cameras in 34 real-home TV rooms. To the best of our knowledge, our corpus is the first...

journal article 2022

Improving Automatic Speech Recognition For Dysarthric Speech

Prananta, Luke (author)

master thesis 2021

Binaural beam-forming with dominant cue preservation for hearing aids

Sathyapriyan, V. (author)

For people with hearing impairment, it is important to have good speech intelligibility while also being able to localise the sound sources. Many beam-forming algorithms for hearing aids have been proposed that minimise the noise in combination with spatial scene preservation of the target and the interferers. By constraining the spatial cues...

master thesis 2020

A Study on Reference Microphone Selection for Multi-Microphone Speech Enhancement

Zhang, Jie (author), Chen, Huawei (author), Hendriks, R.C. (author)

Multi-microphone speech enhancement methods typically require a reference position with respect to which the target signal is estimated. Often, this reference position is arbitrarily chosen as one of the reference microphones. However, it has been shown that the choice of the reference microphone can have a significant impact on the final...

journal article 2020

A Generative Neural Network Model for Speech Enhancement

Kapadia, Husain (author)

Listening in noise is a challenging problem that affects the hearing capability of not only normal-hearing but especially hearing-impaired people. For the last four decades, enhancing the quality and intelligibility of noise-corrupted speech by reducing the effect of noise has been addressed using statistical signal processing techniques as...

master thesis 2019

On the Enhancement of Intelligibility: Investigating the influence of different speech modifications on the intelligibility of speech in near-end noise

Luppes, Bob (author), Riemens, Ellen (author)

Several algorithms to enhance the intelligibility of speech in near-end noise were analyzed and implemented. The algorithms considered were assessed based on the intrusive instrumental intelligibility metric SIIB_Gauss. An implementation based on the direct optimization for this metric is assessed, as well as an implementation based on human...

bachelor thesis 2019

Robust Joint Estimation of Multimicrophone Signal Model Parameters

Koutrouvelis, A. (author), Hendriks, R.C. (author), Heusdens, R. (author), Jensen, Jesper (author)

One of the biggest challenges in multimicrophone applications is the estimation of the parameters of the signal model, such as the power spectral densities (PSDs) of the sources, the early (relative) acoustic transfer functions of the sources with respect to the microphones, the PSD of late reverberation, and the PSDs of microphone-self noise...

journal article 2019

On speech enhancement in very low SNRs for smart speakers

Sachos, Kostas (author)

Human interaction with a smart speaker often involves distant automatic speech recognition (ASR). However, ASR is a rather cumbersome task at significantly high levels of noise. To achieve high ASR accuracy, most commercial smart speakers tend to reduce the playback signal once the preset keyword is detected. In an effort to...

master thesis 2018

An Evaluation of Intrusive Instrumental Intelligibility Metrics

Van Kuyk, Steven (author), Kleijn, W.B. (author), Hendriks, R.C. (author)

Instrumental intelligibility metrics are commonly used as an alternative to listening tests. This paper evaluates 12 monaural intrusive intelligibility metrics: SII, HEGP, CSII, HASPI, NCM, QSTI, STOI, ESTOI, MIKNN, SIMI, SIIB, and sEPSM^corr. In addition, this paper investigates the ability of intelligibility metrics to generalize...

journal article 2018

A Low-Cost Robust Distributed Linearly Constrained Beamformer for Wireless Acoustic Sensor Networks with Arbitrary Topology

Koutrouvelis, A. (author), Sherson, T.W. (author), Heusdens, R. (author), Hendriks, R.C. (author)

We propose a new robust distributed linearly constrained beamformer which utilizes a set of linear equality constraints to reduce the cross power spectral density matrix to a block-diagonal form. The proposed beamformer has a convenient objective function for use in arbitrary distributed network topologies while having identical performance...

journal article 2018

Jointly optimal near-end and far-end multi-microphone speech intelligibility enhancement based on mutual information

Khademi, S. (author), Hendriks, R.C. (author), Kleijn, W.B. (author)

The processing required for the global maximization of the intelligibility of speech acquired by multiple microphones and rendered by a single loudspeaker is considered in this paper. The intelligibility is quantified based on the mutual information rate between the message spoken by the talker and the message as interpreted by the listener. We...

conference paper 2016

Distributed Speech Enhancement in Wireless Acoustic Sensor Networks

Zeng, Y. (author)

In digital speech communication applications like hands-free mobile telephony, hearing aids and human-to-computer communication systems, the recorded speech signals are typically corrupted by background noise. As a result, their quality and intelligibility can get severely degraded. Traditional noise reduction approaches process signals recorded...

doctoral thesis 2015

Post-Processing Method for Single Channel Speech Enhancement Systems

Stasinopoulos, V. (author)

This work boosts the performance of a conventional speech enhancement system by applying post-processing restoration modules. The speech production process is modeled with linear prediction analysis (LPA). This leads to a two-step problem: enhancement of the spectral envelope obtained after applying LPA to the output signal of a conventional speech...

master thesis 2009

Tracking of Nonstationary Noise Based on Data-Driven Recursive Noise Power Estimation

Erkelens, J.S. (author), Heusdens, R. (author)

This paper considers estimation of the noise spectral variance from speech signals contaminated by highly nonstationary noise sources. The method can accurately track fast changes in noise power level (up to about 10 dB/s). In each time frame, for each frequency bin, the noise variance estimate is updated recursively with the minimum mean-square...

journal article 2008

Noise Tracking Using DFT Domain Subspace Decompositions

Hendriks, R.C. (author), Jensen, J. (author), Heusdens, R. (author)

All discrete Fourier transform (DFT) domain-based speech enhancement gain functions rely on knowledge of the noise power spectral density (PSD). Since the noise PSD is unknown in advance, estimation from the noisy speech signal is necessary. An overestimation of the noise PSD will lead to a loss in speech quality, while an underestimation will...

journal article 2008

Advances in DFT-Based Single-Microphone Speech Enhancement

Hendriks, R.C. (author)

The interest in the field of speech enhancement emerges from the increased usage of digital speech processing applications like mobile telephony, digital hearing aids and human-machine communication systems in our daily life. The trend to make these applications mobile increases the variety of potential sources for quality degradation. Speech...

doctoral thesis 2008