Searched for: subject:"speech enhancement"
(1 - 17 of 17)
document
Li, C. (author), Martinez, Jorge (author), Hendriks, R.C. (author)
Many multi-microphone algorithms depend on knowing the relative acoustic transfer functions (RTFs) of the individual sound sources in the acoustic scene. However, accurate joint RTF estimation for multiple sources is a challenging problem. Existing methods to jointly estimate the RTFs for multiple sources either have unsatisfactory performance, or...
conference paper 2022
document
Chen, Hang (author), Du, Jun (author), Dai, Yusheng (author), Lee, Chin Hui (author), Siniscalchi, Sabato Marco (author), Watanabe, Shinji (author), Scharenborg, O.E. (author), Chen, Jingdong (author), Yin, Bao Cai (author), Pan, Jia (author)
In this paper, we present the updated Audio-Visual Speech Recognition (AVSR) corpus of the MISP2021 challenge, a large-scale audio-visual Chinese conversational corpus consisting of 141 h of audio and video data collected by far/middle/near microphones and far/middle cameras in 34 real-home TV rooms. To the best of our knowledge, our corpus is the first...
journal article 2022
document
Prananta, Luke (author)
master thesis 2021
document
Sathyapriyan, V. (author)
For people with hearing impairment, it is important to have good speech intelligibility while also being able to localise the sound sources. Many beamforming algorithms for hearing aids have been proposed that minimise the noise in combination with spatial scene preservation of the target and the interferers. By constraining the spatial cues...
master thesis 2020
document
Zhang, Jie (author), Chen, Huawei (author), Hendriks, R.C. (author)
Multi-microphone speech enhancement methods typically require a reference position with respect to which the target signal is estimated. Often, this reference position is arbitrarily chosen as one of the reference microphones. However, it has been shown that the choice of the reference microphone can have a significant impact on the final...
journal article 2020
document
Kapadia, Husain (author)
Listening in noise is a challenging problem that affects the hearing capability of not only normal-hearing but especially hearing-impaired people. Over the last four decades, enhancing the quality and intelligibility of noise-corrupted speech by reducing the effect of noise has been addressed using statistical signal processing techniques as...
master thesis 2019
document
Luppes, Bob (author), Riemens, Ellen (author)
Several algorithms to enhance the intelligibility of speech in near-end noise were analyzed and implemented. The algorithms considered were assessed based on the intrusive instrumental intelligibility metric SIIB_Gauss. An implementation based on the direct optimization for this metric is assessed, as well as an implementation based on human...
bachelor thesis 2019
document
Koutrouvelis, A. (author), Hendriks, R.C. (author), Heusdens, R. (author), Jensen, Jesper (author)
One of the biggest challenges in multimicrophone applications is the estimation of the parameters of the signal model, such as the power spectral densities (PSDs) of the sources, the early (relative) acoustic transfer functions of the sources with respect to the microphones, the PSD of late reverberation, and the PSDs of microphone-self noise...
journal article 2019
document
Sachos, Kostas (author)
Human interaction with a smart speaker often involves distant automatic speech recognition (ASR). However, ASR is a rather cumbersome task at significantly high levels of noise. To achieve high ASR accuracy, most commercial smart speakers reduce the playback signal once the preset keyword is detected. In an effort to...
master thesis 2018
document
Koutrouvelis, A. (author), Sherson, T.W. (author), Heusdens, R. (author), Hendriks, R.C. (author)
We propose a new robust distributed linearly constrained beamformer which utilizes a set of linear equality constraints to reduce the cross power spectral density matrix to a block-diagonal form. The proposed beamformer has a convenient objective function for use in arbitrary distributed network topologies while having identical performance...
journal article 2018
document
Van Kuyk, Steven (author), Kleijn, W.B. (author), Hendriks, R.C. (author)
Instrumental intelligibility metrics are commonly used as an alternative to listening tests. This paper evaluates 12 monaural intrusive intelligibility metrics: SII, HEGP, CSII, HASPI, NCM, QSTI, STOI, ESTOI, MIKNN, SIMI, SIIB, and sEPSM^corr. In addition, this paper investigates the ability of intelligibility metrics to generalize...
journal article 2018
document
Khademi, S. (author), Hendriks, R.C. (author), Kleijn, W.B. (author)
The processing required for the global maximization of the intelligibility of speech acquired by multiple microphones and rendered by a single loudspeaker is considered in this paper. The intelligibility is quantified based on the mutual information rate between the message spoken by the talker and the message as interpreted by the listener. We...
conference paper 2016
document
Zeng, Y. (author)
In digital speech communication applications like hands-free mobile telephony, hearing aids and human-to-computer communication systems, the recorded speech signals are typically corrupted by background noise. As a result, their quality and intelligibility can get severely degraded. Traditional noise reduction approaches process signals recorded...
doctoral thesis 2015
document
Stasinopoulos, V. (author)
Boosting the performance of a conventional speech enhancement system by applying post-processing restoration modules. The speech production process is modeled with linear prediction analysis (LPA). This leads to a two-step problem: enhancement of the spectral envelope obtained after applying LPA to the output signal of a conventional speech...
master thesis 2009
document
Erkelens, J.S. (author), Heusdens, R. (author)
This paper considers estimation of the noise spectral variance from speech signals contaminated by highly nonstationary noise sources. The method can accurately track fast changes in noise power level (up to about 10 dB/s). In each time frame, for each frequency bin, the noise variance estimate is updated recursively with the minimum mean-square...
journal article 2008
document
Hendriks, R.C. (author), Jensen, J. (author), Heusdens, R. (author)
All discrete Fourier transform (DFT) domain-based speech enhancement gain functions rely on knowledge of the noise power spectral density (PSD). Since the noise PSD is unknown in advance, estimation from the noisy speech signal is necessary. An overestimation of the noise PSD will lead to a loss in speech quality, while an underestimation will...
journal article 2008
document
Hendriks, R.C. (author)
The interest in the field of speech enhancement emerges from the increased usage of digital speech processing applications like mobile telephony, digital hearing aids and human-machine communication systems in our daily life. The trend to make these applications mobile increases the variety of potential sources for quality degradation. Speech...
doctoral thesis 2008