On speech enhancement in very low SNRs for smart speakers

Master thesis (2018)

Authors

K.A. Sachos (Electrical Engineering, Mathematics and Computer Science)

Contributors

R. Heusdens (supervisor 1)

Martin Bo Møller (supervisor 1)

Pablo Martinez-Nuevo (supervisor 1)

Jesper Kjaer Nielsen (supervisor 1)

Faculty

Electrical Engineering, Mathematics and Computer Science

Adaptive Filtering, Speech enhancement, Smart speakers, Speech separation

To reference this document use:

http://resolver.tudelft.nl/uuid:67e1ea69-6d46-4d0c-9840-a27c0b126854

Published Date

19-10-2018

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Human interaction with a smart speaker often involves distant automatic speech recognition (ASR). However, ASR is a difficult task at high noise levels. To achieve high ASR accuracy, most commercial smart speakers reduce the playback signal once the preset keyword is detected. In an effort to remove the need for this function, this thesis considers a speech enhancement technique in the front-end of the ASR system that aims to suppress the dominant noise component in the degraded speech signal. A priori knowledge of the playback signal makes adaptive filtering a well-suited speech enhancement technique. Therefore, the class of least mean squares (LMS) algorithms is studied and assessed. Among the techniques of this class, the transform domain LMS (TDLMS), due to its inherent signal decorrelation properties, is shown to achieve the best performance in terms of noise suppression, improved speech intelligibility, and word error rate. The results of this study are based on a set of simulations incorporating real impulse responses measured in both an anechoic and a reverberant environment.
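
As a rough illustration of why a known playback signal suits adaptive filtering, the Python sketch below uses a normalized LMS (NLMS) filter, one member of the LMS class, to estimate the playback-to-microphone path and subtract the estimated playback from the microphone signal. It is not the TDLMS method evaluated in the thesis. The function names, the four-tap impulse response, the step size, the filter length, and the white-noise stand-in for speech are all assumptions made for this example.

```python
import random


def nlms_cancel(playback, mic, num_taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively filtered copy of the known playback from the mic signal."""
    w = [0.0] * num_taps  # running estimate of the playback-to-mic path
    x = [0.0] * num_taps  # most recent playback samples, newest first
    enhanced = []
    for p, d in zip(playback, mic):
        x = [p] + x[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x))  # estimated playback at the mic
        e = d - y                                  # residual, used as enhanced output
        norm = eps + sum(xi * xi for xi in x)      # input power for normalization
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        enhanced.append(e)
    return enhanced, w


def mean_power(signal):
    return sum(v * v for v in signal) / len(signal)


def demo(n=4000, seed=0):
    rng = random.Random(seed)
    path = [0.8, -0.4, 0.2, 0.1]  # hypothetical impulse response (assumption)
    playback = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Weak white noise stands in for the target speech (assumption, very low SNR).
    speech = [0.05 * rng.gauss(0.0, 1.0) for _ in range(n)]
    mic = []
    for i in range(n):
        echo = sum(path[k] * playback[i - k] for k in range(len(path)) if i - k >= 0)
        mic.append(echo + speech[i])
    enhanced, w = nlms_cancel(playback, mic)
    half = n // 2  # compare powers after the filter has had time to converge
    return mean_power(mic[half:]), mean_power(enhanced[half:]), w


if __name__ == "__main__":
    mic_power, enhanced_power, weights = demo()
    print("mic power:", mic_power)
    print("enhanced power:", enhanced_power)
    print("estimated path:", [round(v, 2) for v in weights[:4]])
```

In this toy setting the playback is white, so plain NLMS converges quickly. Real playback such as music is strongly correlated, which is the situation where the decorrelation properties of TDLMS mentioned in the abstract become relevant.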

Files

ThesisReport_K.Sachos_final_.p... (.pdf)

(.pdf | 1.52 Mb)