Auditory Kernels for Representing Degraded Speech

Auditory Kernels in an Efficient Representation of Degraded Speech

Bachelor Thesis (2025)

Author(s)

B. Karslıoğlu (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Dimme de Groot – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Jorge Abraham Martinez Castaneda – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Speech processing, Auditory kernels, Speech degradation

To reference this document use

https://resolver.tudelft.nl/uuid:4cfcac57-22ec-4ecc-b133-bfa5db2babc3

More Info

Publication Year

2025

Language

English

Graduation Date

25-06-2025

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

We explore the use of biologically inspired auditory kernels, learned by sparse coding on clean read speech, to analyze and reconstruct signals degraded by additive noise. Auditory kernels mimic spectrotemporal filters in the human auditory system, offering insight into how structured acoustic signals can be internally represented and selectively preserved. Our study applies an auditory-kernel-based matching pursuit reconstruction framework to clean speech, degraded speech, and standalone noise, and investigates kernel activation patterns across these input types. The findings reveal kernel selectivity: structured signals like speech activate a common subset of kernels, while unstructured noise elicits distinct, less overlapping activations, which allows more effective separation. This selectivity results in implicit denoising, preserving intelligibility and perceptual quality even under degradation. By quantifying this behavior across noise types and SNR levels, we show that auditory kernels not only support robust signal reconstruction but also offer a biologically grounded, explainable mechanism for speech enhancement. These insights advance the use of sparse auditory models in both neuroscience and signal processing, motivating future work on adaptive or context-aware dictionaries.
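
As a rough illustration of the kind of matching pursuit reconstruction the abstract refers to, the following is a minimal sketch of greedy, shift-invariant matching pursuit over a small kernel dictionary. It is not the thesis implementation: the kernels here are hand-made toy vectors rather than kernels learned from speech, and all names (`normalize`, `matching_pursuit`, `reconstruct`, `kernel_usage`, `toy_kernels`, `toy_signal`) are hypothetical. The per-kernel activation count at the end is one simple way to compare which kernels different input types recruit.

```python
import math


def normalize(kernel):
    """Scale a kernel to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in kernel))
    return [v / norm for v in kernel]


def matching_pursuit(signal, kernels, n_atoms):
    """Greedy shift-invariant matching pursuit (illustrative sketch).

    Assumes every kernel has unit norm. Returns (events, residual),
    where each event is a tuple (kernel_index, shift, amplitude).
    """
    residual = list(signal)
    events = []
    for _ in range(n_atoms):
        best = None  # (|correlation|, kernel index, shift, correlation)
        for k, kernel in enumerate(kernels):
            for shift in range(len(residual) - len(kernel) + 1):
                corr = sum(kernel[i] * residual[shift + i]
                           for i in range(len(kernel)))
                if best is None or abs(corr) > best[0]:
                    best = (abs(corr), k, shift, corr)
        # Stop when nothing in the dictionary correlates with the residual.
        if best is None or best[0] < 1e-12:
            break
        _, k, shift, amp = best
        # Subtract the selected, scaled kernel from the residual.
        for i, v in enumerate(kernels[k]):
            residual[shift + i] -= amp * v
        events.append((k, shift, amp))
    return events, residual


def reconstruct(events, kernels, length):
    """Sum the scaled, shifted kernels back into a signal."""
    out = [0.0] * length
    for k, shift, amp in events:
        for i, v in enumerate(kernels[k]):
            out[shift + i] += amp * v
    return out


def kernel_usage(events, n_kernels):
    """Count how often each kernel was selected."""
    counts = [0] * n_kernels
    for k, _, _ in events:
        counts[k] += 1
    return counts


# Toy example: two hand-made kernels (assumption, not learned from speech).
toy_kernels = [normalize([1.0, 2.0, 1.0]), normalize([1.0, -1.0])]

# Signal containing kernel 0 at sample offset 2 with amplitude 3.
toy_signal = [0.0] * 10
for i, v in enumerate(toy_kernels[0]):
    toy_signal[2 + i] += 3.0 * v

events, residual = matching_pursuit(toy_signal, toy_kernels, n_atoms=3)
usage = kernel_usage(events, len(toy_kernels))
```

Running the same decomposition on clean speech, degraded speech, and noise alone, and then comparing the resulting `usage` vectors, would be one simple way to look at the kernel selectivity described above. The exhaustive search over kernels and shifts is chosen for readability; a practical implementation would typically use cross-correlation routines and update only the affected correlations after each subtraction.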

Files

Research_Paper_Auditory_Kernel... (pdf)

(pdf | 1.13 MB)

License info not available