Dimme de Groot

3 records found

The Auditory Kernels of Bat Vocalizations

Conference paper (2026) - A. Savova, D. de Groot, J. Martinez
The efficient coding hypothesis posits that biological sensory systems maximize information transfer to the brain while minimizing neural resources. Although extensively studied in humans, its role in non-human auditory perception remains relatively unexplored. Here, we apply sparse coding to bat echolocation calls to test whether their vocalizations are intrinsically optimized for efficient representation. Unlike prior bat studies using black-box models, our approach examines how acoustic selectivity can emerge in early auditory structures from call structure alone, independent of higher-level neural processing. The learned kernel representations are compact, sparse, and functionally specialized, with distinct activation profiles encoding specific call shapes. These findings suggest that bat auditory systems are tuned to conspecific vocalizations and underscore the advantages of sparse coding over traditional signal representations. They also improve the interpretability of animal auditory processing and provide a computational basis for modeling animal signals, supporting future research in interspecies communication and decoding animal vocalizations. ...
Conference paper (2025) - D. de Groot, B. Karslioglu, O. Scharenborg, J. Martinez
In this paper we propose a robust loudspeaker beamforming algorithm that enhances the performance of voice-driven applications in scenarios where the loudspeakers introduce the majority of the noise, e.g. when music is playing loudly. The loudspeaker beamformer modifies the loudspeaker playback signals to create a low-acoustic-energy region around the device that implements automatic speech recognition for a voice-driven application (VDA). The algorithm uses a distortion measure based on human auditory perception to limit the distortion perceived by human listeners. Simulations and real-world experiments show that the proposed loudspeaker beamformer improves speech recognition performance in all tested scenarios. Moreover, the algorithm allows the acoustic energy around the VDA device to be reduced further at the expense of reduced objective audio quality at the listener’s location ...
Dysarthric speech poses significant challenges for automatic speech recognition (ASR) systems due to its high variability and reduced intelligibility. In this work we explore the use of diffusion models for dysarthric speech enhancement, based on the hypothesis that diffusion-based speech enhancement moves the distribution of dysarthric speech closer to that of typical speech, which could improve dysarthric speech recognition performance. We assess the effect of two diffusion-based and one signal-processing-based speech enhancement algorithms on the intelligibility and speech quality of two English dysarthric speech corpora. We apply speech enhancement to both typical and dysarthric speech and evaluate ASR performance using Whisper-Turbo, as well as the subjective and objective speech quality of the original and enhanced dysarthric speech. We also fine-tune Whisper-Turbo on the enhanced speech to assess its impact on recognition performance. ...