Laughter detection in privacy-sensitive audio

Abstract

With the development of new technologies and approaches in the field of social signal processing, concerns about the privacy of recorded conversations have arisen. One of the main ways to preserve the privacy of the speakers in recorded conversations is to decimate the audio, that is, to reduce its sampling frequency and therefore its frequency bandwidth. In theory, this makes the verbal content of the conversation (the words themselves) unintelligible while still preserving useful non-verbal social cues such as laughter, pitch modulation and frequency of speech, among others. However, this assumption has not been experimentally verified. This research paper addresses this knowledge gap by exploring how laughter detection machine learning models perform on decimated audio. An existing pre-trained state-of-the-art laughter detection model was employed, and its performance was evaluated on a dataset of decimated audio with sampling frequencies ranging from 300 Hz to 44100 Hz.
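
As a rough illustration of what decimation involves, the sketch below low-pass filters a signal below the new Nyquist frequency and then keeps every N-th sample. It is a minimal example and not the pipeline used in this study: the function names `lowpass_kernel` and `decimate`, the Hamming-windowed sinc filter, the filter length, and the restriction to integer rate ratios are all assumptions made for illustration.

```python
import math


def lowpass_kernel(cutoff_hz, sample_rate, num_taps):
    """Hamming-windowed sinc low-pass FIR kernel, normalised to unity DC gain.

    Hypothetical helper; num_taps is assumed to be odd so the kernel is
    symmetric around its centre tap.
    """
    fc = cutoff_hz / sample_rate  # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    kernel = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2.0 * fc
        else:
            sinc = math.sin(2.0 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        kernel.append(sinc * window)
    total = sum(kernel)
    return [k / total for k in kernel]


def decimate(samples, source_rate, target_rate):
    """Reduce the sampling rate of `samples` by an integer factor.

    Assumption: source_rate is an integer multiple of target_rate
    (for example 44100 Hz -> 300 Hz is a factor of 147).
    """
    if target_rate <= 0 or source_rate % target_rate != 0:
        raise ValueError("source_rate must be an integer multiple of target_rate")
    factor = source_rate // target_rate
    if factor == 1:
        return list(samples)

    # Filter length grows with the factor so the transition band narrows
    # as the target bandwidth shrinks (an arbitrary but simple choice).
    num_taps = 8 * factor + 1
    kernel = lowpass_kernel(target_rate / 2.0, source_rate, num_taps)
    half = num_taps // 2

    out = []
    # Evaluate the filter only at the samples that are kept.
    for start in range(0, len(samples), factor):
        acc = 0.0
        for k, coeff in enumerate(kernel):
            idx = start + k - half
            if 0 <= idx < len(samples):  # treat samples outside the signal as zero
                acc += coeff * samples[idx]
        out.append(acc)
    return out
```

Under these assumptions, `decimate(samples, 44100, 300)` would keep one filtered sample out of every 147, leaving only content below roughly 150 Hz. The filter here is deliberately simple, so suppression of aliased content near the cutoff is only approximate.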