Revealing Hidden Conversations in Privacy-Sensitive Audio Using Neural Networks

Abstract

With the widespread use of advanced technology for recording, storing and sharing social interactions, protecting people's privacy has become a growing concern. This paper focuses on the collection of spoken audio with regard to the privacy of the recorded individuals. Recently, efforts have been made to collect audio at a low sampling rate in order to obfuscate the spoken words, such that conversations are kept private. This research investigates whether it is possible to upsample this low-resolution audio, using an existing super-resolution model, in order to reveal parts of the previously obfuscated conversations. The performance of the model is measured in terms of the word error rate of automatically generated transcriptions of the upsampled audio. The results show that upsampling can significantly increase the intelligibility of low-resolution privacy-sensitive audio. However, the usefulness of the super-resolution model appears to be limited when it comes to revealing significant parts of conversations.
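For readers unfamiliar with the evaluation metric, the word error rate can be sketched as the word-level edit distance between a reference transcript and an automatically generated one, divided by the number of reference words. The snippet below is a minimal illustration under the standard definition, not the evaluation code used in this research; the function name `word_error_rate` and the lowercase, whitespace-based tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: transcripts are lowercased and split on whitespace; the
    normalization used in the actual study is not specified here.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition a lower value means a more intelligible transcription, and the value can exceed 1.0 when the hypothesis contains many inserted words.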