Revealing Hidden Conversations in Privacy-Sensitive Audio Using Neural Networks

Bachelor Thesis (2022)
Author(s)

P.J. Vunderink (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Hayley Hung – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

J.D. Vargas Quiros – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Jasmijn A. Baaijens – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Pepijn Vunderink
Publication Year
2022
Language
English
Graduation Date
24-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

With the widespread use of advanced technology for recording, storing and sharing social interactions, protecting people's privacy has become a growing concern. This paper focuses on collecting spoken audio while respecting the privacy of the recorded individuals. Recently, efforts have been made to collect audio at a low sampling rate in order to obfuscate the spoken words, so that conversations are kept private. This research investigates whether such low-resolution audio can be upsampled with an existing super-resolution model to reveal parts of the previously obfuscated conversations. The performance of the model is measured in terms of the word error rate of automatically generated transcriptions of the upsampled audio. The results show that upsampling can significantly increase the intelligibility of low-resolution privacy-sensitive audio. However, the super-resolution model appears to be of limited use when it comes to revealing significant parts of conversations.
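
For readers unfamiliar with the metric, the word error rate mentioned in the abstract is commonly defined as the number of word substitutions, deletions and insertions needed to turn the automatic transcription into the reference transcription, divided by the number of words in the reference. The following is a minimal sketch of that standard definition, not the thesis's own evaluation code. The function name `word_error_rate` and the simple lowercase, whitespace-based tokenisation are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name); tokenisation is a plain
    lowercase whitespace split, which is an assumption of this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up transcripts, used only to show the call.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A lower value indicates a more intelligible transcription. Because insertions are counted, the value can exceed 1 when the hypothesis is much longer than the reference.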
