Revealing Hidden Conversations in Privacy-Sensitive Audio Using Neural Networks

Bachelor Thesis (2022)

Author(s)

P.J. Vunderink (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

H.S. Hung – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

J.D. Vargas Quiros – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

J.A. Baaijens – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Audio Privacy; Super-resolution; Bandwidth extensions; Artificial bandwidth extension

To reference this document use

https://resolver.tudelft.nl/uuid:ebd4d2bb-ddfb-470b-93f3-72a18e2b6fe8

Publication Year

2022

Language

English

Graduation Date

24-06-2022

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Downloads counter

281

Collections

thesis

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

With the widespread use of advanced technology for recording, storing, and sharing social interactions, protecting people's privacy has become a growing concern. This paper focuses on the collection of spoken audio with regard to the privacy of the recorded individuals. Recently, efforts have been made to collect audio at a low sampling rate to obfuscate the spoken words in the recording, so that conversations are kept private. This research investigates whether such low-resolution audio can be upsampled with an existing super-resolution model to reveal parts of the previously obfuscated conversations. The performance of the model is measured in terms of the word error rate of automatically generated transcriptions of the upsampled audio. The results show that upsampling can significantly increase the intelligibility of low-resolution privacy-sensitive audio, although the super-resolution model appears to be of limited use when it comes to revealing significant parts of conversations.
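
For readers unfamiliar with the metric, the word error rate named in the abstract is conventionally defined as (S + D + I) / N, where S, D, and I are the numbers of substituted, deleted, and inserted words needed to turn the reference transcript into the automatic transcript, and N is the number of words in the reference. The following is a minimal sketch of that definition in Python. It is not the evaluation code used in the thesis; the function name `word_error_rate` and the lowercase, whitespace-based tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Illustrative helper (hypothetical name); tokenization by lowercasing and
    splitting on whitespace is an assumption, not the thesis's procedure.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A lower value means the transcription of the upsampled audio is closer to what was actually said. Because inserted words count as errors, the rate can exceed 1.0 for a transcript that is much longer than the reference.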

Files

RP_Thesis_Pepijn_Vunderink.pdf

(pdf | 1.48 MB)

License info not available