For your voice only

Exploiting side channels in voice messaging for environment detection

Abstract

Voice messages are an increasingly popular method of communication, accounting for more than 200 million messages a day. Sending an audio message requires less effort than texting and enriches the message with emotional context (e.g., irony). Unfortunately, we suspect that voice messages might reveal much more information than intended. In fact, speech sound waves are not only recorded directly by the microphone but also propagate into the environment and may be reflected back to it. These reflected waves, along with ambient noise, are recorded as well and sent as part of the voice message. In this thesis, we propose a novel attack that infers detailed information about the user's location (e.g., a specific room) from a simple WhatsApp voice message. We demonstrate our attack on 7,200 voice messages from 15 different users and four environments (i.e., three bedrooms and a terrace). We consider three realistic attack scenarios that differ in the attacker's prior knowledge of the victim and the environment. Our thorough experimental results demonstrate the feasibility and efficacy of the proposed attack. We can infer the location of the user among a pool of four known environments with 85% accuracy. Moreover, our approach reaches an average accuracy of 93% in distinguishing between two rooms of similar size and furniture (i.e., two bedrooms), and an accuracy of up to 99% in classifying indoor versus outdoor environments.