Query Recovery from Easy to Hard

Jigsaw Attack against SSE

Conference Paper (2024)
Authors

Hao Nie (Huazhong University of Science and Technology)

Wei Wang (Huazhong University of Science and Technology)

Peng Xu (Jinyinhu Laboratory, Huazhong University of Science and Technology)

Xianglong Zhang (Huazhong University of Science and Technology)

Laurence T. Yang (St. Francis Xavier University, Huazhong University of Science and Technology)

Kaitai Liang (TU Delft - Cyber Security)

Research Group
Cyber Security
More Info
expand_more
Publication Year
2024
Language
English
Research Group
Cyber Security
Pages (from-to)
2599-2616
ISBN (electronic)
9781939133441
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Searchable symmetric encryption schemes often unintentionally disclose certain sensitive information, such as access, volume, and search patterns. Attackers can exploit such leakages and other available knowledge related to the user's database to recover queries. We find that the effectiveness of query recovery attacks depends on the volume/frequency distribution of keywords. Queries containing keywords with high volumes/frequencies are more susceptible to recovery, even when countermeasures are implemented. Attackers can also effectively leverage these “special” queries to recover all others. By exploiting the above finding, we propose a Jigsaw attack that begins by accurately identifying and recovering those distinctive queries. Leveraging the volume, frequency, and co-occurrence information, our attack achieves 90% accuracy in three tested datasets, which is comparable to previous attacks (Oya et al., USENIX' 22 and Damie et al., USENIX' 21). With the same runtime, our attack demonstrates an advantage over the attack proposed by Oya et al (approximately 15% more accuracy when the keyword universe size is 15k). Furthermore, our proposed attack outperforms existing attacks against widely studied countermeasures, achieving roughly 60% and 85% accuracy against the padding and the obfuscation, respectively. In this context, with a large keyword universe (≥3k), it surpasses current state-of-the-art attacks by more than 20%.