Peng Xu
Please Note
16 records found
1
Athena
Accelerating KeySwitch and Bootstrapping for Fully Homomorphic Encryption on CUDA GPU
Power of union
Federated honey password vaults against differential attack
The honey password vault is a promising method for managing user passwords and mitigating password-guessing attacks by creating plausible-looking decoy password vaults. Recently, various methods, such as Chatterjee-PCFG (IEEE S&P’15), Golla-Markov (ACM CCS’16), and Cheng-IUV (USENIX Security’21), have been proposed to construct the cornerstone of honey password vaults, known as the distribution transforming encoder (DTE). These innovations significantly enhance the security and functionality of each kind of DTE. However, our findings indicate that when users employ multiple honey password vaults of distinct DTEs to manage their passwords, a passive attacker can easily compromise user passwords by exploiting differences among those DTEs. Consequently, we propose the differential attack targeting existing honey password vaults. The extensive experimental results confirm the effectiveness of this attack, distinguishing real from decoy password vaults with accuracy from 99.13% to 100.00%. In response, we design a novel, collaborative approach to train DTE, called federated DTE model, and construct a secure honey password vault. This strategy markedly bolsters security, reducing the differential attack's distinguishing accuracy to approximately 52.41%, nearing the ideal threshold of 50.00%. Our findings emphasize the need for collaborative strategies to maintain password security to combat advanced cyber threats.
Peekaboo, I See Your Queries
Passive Attacks Against DSSE Via Intermittent Observations
The resulting dataset comprises the following: (i) fluid motions, encompassing pressure, flow velocity and direction (at the bottom and throughout the entire water column), and wave patterns; (ii) near-bed environmental conditions, including temperature, salinity, and turbidity (at the bottom and across a near-bed 1-meter range); (iii) supplementary meteorological data sourced from credible providers; and (iv) preliminary results from post-processing, showcasing the practical application of the data, such as lateral flows and turbulent kinetic energy characterizations.
This dataset is especially valuable due to its extensive temporal and spatial coverage, as well as the high concentrations characterizing many of the observations (from several grams per liter to tens of grams per liter). Conducted annually from 2015 to 2018, the NP-ChaM campaign facilitated detailed observations of seasonal variations in environmental conditions and associated physical processes. The eight observational sites, positioned on either side of the deep channel, enable quantifications of channel–shoal exchanges, along-channel flow dynamics, and saltwater intrusion. This dataset is suitable for advancing our understanding of along-channel and cross-channel dynamics in a channel–shoal system and for calibrating numerical models. The dataset has undergone rigorous quality control to ensure reliability and accuracy. ...
The resulting dataset comprises the following: (i) fluid motions, encompassing pressure, flow velocity and direction (at the bottom and throughout the entire water column), and wave patterns; (ii) near-bed environmental conditions, including temperature, salinity, and turbidity (at the bottom and across a near-bed 1-meter range); (iii) supplementary meteorological data sourced from credible providers; and (iv) preliminary results from post-processing, showcasing the practical application of the data, such as lateral flows and turbulent kinetic energy characterizations.
This dataset is especially valuable due to its extensive temporal and spatial coverage, as well as the high concentrations characterizing many of the observations (from several grams per liter to tens of grams per liter). Conducted annually from 2015 to 2018, the NP-ChaM campaign facilitated detailed observations of seasonal variations in environmental conditions and associated physical processes. The eight observational sites, positioned on either side of the deep channel, enable quantifications of channel–shoal exchanges, along-channel flow dynamics, and saltwater intrusion. This dataset is suitable for advancing our understanding of along-channel and cross-channel dynamics in a channel–shoal system and for calibrating numerical models. The dataset has undergone rigorous quality control to ensure reliability and accuracy.
Query Recovery from Easy to Hard
Jigsaw Attack against SSE
Searchable symmetric encryption schemes often unintentionally disclose certain sensitive information, such as access, volume, and search patterns. Attackers can exploit such leakages and other available knowledge related to the user's database to recover queries. We find that the effectiveness of query recovery attacks depends on the volume/frequency distribution of keywords. Queries containing keywords with high volumes/frequencies are more susceptible to recovery, even when countermeasures are implemented. Attackers can also effectively leverage these “special” queries to recover all others. By exploiting the above finding, we propose a Jigsaw attack that begins by accurately identifying and recovering those distinctive queries. Leveraging the volume, frequency, and co-occurrence information, our attack achieves 90% accuracy in three tested datasets, which is comparable to previous attacks (Oya et al., USENIX' 22 and Damie et al., USENIX' 21). With the same runtime, our attack demonstrates an advantage over the attack proposed by Oya et al (approximately 15% more accuracy when the keyword universe size is 15k). Furthermore, our proposed attack outperforms existing attacks against widely studied countermeasures, achieving roughly 60% and 85% accuracy against the padding and the obfuscation, respectively. In this context, with a large keyword universe (≥3k), it surpasses current state-of-the-art attacks by more than 20%.
d-DSE
Distinct Dynamic Searchable Encryption Resisting Volume Leakage in Encrypted Databases
Dynamic Searchable Encryption (DSE) has emerged as a solution to efficiently handle and protect large-scale data storage in encrypted databases (EDBs). Volume leakage poses a significant threat, as it enables adversaries to reconstruct search queries and potentially compromise the security and privacy of data. Padding strategies are common countermeasures for the leakage, but they significantly increase storage and communication costs. In this work, we develop a new perspective on handling volume leakage. We start with distinct search and further explore a new concept called distinct DSE (d-DSE). We also define new security notions, in particular Distinct with Volume-Hiding security, as well as forward and backward privacy, for the new concept. Based on d-DSE, we construct the d-DSE designed EDB with related constructions for distinct keyword (d-KW-dDSE), keyword (KW-dDSE), and join queries (JOIN-dDSE) and update queries in encrypted databases. We instantiate a concrete scheme BF-SRE, employing Symmetric Revocable Encryption. We conduct extensive experiments on real-world datasets, such as Crime, Wikipedia, and Enron, for performance evaluation. The results demonstrate that our scheme is practical in data search and with comparable computational performance to the SOTA DSE scheme (MITRA*, AURA) and padding strategies (SEAL, ShieldDB). Furthermore, our proposal sharply reduces the communication cost as compared to padding strategies, with roughly 6.36 to 53.14x advantage for search queries.
The Power of Bamboo
On the Post-Compromise Security for Searchable Symmetric Encryption
Dynamic searchable symmetric encryption (DSSE) enables users to delegate the keyword search over dynamically updated encrypted databases to an honest-but-curious server without losing keyword privacy. This paper studies a new and practical security risk to DSSE, namely, secret key compromise (e.g., a user’s secret key is leaked or stolen), which threatens all the security guarantees offered by existing DSSE schemes. To address this open problem, we introduce the notion of searchable encryption with key-update (SEKU) that provides users with the option of non-interactive key updates. We further define the notion of post-compromise secure with respect to leakage functions to study whether DSSE schemes can still provide data security after the client’s secret key is compromised. We demonstrate that post-compromise security is achievable with a proposed protocol called “Bamboo”. Interestingly, the leakage functions of Bamboo satisfy the requirements for both forward and backward security. We conduct a performance evaluation of Bamboo using a real-world dataset and compare its runtime efficiency with the existing forward-and-backward secure DSSE schemes. The result shows that Bamboo provides strong security with better or comparable performance.
Groundwater Vulnerability in a Megacity Under Climate and Economic Changes
A Coupled Sociohydrological Analysis
Groundwater depletion has become increasingly challenging, and many cities worldwide have adopted drastic policies to relieve water stress due to socioeconomic growth. Located on the declining aquifer of the North China Plain, Beijing, for example, has developed plans to limit the size of the city’s population. However, the effect of population displacement under uncertain macroeconomic and climate change remains ambiguous. We adopt a sociohydrological model, with explicit consideration of the dynamics of human-water interactions, to explore the groundwater vulnerability of Beijing. We investigate how human response might shape the development trajectories of the groundwater-population-economy system under different macroscale economic and climate scenarios. Furthermore, we use a machine learning algorithm to identify the decisive factors to be considered for reducing groundwater vulnerability. Our results show that while rapid external economic development or larger annual average precipitation would enable recovery of the groundwater table in the short term, they may slacken human water shortage awareness and result in more acute groundwater depletion in the long run. Strengthening policymaker perceptions of groundwater depletion would prompt timely response policies for controlling population size. Improving the quantity and quality of labor force input to economic development would avoid downturns in the economy due to labor shortages. The outcomes of this study suggest that these strategies would effectively reduce groundwater vulnerability in the long run without causing severe socioeconomic recession. These findings highlight the importance of endogenizing human behavioral dynamics in sustainable urban water management.
It has become a trend for clients to outsource their encrypted databases to remote servers and then leverage the Searchable Encryption technique to perform secure data retrieval. However, the method has yet to be considered a crucial need for replication on searchable encrypted data. It calls for challenging works on Dynamic Searchable Symmetric Encryption (DSSE) since clients must share the search capability of the encrypted data replicas and guarantee forward and backward privacy. We define a new notion called 'Keyword Search Shareable Encryption' (KS2E2E) and the corresponding security model capturing forward and backward privacy. In our notion, data owners are allowed to share search indexes of the encrypted data with users. A search index will be updated with a new search key before sharing to guarantee the data privacy of the source database. The target database also inherits data search efficiency along with the shared data. We further construct an instance of KS2E called Branch, prove its security, and use real-world datasets to evaluate Branch. The evaluation results show that Branch's performance is comparable to classical DSSE schemes on search efficiency and demonstrate the effectiveness of searching encrypted data replicas from multiple owners.
High Recovery with Fewer Injections
Practical Binary Volumetric Injection Attacks against Dynamic Searchable Encryption
Searchable symmetric encryption enables private queries over an encrypted database, but it can also result in information leakages. Adversaries can exploit these leakages to launch injection attacks (Zhang et al., USENIX Security’16) to recover the underlying keywords from queries. The performance of the existing injection attacks is strongly dependent on the amount of leaked information or injection. In this work, we propose two new injection attacks, namely BVA and BVMA, by leveraging a binary volumetric approach. We enable adversaries to inject fewer files than the existing volumetric attacks by using the known keywords and reveal the queries by observing the volume of the query results. Our attacks can thwart well-studied defenses (e.g., threshold countermeasure, padding) without exploiting the distribution of target queries and client databases. We evaluate the proposed attacks empirically in real-world datasets with practical queries. The results show that our attacks can obtain a high recovery rate (> 80%) in the best-case scenario and a roughly 60% recovery even under a large-scale dataset with a small number of injections (< 20 files).
ROSE
Robust Searchable Encryption with Forward and Backward Security
Dynamic searchable symmetric encryption (DSSE) has been widely recognized as a promising technique to delegate update and search queries over an outsourced database to an untrusted server while guaranteeing the privacy of data. Many efforts on DSSE have been devoted to obtaining a good tradeoff between security and performance. However, it appears that all existing DSSE works miss studying on what will happen if the DSSE client issues irrational update queries carelessly, such as duplicate update queries and delete queries to remove non-existent entries (that have been considered by many popular database system in the setting of plaintext). In this scenario, we find that (1) most prior works lose their claimed correctness or security, and (2) no single approach can achieve correctness, forward and backward security, and practical performance at the same time. To address this problem, we study for the first time the notion of robustness of DSSE. Generally, we say that a DSSE scheme is robust if it can keep the same correctness and security even in the case of misoperations. Then, we introduce a new cryptographic primitive named key-updatable pseudo-random function and apply this primitive to constructing ROSE, a robust DSSE scheme with forward and backward security. Finally, we demonstrate the efficiency of ROSE and give the experimental comparisons.
DEKS
A Secure Cloud-Based Searchable Service Can Make Attackers Pay
Many practical secure systems have been designed to prevent real-world attacks via maximizing the attacking cost so as to reduce attack intentions. Inspired by this philosophy, we propose a new concept named delay encryption with keyword search (DEKS) to resist the notorious keyword guessing attack (KGA), in the context of secure cloud-based searchable services. Avoiding the use of complex (and unreasonable) assumptions, as compared to existing works, DEKS optionally leverages a catalyst that enables one (e.g., a valid data user) to easily execute encryption; without the catalyst, any unauthenticated system insiders and outsiders take severe time consumption on encryption. By this, DEKS can overwhelm a KGA attacker in the encryption stage before it obtains any advantage. We leverage the repeated squaring function, which is the core building block of our design, to construct the first DEKS instance. The experimental results show that DEKS is practical in thwarting KGA for both small and large-scale datasets. For example, in the Wikipedia, a KGA attacker averagely takes 7.23 years to break DEKS when the delay parameter T= 2 24. The parameter T can be flexibly adjusted based on practical needs, and theoretically, its upper bound is infinite.
Falcon
Malware Detection and Categorization with Network Traffic Images
Android is the most popular smartphone operating system. At the same time, miscreants have already created malicious apps to find new victims and infect them. Unfortunately, existing anti-malware procedures have become obsolete, and thus novel Android malware techniques are in high demand. In this paper, we present Falcon, an Android malware detection and categorization framework. More specifically, we treat the network traffic classification task as a 2D image sequence classification and handle each network packet as a 2D image. Furthermore, we use a bidirectional LSTM network to process the converted 2D images to obtain the network vectors. We then utilize those converted vectors to detect and categorize the malware. Our results reveal that Falcon could be an accurate and viable solution as we get 97.16% accuracy on average for the malware detection and 88.32% accuracy for the malware categorization.
Hybroid
Toward Android Malware Detection and Categorization with Program Code and Network Traffic
Android malicious applications have become so sophisticated that they can bypass endpoint protection measures. Therefore, it is safe to admit that traditional anti-malware techniques have become cumbersome, thereby raising the need to develop efficient ways to detect Android malware. In this paper, we present Hybroid, a hybrid Android malware detection and categorization solution that utilizes program code structures as static behavioral features and network traffic as dynamic behavioral features for detection (binary classification) and categorization (multi-label classification). For static analysis, we introduce a natural-language-processing-inspired technique based on function call graph embeddings and design a graph-neural-network-based approach to convert the whole graph structure of an Android app to a vector. For dynamic analysis, we extract network flow features from the raw network traffic by capturing each application’s network flow. Finally, Hybroid utilizes the network flow features combined with the graphs’ vectors to detect and categorize the malware. Our solution demonstrates 97.0% accuracy on average for malware detection and 94.0% accuracy for malware categorization. Also, we report remarkable results in different performance metrics such as F1-score, precision, recall, and AUC.
Android is the most dominant operating system in the mobile ecosystem. As expected, this trend did not go unnoticed by miscreants, and quickly enough, it became their favorite platform for discovering new victims through malicious apps. These apps have become so sophisticated that they can bypass anti-malware measures implemented to protect the users. Therefore, it is safe to admit that traditional anti-malware techniques have become cumbersome, sparking the urge to come up with an efficient way to detect Android malware. In this paper, we present a novel Natural Language Processing (NLP) inspired Android malware detection and categorization technique based on Function Call Graph Embedding. We design a graph neural network (graph embedding) based approach to convert the whole graph structure of an Android app to a vector. We then utilize the graphs' vectors to detect and categorize the malware families. Our results reveal that graph embedding yields better results as we get 99.6% accuracy on average for the malware detection and 98.7% accuracy for the malware categorization.
HawkEye
Cross-Platform Malware Detection with Representation Learning on Graphs
Malicious software, widely known as malware, is one of the biggest threats to our interconnected society. Cybercriminals can utilize malware to carry out their nefarious tasks. To address this issue, analysts have developed systems that can prevent malware from successfully infecting a machine. Unfortunately, these systems come with two significant limitations. First, they frequently target one specific platform/architecture, and thus, they cannot be ubiquitous. Second, code obfuscation techniques used by malware authors can negatively influence their performance. In this paper, we design and implement HawkEye, a control-flow-graph-based cross-platform malware detection system, to tackle the problems mentioned above. In more detail, HawkEye utilizes a graph neural network to convert the control flow graphs of executable to vectors with the trainable instruction embedding and then uses a machine-learning-based classifier to create a malware detection system. We evaluate HawkEye by testing real samples on different platforms and operating systems, including Linux (x86, x64, and ARM-32), Windows (x86 and x64), and Android. The results outperform most of the existing works with an accuracy of 96.82% on Linux, 93.39% on Windows, and 99.6% on Android. To the best of our knowledge, HawkEye is the first approach to consider graph neural networks in the malware detection field, utilizing natural language processing.