Peng Xu | TU Delft Repository

Athena

Accelerating KeySwitch and Bootstrapping for Fully Homomorphic Encryption on CUDA GPU

Conference paper (2026) - Yifan Yang, Kexin Zhang, Peng Xu, Zhaojun Lu, Wei Wang, Weiqi Wang, Kaitai Liang

Fully Homomorphic Encryption (FHE) enables computation over encrypted data, but it faces significant challenges in practical implementation due to its high computational costs, particularly in HMult, HRot, and Bootstrapping operations. This work presents Athena, an accelerated FHE system built on GPUs with a new algorithm-hardware co-design approach. Specifically, to accelerate HMult, HRot, and Bootstrapping, we redesign their common and expensive operation KeySwitch, based on the KLSS method proposed by Kim et al. in CRYPTO’23, and accelerate its core operations, namely NTT, EBConv, and IP. We further optimize the dataflow of Bootstrapping by reducing redundant EBConv and (I)NTT operations, and by improving the global memory I/O in the double-hoisting-based C2S/S2C operation. Moreover, Athena is designed as a general-purpose system that supports various cryptographic parameters. Experimental results demonstrate that Athena significantly improves the performance of KeySwitch and Bootstrapping. In particular, Athena’s accelerated KeySwitch speeds up HMult by 2.17× to 4.40× and HRot by 1.89× to 4.54× compared to TensorFHE (HPCA’23), Poseidon (HPCA’23), and FAB (HPCA’23). Besides, Athena’s Bootstrapping outperforms TensorFHE by nearly 2.74×. ...
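
The abstract names the NTT as one of the core operations of KeySwitch. As a rough illustration of what that operation computes, the Python sketch below multiplies two polynomials through a number-theoretic transform. The toy modulus Q = 17, length N = 8, root OMEGA = 2, the quadratic-time transform, and all function names are assumptions chosen for readability. FHE libraries typically use negacyclic transforms with far larger parameters, and this is not Athena's GPU implementation.

```python
# Toy parameters (assumptions for illustration only).
Q = 17      # small prime modulus
N = 8       # transform length; N divides Q - 1
OMEGA = 2   # primitive N-th root of unity mod Q (2**8 % 17 == 1, 2**4 % 17 == 16)


def ntt(a, root):
    """Naive O(N^2) number-theoretic transform of a length-N list."""
    return [sum(a[j] * pow(root, i * j, Q) for j in range(N)) % Q
            for i in range(N)]


def intt(a):
    """Inverse transform: use the inverse root and scale by N^-1 mod Q."""
    inv_n = pow(N, -1, Q)          # modular inverse, Python 3.8+
    inv_root = pow(OMEGA, -1, Q)
    return [(x * inv_n) % Q for x in ntt(a, inv_root)]


def cyclic_mul(a, b):
    """Multiply modulo (X^N - 1, Q): transform, multiply pointwise, invert."""
    fa, fb = ntt(a, OMEGA), ntt(b, OMEGA)
    return intt([(x * y) % Q for x, y in zip(fa, fb)])


def schoolbook_cyclic_mul(a, b):
    """Reference O(N^2) cyclic convolution used to cross-check cyclic_mul."""
    out = [0] * N
    for i in range(N):
        for j in range(N):
            out[(i + j) % N] = (out[(i + j) % N] + a[i] * b[j]) % Q
    return out
```

The pointwise product in the transform domain is what makes the NTT attractive: a fast transform replaces the quadratic convolution loop, which is why accelerating it matters for KeySwitch.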

Power of union

Federated honey password vaults against differential attack

Journal article (2025) - Peng Xu, Tingting Rao, Wei Wang, Zhaojun Lu, Kaitai Liang

The honey password vault is a promising method for managing user passwords and mitigating password-guessing attacks by creating plausible-looking decoy password vaults. Recently, various methods, such as Chatterjee-PCFG (IEEE S&P’15), Golla-Markov (ACM CCS’16), and Cheng-IUV (USENIX Security’21), have been proposed to construct the cornerstone of honey password vaults, known as the distribution transforming encoder (DTE). These innovations significantly enhance the security and functionality of each kind of DTE. However, our findings indicate that when users employ multiple honey password vaults of distinct DTEs to manage their passwords, a passive attacker can easily compromise user passwords by exploiting differences among those DTEs. Consequently, we propose the differential attack targeting existing honey password vaults. The extensive experimental results confirm the effectiveness of this attack, distinguishing real from decoy password vaults with accuracy from 99.13% to 100.00%. In response, we design a novel, collaborative approach to train DTE, called federated DTE model, and construct a secure honey password vault. This strategy markedly bolsters security, reducing the differential attack's distinguishing accuracy to approximately 52.41%, nearing the ideal threshold of 50.00%. Our findings emphasize the need for collaborative strategies to maintain password security to combat advanced cyber threats. ...

Peekaboo, I See Your Queries

Passive Attacks Against DSSE Via Intermittent Observations

Conference paper (2025) - Hao Nie, Wei Wang, Peng Xu, Wei Chen, Laurence T. Yang, Mauro Conti, Kaitai Liang

Dynamic Searchable Symmetric Encryption (DSSE) allows secure searches over a dynamic encrypted database but suffers from inherent information leakage. Existing passive attacks against DSSE rely on persistent leakage monitoring to infer leakage patterns, whereas this work targets intermittent observation, a more practical threat model. We propose Peekaboo, a new universal attack framework whose core design relies on inferring the search pattern and further combining it with auxiliary knowledge and other leakage. We instantiate Peekaboo over the SOTA attacks, Sap (USENIX'21) and Jigsaw (USENIX'24), to derive their “+” variants (Sap+ and Jigsaw+). Extensive experiments demonstrate that our design achieves >0.9 adjusted Rand index for search pattern recovery and ~90% query accuracy vs. FMA's ~30% (CCS'23). Peekaboo's accuracy scales with observation rounds and the number of observed queries, and it also resists SOTA countermeasures, with >40% accuracy against file size padding and >80% against obfuscation. ...

Multi-year observations of near-bed hydrodynamics and suspended sediment at the core of the estuarine turbidity maximum of the Changjiang Estuary

Journal article (2025) - Zaiyang Zhou, Jianzhong Ge, Dirk Sebastiaan Van Maren, Hualong Luan, Wenyun Guo, Jianfei Ma, Yingjia Tao, Peng Xu, Yu Kuai, More Authors...

A comprehensive multi-year field campaign, the North Passage Channel Measurements (NP-ChaM), was designed and executed to enhance our understanding of the hydrodynamics and sediment dynamics in the North Passage, the primary navigation channel of the Changjiang Estuary, China. The NP-ChaM campaign comprised eight observational sites and spanned 50 d, distributed over 4 years, including two dry seasons and two wet seasons. A series of tripod systems, equipped with multiple instruments, were deployed on the seabed to monitor near-bed physical processes reliably.

The resulting dataset comprises the following: (i) fluid motions, encompassing pressure, flow velocity and direction (at the bottom and throughout the entire water column), and wave patterns; (ii) near-bed environmental conditions, including temperature, salinity, and turbidity (at the bottom and across a near-bed 1-meter range); (iii) supplementary meteorological data sourced from credible providers; and (iv) preliminary results from post-processing, showcasing the practical application of the data, such as lateral flows and turbulent kinetic energy characterizations.

This dataset is especially valuable due to its extensive temporal and spatial coverage, as well as the high concentrations characterizing many of the observations (from several grams per liter to tens of grams per liter). Conducted annually from 2015 to 2018, the NP-ChaM campaign facilitated detailed observations of seasonal variations in environmental conditions and associated physical processes. The eight observational sites, positioned on either side of the deep channel, enable quantifications of channel–shoal exchanges, along-channel flow dynamics, and saltwater intrusion. This dataset is suitable for advancing our understanding of along-channel and cross-channel dynamics in a channel–shoal system and for calibrating numerical models. The dataset has undergone rigorous quality control to ensure reliability and accuracy. ...

Query Recovery from Easy to Hard

Jigsaw Attack against SSE

Conference paper (2024) - Hao Nie, Wei Wang, Peng Xu, Xianglong Zhang, Laurence T. Yang, Kaitai Liang

Searchable symmetric encryption schemes often unintentionally disclose certain sensitive information, such as access, volume, and search patterns. Attackers can exploit such leakages and other available knowledge related to the user's database to recover queries. We find that the effectiveness of query recovery attacks depends on the volume/frequency distribution of keywords. Queries containing keywords with high volumes/frequencies are more susceptible to recovery, even when countermeasures are implemented. Attackers can also effectively leverage these “special” queries to recover all others. By exploiting the above finding, we propose a Jigsaw attack that begins by accurately identifying and recovering those distinctive queries. Leveraging the volume, frequency, and co-occurrence information, our attack achieves 90% accuracy in three tested datasets, which is comparable to previous attacks (Oya et al., USENIX'22 and Damie et al., USENIX'21). With the same runtime, our attack demonstrates an advantage over the attack proposed by Oya et al. (approximately 15% more accuracy when the keyword universe size is 15k). Furthermore, our proposed attack outperforms existing attacks against widely studied countermeasures, achieving roughly 60% and 85% accuracy against the padding and the obfuscation, respectively. In this context, with a large keyword universe (≥3k), it surpasses current state-of-the-art attacks by more than 20%. ...

d-DSE

Distinct Dynamic Searchable Encryption Resisting Volume Leakage in Encrypted Databases

Conference paper (2024) - Dongli Liu, Wei Wang, Peng Xu, Laurence T. Yang, Bo Luo, Kaitai Liang

Dynamic Searchable Encryption (DSE) has emerged as a solution to efficiently handle and protect large-scale data storage in encrypted databases (EDBs). Volume leakage poses a significant threat, as it enables adversaries to reconstruct search queries and potentially compromise the security and privacy of data. Padding strategies are common countermeasures for the leakage, but they significantly increase storage and communication costs. In this work, we develop a new perspective on handling volume leakage. We start with distinct search and further explore a new concept called distinct DSE (d-DSE). We also define new security notions, in particular Distinct with Volume-Hiding security, as well as forward and backward privacy, for the new concept. Based on d-DSE, we construct the d-DSE designed EDB with related constructions for distinct keyword (d-KW-dDSE), keyword (KW-dDSE), and join queries (JOIN-dDSE) and update queries in encrypted databases. We instantiate a concrete scheme BF-SRE, employing Symmetric Revocable Encryption. We conduct extensive experiments on real-world datasets, such as Crime, Wikipedia, and Enron, for performance evaluation. The results demonstrate that our scheme is practical for data search, with computational performance comparable to the SOTA DSE schemes (MITRA*, AURA) and padding strategies (SEAL, ShieldDB). Furthermore, our proposal sharply reduces the communication cost as compared to padding strategies, with a roughly 6.36× to 53.14× advantage for search queries. ...

The Power of Bamboo

On the Post-Compromise Security for Searchable Symmetric Encryption

Conference paper (2023) - Tianyang Chen, Peng Xu, Stjepan Picek, Bo Luo, Willy Susilo, Hai Jin, Kaitai Liang

Dynamic searchable symmetric encryption (DSSE) enables users to delegate the keyword search over dynamically updated encrypted databases to an honest-but-curious server without losing keyword privacy. This paper studies a new and practical security risk to DSSE, namely, secret key compromise (e.g., a user’s secret key is leaked or stolen), which threatens all the security guarantees offered by existing DSSE schemes. To address this open problem, we introduce the notion of searchable encryption with key-update (SEKU) that provides users with the option of non-interactive key updates. We further define the notion of post-compromise secure with respect to leakage functions to study whether DSSE schemes can still provide data security after the client’s secret key is compromised. We demonstrate that post-compromise security is achievable with a proposed protocol called “Bamboo”. Interestingly, the leakage functions of Bamboo satisfy the requirements for both forward and backward security. We conduct a performance evaluation of Bamboo using a real-world dataset and compare its runtime efficiency with the existing forward-and-backward secure DSSE schemes. The result shows that Bamboo provides strong security with better or comparable performance. ...

Groundwater Vulnerability in a Megacity Under Climate and Economic Changes

A Coupled Sociohydrological Analysis

Journal article (2023) - Bin Li, Yi Zheng, Giuliano Di Baldassarre, Peng Xu, Saket Pande, Murugesu Sivapalan

Groundwater depletion has become increasingly challenging, and many cities worldwide have adopted drastic policies to relieve water stress due to socioeconomic growth. Located on the declining aquifer of the North China Plain, Beijing, for example, has developed plans to limit the size of the city’s population. However, the effect of population displacement under uncertain macroeconomic and climate change remains ambiguous. We adopt a sociohydrological model, with explicit consideration of the dynamics of human-water interactions, to explore the groundwater vulnerability of Beijing. We investigate how human response might shape the development trajectories of the groundwater-population-economy system under different macroscale economic and climate scenarios. Furthermore, we use a machine learning algorithm to identify the decisive factors to be considered for reducing groundwater vulnerability. Our results show that while rapid external economic development or larger annual average precipitation would enable recovery of the groundwater table in the short term, they may slacken human water shortage awareness and result in more acute groundwater depletion in the long run. Strengthening policymaker perceptions of groundwater depletion would prompt timely response policies for controlling population size. Improving the quantity and quality of labor force input to economic development would avoid downturns in the economy due to labor shortages. The outcomes of this study suggest that these strategies would effectively reduce groundwater vulnerability in the long run without causing severe socioeconomic recession. These findings highlight the importance of endogenizing human behavioral dynamics in sustainable urban water management. ...

Keyword Search Shareable Encryption for Fast and Secure Data Replication

Journal article (2023) - Wei Wang, Dongli Liu, Peng Xu, Laurence Tianruo Yang, Kaitai Liang

It has become a trend for clients to outsource their encrypted databases to remote servers and then leverage the Searchable Encryption technique to perform secure data retrieval. However, this approach has yet to address a crucial need: replication of searchable encrypted data. This need poses challenges for Dynamic Searchable Symmetric Encryption (DSSE), since clients must share the search capability of the encrypted data replicas and guarantee forward and backward privacy. We define a new notion called 'Keyword Search Shareable Encryption' (KS2E) and the corresponding security model capturing forward and backward privacy. In our notion, data owners are allowed to share search indexes of the encrypted data with users. A search index will be updated with a new search key before sharing to guarantee the data privacy of the source database. The target database also inherits data search efficiency along with the shared data. We further construct an instance of KS2E called Branch, prove its security, and use real-world datasets to evaluate Branch. The evaluation results show that Branch's performance is comparable to classical DSSE schemes on search efficiency and demonstrate the effectiveness of searching encrypted data replicas from multiple owners. ...

High Recovery with Fewer Injections

Practical Binary Volumetric Injection Attacks against Dynamic Searchable Encryption

Conference paper (2023) - Xianglong Zhang, Wei Wang, Peng Xu, Laurence T. Yang, Kaitai Liang

Searchable symmetric encryption enables private queries over an encrypted database, but it can also result in information leakages. Adversaries can exploit these leakages to launch injection attacks (Zhang et al., USENIX Security’16) to recover the underlying keywords from queries. The performance of the existing injection attacks is strongly dependent on the amount of leaked information or injection. In this work, we propose two new injection attacks, namely BVA and BVMA, by leveraging a binary volumetric approach. We enable adversaries to inject fewer files than the existing volumetric attacks by using the known keywords and reveal the queries by observing the volume of the query results. Our attacks can thwart well-studied defenses (e.g., threshold countermeasure, padding) without exploiting the distribution of target queries and client databases. We evaluate the proposed attacks empirically in real-world datasets with practical queries. The results show that our attacks can obtain a high recovery rate (> 80%) in the best-case scenario and a roughly 60% recovery even under a large-scale dataset with a small number of injections (< 20 files). ...
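
To make the idea of recovering a query from the volume of its results more concrete, the Python sketch below shows a heavily simplified binary volumetric encoding. It is not the paper's BVA or BVMA algorithm. It assumes a hypothetical setting in which the adversary knows the keyword universe, can link the same query token before and after injection, and sees exact, unpadded response volumes. All names and the power-of-two file sizes are illustrative choices.

```python
def build_injections(keywords):
    """Plan one injected file per bit of the keyword index.

    File i has size 2**i and contains every keyword whose index has bit i set,
    so about log2(len(keywords)) files cover the whole keyword universe.
    """
    n_files = max(1, (len(keywords) - 1).bit_length())
    files = []
    for i in range(n_files):
        members = {kw for idx, kw in enumerate(keywords) if (idx >> i) & 1}
        files.append({"size": 1 << i, "keywords": members})
    return files


def observed_volume(query_kw, baseline, files):
    """Response volume after injection: original volume plus matching injected files."""
    injected = sum(f["size"] for f in files if query_kw in f["keywords"])
    return baseline[query_kw] + injected


def recover(volume_before, volume_after, keywords):
    """The volume increase is the sum of 2**i over set bits, i.e. the keyword index."""
    return keywords[volume_after - volume_before]
```

In this toy setting five keywords need only three injected files, and the volume difference for a query directly spells out the binary index of its keyword. Real defenses such as padding break the exact-volume assumption, which is where the paper's actual constructions go further than this sketch.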

ROSE

Robust Searchable Encryption with Forward and Backward Security

Journal article (2022) - Peng Xu, Willy Susilo, Wei Wang, Tianyang Chen, Qianhong Wu, Kaitai Liang, Hai Jin

Dynamic searchable symmetric encryption (DSSE) has been widely recognized as a promising technique to delegate update and search queries over an outsourced database to an untrusted server while guaranteeing the privacy of data. Many efforts on DSSE have been devoted to obtaining a good tradeoff between security and performance. However, it appears that all existing DSSE works overlook what happens if the DSSE client carelessly issues irrational update queries, such as duplicate update queries and delete queries to remove non-existent entries (cases that many popular database systems already handle in the plaintext setting). In this scenario, we find that (1) most prior works lose their claimed correctness or security, and (2) no single approach can achieve correctness, forward and backward security, and practical performance at the same time. To address this problem, we study for the first time the notion of robustness of DSSE. Generally, we say that a DSSE scheme is robust if it can keep the same correctness and security even in the case of misoperations. Then, we introduce a new cryptographic primitive named key-updatable pseudo-random function and apply this primitive to constructing ROSE, a robust DSSE scheme with forward and backward security. Finally, we demonstrate the efficiency of ROSE and give the experimental comparisons. ...

DEKS

A Secure Cloud-Based Searchable Service Can Make Attackers Pay

Conference paper (2022) - Yubo Zheng, Peng Xu, Wei Wang, Tianyang Chen, Willy Susilo, Kaitai Liang, Hai Jin

Many practical secure systems have been designed to prevent real-world attacks via maximizing the attacking cost so as to reduce attack intentions. Inspired by this philosophy, we propose a new concept named delay encryption with keyword search (DEKS) to resist the notorious keyword guessing attack (KGA), in the context of secure cloud-based searchable services. Avoiding the use of complex (and unreasonable) assumptions, as compared to existing works, DEKS optionally leverages a catalyst that enables one (e.g., a valid data user) to easily execute encryption; without the catalyst, any unauthenticated system insiders and outsiders must spend substantial time on encryption. By this, DEKS can overwhelm a KGA attacker in the encryption stage before it obtains any advantage. We leverage the repeated squaring function, which is the core building block of our design, to construct the first DEKS instance. The experimental results show that DEKS is practical in thwarting KGA for both small and large-scale datasets. For example, on the Wikipedia dataset, a KGA attacker takes 7.23 years on average to break DEKS when the delay parameter T = 2²⁴. The parameter T can be flexibly adjusted based on practical needs, and theoretically, its upper bound is infinite. ...
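
The repeated squaring function mentioned above can be illustrated with the classic time-lock construction: computing x^(2^T) mod n takes T sequential squarings, unless the factorization of n is known. The Python sketch below shows that general technique under stated assumptions. The function names, the tiny primes, and the use of the factorization as the shortcut are illustrative, and whether the DEKS catalyst works this way is not stated in the abstract.

```python
def slow_eval(x, T, n):
    """Compute x**(2**T) mod n with T sequential modular squarings."""
    y = x % n
    for _ in range(T):
        y = (y * y) % n
    return y


def fast_eval(x, T, p, q):
    """Shortcut for a party that knows n = p * q (requires gcd(x, n) == 1).

    By Euler's theorem the exponent 2**T can be reduced modulo phi(n),
    so the cost no longer grows linearly with the delay parameter T.
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    exponent = pow(2, T, phi)
    return pow(x, exponent, n)
```

Both functions return the same value, but only the first one is forced to pay for every squaring. Raising T makes the slow path proportionally more expensive while leaving the shortcut cheap, which is the asymmetry a delay-based defense relies on.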

Falcon

Malware Detection and Categorization with Network Traffic Images

Conference paper (2021) - Peng Xu, Claudia Eckert, Apostolis Zarras

Android is the most popular smartphone operating system. At the same time, miscreants have already created malicious apps to find new victims and infect them. Unfortunately, existing anti-malware procedures have become obsolete, and thus novel Android malware detection techniques are in high demand. In this paper, we present Falcon, an Android malware detection and categorization framework. More specifically, we treat the network traffic classification task as a 2D image sequence classification and handle each network packet as a 2D image. Furthermore, we use a bidirectional LSTM network to process the converted 2D images to obtain the network vectors. We then utilize those converted vectors to detect and categorize the malware. Our results reveal that Falcon could be an accurate and viable solution as we get 97.16% accuracy on average for the malware detection and 88.32% accuracy for the malware categorization. ...
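
As a small illustration of handling a network packet as a 2D image, the Python sketch below maps the raw bytes of one packet onto a square grid of pixel intensities. The 28-by-28 default size, the zero padding, the truncation of long packets, and the function name are assumptions made for this example. The abstract does not specify Falcon's actual preprocessing.

```python
def packet_to_image(payload, side=28):
    """Turn raw packet bytes into a side x side grid of values in 0..255.

    Packets longer than side * side bytes are truncated and shorter ones are
    zero-padded, so every packet yields an image of the same shape.
    """
    size = side * side
    padded = payload[:size].ljust(size, b"\x00")
    return [list(padded[row * side:(row + 1) * side]) for row in range(side)]
```

A flow then becomes a sequence of such equally sized grids, which is the kind of input a sequence model such as a bidirectional LSTM can consume after each grid is flattened or embedded.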

Hybroid

Toward Android Malware Detection and Categorization with Program Code and Network Traffic

Conference paper (2021) - Mohammad Reza Norouzian, Peng Xu, Claudia Eckert, Apostolis Zarras

Android malicious applications have become so sophisticated that they can bypass endpoint protection measures. Therefore, it is safe to admit that traditional anti-malware techniques have become cumbersome, thereby raising the need to develop efficient ways to detect Android malware. In this paper, we present Hybroid, a hybrid Android malware detection and categorization solution that utilizes program code structures as static behavioral features and network traffic as dynamic behavioral features for detection (binary classification) and categorization (multi-label classification). For static analysis, we introduce a natural-language-processing-inspired technique based on function call graph embeddings and design a graph-neural-network-based approach to convert the whole graph structure of an Android app to a vector. For dynamic analysis, we extract network flow features from the raw network traffic by capturing each application’s network flow. Finally, Hybroid utilizes the network flow features combined with the graphs’ vectors to detect and categorize the malware. Our solution demonstrates 97.0% accuracy on average for malware detection and 94.0% accuracy for malware categorization. Also, we report remarkable results in different performance metrics such as F1-score, precision, recall, and AUC. ...

Detecting and categorizing Android malware with graph neural networks

Conference paper (2021) - Peng Xu, Claudia Eckert, Apostolis Zarras

Android is the most dominant operating system in the mobile ecosystem. As expected, this trend did not go unnoticed by miscreants, and quickly enough, it became their favorite platform for discovering new victims through malicious apps. These apps have become so sophisticated that they can bypass anti-malware measures implemented to protect the users. Therefore, it is safe to admit that traditional anti-malware techniques have become cumbersome, sparking the urge to come up with an efficient way to detect Android malware. In this paper, we present a novel Natural Language Processing (NLP) inspired Android malware detection and categorization technique based on Function Call Graph Embedding. We design a graph neural network (graph embedding) based approach to convert the whole graph structure of an Android app to a vector. We then utilize the graphs' vectors to detect and categorize the malware families. Our results reveal that graph embedding yields better results as we get 99.6% accuracy on average for the malware detection and 98.7% accuracy for the malware categorization. ...

HawkEye

Cross-Platform Malware Detection with Representation Learning on Graphs

Conference paper (2021) - Peng Xu, Youyi Zhang, Claudia Eckert, Apostolis Zarras

Malicious software, widely known as malware, is one of the biggest threats to our interconnected society. Cybercriminals can utilize malware to carry out their nefarious tasks. To address this issue, analysts have developed systems that can prevent malware from successfully infecting a machine. Unfortunately, these systems come with two significant limitations. First, they frequently target one specific platform/architecture, and thus, they cannot be ubiquitous. Second, code obfuscation techniques used by malware authors can negatively influence their performance. In this paper, we design and implement HawkEye, a control-flow-graph-based cross-platform malware detection system, to tackle the problems mentioned above. In more detail, HawkEye utilizes a graph neural network to convert the control flow graphs of executables to vectors with the trainable instruction embedding and then uses a machine-learning-based classifier to create a malware detection system. We evaluate HawkEye by testing real samples on different platforms and operating systems, including Linux (x86, x64, and ARM-32), Windows (x86 and x64), and Android. The results outperform most of the existing works with an accuracy of 96.82% on Linux, 93.39% on Windows, and 99.6% on Android. To the best of our knowledge, HawkEye is the first approach to consider graph neural networks in the malware detection field, utilizing natural language processing. ...