K. Liang
129 records found
Athena
Accelerating KeySwitch and Bootstrapping for Fully Homomorphic Encryption on CUDA GPU
MeetSafe
Enhancing robustness against white-box adversarial examples
Convolutional neural networks (CNNs) are vulnerable to adversarial attacks in computer vision tasks. Current adversarial detections are ineffective against white-box attacks and inefficient when deep CNNs generate high-dimensional hidden features. This study proposes MeetSafe, an effective and scalable adversarial example (AE) detection method against white-box attacks. MeetSafe identifies AEs using critical hidden features rather than the entire feature space. We observe a non-uniform distribution of Z-scores between clean samples and AEs among hidden features and propose two utility functions to select those most relevant to AEs. We process critical hidden features using feature engineering methods: local outlier factor (LOF), feature squeezing, and whitening, which estimate feature density relative to its k-neighbors, reduce redundancy, and normalize features. To deal with the curse of dimensionality and smooth statistical fluctuations in high-dimensional features, we propose local reachability density (LRD). Our LRD iteratively selects a bag of engineered features with random cardinality and quantifies their average density by its k-nearest neighbors. Finally, MeetSafe constructs a Gaussian Mixture Model (GMM) with the processed features and flags an input as an AE if it is seen as a local outlier, shown by a low density from the GMM. Experimental results show that MeetSafe achieves detection accuracy of 74%, 96%, and 79% against adaptive, classic, and white-box attacks, respectively, and is at least 2.3× faster than comparison methods.
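The density idea in the abstract above can be illustrated with the textbook definition of local reachability density from the LOF literature. This is only a minimal sketch under that assumption: it is not MeetSafe's bagged LRD variant, and the function name `local_reachability_density` and the toy data are hypothetical.

```python
import numpy as np

def local_reachability_density(X, k):
    """Textbook LRD of every row of X with respect to its k nearest neighbours."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances; the diagonal is set to inf so that
    # a point is never counted as its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    # Indices of the k nearest neighbours of each point.
    nn = np.argsort(dist, axis=1)[:, :k]
    # k-distance of each point: distance to its k-th nearest neighbour.
    k_dist = dist[np.arange(n), nn[:, -1]]
    # Reachability distance from point i to neighbour j: max(k_dist[j], d(i, j)).
    reach = np.maximum(k_dist[nn], dist[np.arange(n)[:, None], nn])
    # LRD is the inverse of the mean reachability distance.
    return 1.0 / reach.mean(axis=1)

# Toy data (hypothetical): a tight cluster plus one far-away point.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [10, 10]]
lrd = local_reachability_density(X, k=2)
```

A point whose LRD is much lower than that of its neighbours sits in a sparse region, which is the sense in which a low density marks a local outlier.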
MUDGUARD
Taming Malicious Majorities in Federated Learning using Privacy-preserving Byzantine-robust Clustering
Federated Learning (FL) is susceptible to model poisoning attacks, which compromise the availability of the collaboratively trained model by introducing detrimental local updates during the training process. The predominant line of defense against such attacks has been to impose stringent restrictions on clients' model updates. However, this strategy raises new vulnerabilities where the global model can be infiltrated by meticulously crafted malicious perturbations. This vulnerability arises due to the model's inherent sensitivity to perturbations, making it exposed and fragile. In response, this work investigates a novel defensive paradigm centered on model stability, specifically a model's resilience against perturbations within its parameter space. As a solution, we introduce a new method named Model Stability Defense for Federated Learning (MSDFL), designed to fortify the defense of FL systems against model poisoning attacks. MSDFL utilizes a minmax optimization framework, which is fundamentally linked to empirical risk for exploring the effects of model perturbations. The core aim of our approach is to minimize the norm of the model-output Jacobian matrix without compromising predictive performance, thereby establishing defense through enhanced model stability. Moreover, we propose a refined version of MSDFL, named Holistic Model Stability Defense for Federated Learning (HMSDFL), which considers model stability across all output dimensions of the logits to effectively eradicate the disparity in model convergence speed induced by MSDFL. Extensive experimental results fully demonstrate the fidelity, robustness, compatibility, and self-protection of our methods.
A t-out-of-n threshold ring signature allows t parties to jointly sign a message on behalf of n parties without revealing the identities of the signers. In this paper, we introduce a new generic construction for threshold ring signatures, called GC-TRS, which can be built on top of a selection of identification schemes, commitment schemes, and a new primitive called a t-out-of-n proof protocol, which is a special type of zero-knowledge proof. In general, our design enables a group of t signers to first generate an aggregated signature by interacting with each other; then they are able to compute a t-out-of-n proof to convince the verifier that the aggregated signature is indeed produced by t individuals among a particular set. The signature is succinct, as it contains only one aggregated signature and one proof in the final signature. We define all the properties required for the building blocks to capture the security of the GC-TRS and provide a detailed security proof. Furthermore, we propose two lattice-based instantiations for the GC-TRS, named LTRS and CTRS, respectively. Notably, the CTRS scheme is the first scheme that has a logarithmic signature size relative to the ring size. Additionally, during the instantiation process, we construct two t-out-of-n proof protocols, which may be of independent interest.
PrivBox
Privacy-Preserving Deep Packet Inspection with Dual Double-masking Obfuscated Rule Generation
Many network middleboxes have been deployed to perform deep packet inspection (DPI) over packet payloads. However, such middleboxes cannot accomplish their tasks when the traffic is encrypted. BlindBox (SIGCOMM 2015) provided the first solution for performing DPI over encrypted traffic. To improve its efficiency, a later proposal PrivDPI (CCS 2019) introduced a practical technique to generate encrypted rules. However, a recent proposal P2DPI (ASIACCS 2021) showed that the rule generator in PrivDPI can compromise the user's privacy. In this article, we present a new attack on P2DPI and show that the privacy of its endpoints can still be compromised by the rule generator. We comprehensively analyze the vulnerability of prior studies and present PrivBox, a new DPI system that achieves the same privacy guarantee as BlindBox while maintaining practical efficiency. This is based on a new technique called dual double-masking obfuscated rule generation. For a ruleset of 3,000, PrivBox achieves connection establishment time on the endpoint side comparable to PrivDPI and supports up to 4,672 token encryptions per second, which is sufficient for a number of real-world applications. Overall, our experiment demonstrates that PrivBox is practical and well-suited for short, frequently established sessions, especially when token repeating is common.
Power of union
Federated honey password vaults against differential attack
The honey password vault is a promising method for managing user passwords and mitigating password-guessing attacks by creating plausible-looking decoy password vaults. Recently, various methods, such as Chatterjee-PCFG (IEEE S&P’15), Golla-Markov (ACM CCS’16), and Cheng-IUV (USENIX Security’21), have been proposed to construct the cornerstone of honey password vaults, known as the distribution transforming encoder (DTE). These innovations significantly enhance the security and functionality of each kind of DTE. However, our findings indicate that when users employ multiple honey password vaults with distinct DTEs to manage their passwords, a passive attacker can easily compromise user passwords by exploiting differences among those DTEs. Consequently, we propose the differential attack targeting existing honey password vaults. The extensive experimental results confirm the effectiveness of this attack, distinguishing real from decoy password vaults with accuracy from 99.13% to 100.00%. In response, we design a novel, collaborative approach to train the DTE, called the federated DTE model, and construct a secure honey password vault. This strategy markedly bolsters security, reducing the differential attack's distinguishing accuracy to approximately 52.41%, nearing the ideal threshold of 50.00%. Our findings emphasize the need for collaborative strategies to maintain password security and combat advanced cyber threats.
LogDLR
Unsupervised Cross-System Log Anomaly Detection Through Domain-Invariant Latent Representation
Log anomaly detection aims to discover abnormal events from massive log data to ensure the security and reliability of software systems. However, due to the heterogeneity of log formats and syntaxes across different systems, existing log anomaly detection methods often need to be designed and trained for specific systems, lacking generalization ability. To address this challenge, we propose LogDLR, a novel unsupervised cross-system log anomaly detection method. The core idea of LogDLR is to use universal sentence embeddings and a Transformer-based autoencoder to extract domain-invariant latent representations from log entries, which can effectively adapt to log format changes and capture semantic information and dependencies in log sequences. To obtain domain-invariant latent representations, we adopt a domain-adversarial training strategy, introducing a domain discriminator that competes with the Transformer-based encoder through a gradient reversal layer, forcing the encoder to learn shared knowledge between different system logs. Finally, the Transformer-based decoder detects anomalies based on the domain-invariant representations obtained by the encoder. We evaluate LogDLR in simulated cross-system scenarios using three publicly available log datasets. The experimental results show that LogDLR can handle heterogeneous logs effectively in cross-system scenarios and achieve efficient and accurate anomaly detection on both source and target systems.
Peekaboo, I See Your Queries
Passive Attacks Against DSSE Via Intermittent Observations
Inject Less, Recover More
Unlocking the Potential of Document Recovery in Injection Attacks Against SSE
Searchable symmetric encryption has been vulnerable to inference attacks that rely on uniqueness in leakage patterns. However, many keywords in datasets lack distinctive leakage patterns, limiting the effectiveness of such attacks. The file injection attacks, initially proposed by Cash et al. (CCS 2015), have shown impressive performance with 100% accuracy and no prior knowledge requirement. Nevertheless, this attack fails to recover queries with underlying keywords not present in the injected files. To address these limitations, our research introduces a novel attack strategy called LEAP-Hierarchical Fusion Attack (LHFA) that combines the strengths of both file injection attacks and inference attacks. Before initiating keyword injection, we introduce a new approach for inert/active keyword selection. In the phase of selecting injected keywords, we focus on keywords without unique leakage patterns and recover them, leveraging their presence for document recovery. Our goal is to achieve an amplified effect in query recovery. We demonstrate a minimum query recovery rate of 1.3 queries per injected keyword with a 10% data leakage of a real-life dataset, and initiate further research to overcome challenges associated with non-distinctive keywords.
Query Recovery from Easy to Hard
Jigsaw Attack against SSE
Searchable symmetric encryption schemes often unintentionally disclose certain sensitive information, such as access, volume, and search patterns. Attackers can exploit such leakages and other available knowledge related to the user's database to recover queries. We find that the effectiveness of query recovery attacks depends on the volume/frequency distribution of keywords. Queries containing keywords with high volumes/frequencies are more susceptible to recovery, even when countermeasures are implemented. Attackers can also effectively leverage these “special” queries to recover all others. By exploiting the above finding, we propose a Jigsaw attack that begins by accurately identifying and recovering those distinctive queries. Leveraging the volume, frequency, and co-occurrence information, our attack achieves 90% accuracy in three tested datasets, which is comparable to previous attacks (Oya et al., USENIX '22 and Damie et al., USENIX '21). With the same runtime, our attack demonstrates an advantage over the attack proposed by Oya et al. (approximately 15% more accuracy when the keyword universe size is 15k). Furthermore, our proposed attack outperforms existing attacks against widely studied countermeasures, achieving roughly 60% and 85% accuracy against padding and obfuscation, respectively. In this context, with a large keyword universe (≥3k), it surpasses current state-of-the-art attacks by more than 20%.
d-DSE
Distinct Dynamic Searchable Encryption Resisting Volume Leakage in Encrypted Databases
Dynamic Searchable Encryption (DSE) has emerged as a solution to efficiently handle and protect large-scale data storage in encrypted databases (EDBs). Volume leakage poses a significant threat, as it enables adversaries to reconstruct search queries and potentially compromise the security and privacy of data. Padding strategies are common countermeasures for the leakage, but they significantly increase storage and communication costs. In this work, we develop a new perspective on handling volume leakage. We start with distinct search and further explore a new concept called distinct DSE (d-DSE). We also define new security notions, in particular Distinct with Volume-Hiding security, as well as forward and backward privacy, for the new concept. Based on d-DSE, we construct the d-DSE designed EDB with related constructions for distinct keyword (d-KW-dDSE), keyword (KW-dDSE), and join queries (JOIN-dDSE) and update queries in encrypted databases. We instantiate a concrete scheme BF-SRE, employing Symmetric Revocable Encryption. We conduct extensive experiments on real-world datasets, such as Crime, Wikipedia, and Enron, for performance evaluation. The results demonstrate that our scheme is practical for data search, with computational performance comparable to the SOTA DSE schemes (MITRA*, AURA) and padding strategies (SEAL, ShieldDB). Furthermore, our proposal sharply reduces the communication cost compared to padding strategies, with a roughly 6.36x to 53.14x advantage for search queries.
This paper introduces the Biometrics Data Space framework, a secure ecosystem built on Data Spaces technology and designed to address the challenges of suspect identification during cross-border crime investigation. Apart from Data Spaces technology, the proposed framework also leverages Privacy Enhancing Technologies (PETs) and blockchain to enable secure, trustworthy, and sovereign data exchange between Law Enforcement Agencies (LEAs) across borders. Specifically, it utilizes advanced PETs, including Large-Scale Biometric Data Indexing based on deep hashing techniques and Homomorphic Encryption, to allow for suspect identification without disclosing sensitive personal biometric data. Thus, it enables LEAs to securely compare and exchange encrypted sensitive biometric data, including facial images, fingerprints and voiceprints, while maintaining data privacy and data sovereignty. LEAs define the usage rules for the biometric data they own, and these rules are enforced on and respected by the other LEAs participating in the Biometrics Data Space. The proposed architecture is designed to be scalable, allowing the incorporation of additional biometric modalities and easy expansion and integration with new participant LEAs.
FEVERLESS
Fast and Secure Vertical Federated Learning based on XGBoost for Decentralized Labels
Vertical Federated Learning (VFL) enables multiple clients to collaboratively train a global model over vertically partitioned data without leaking private local information. Tree-based models, like XGBoost and LightGBM, have been widely used in VFL to enhance the interpretation and efficiency of training. However, there is a fundamental lack of research on how to conduct VFL securely over distributed labels. This work is the first to fill this gap by designing a novel protocol, called FEVERLESS, based on XGBoost. FEVERLESS leverages secure aggregation via information masking technique and global differential privacy provided by a fairly and randomly selected noise leader to prevent private information from being leaked in the training process. Furthermore, it provides label and data privacy against honest-but-curious adversaries even in the case of collusion of $n - 2$ out of n clients. We present a comprehensive security and efficiency analysis for our design, and the empirical results from our experiments demonstrate that FEVERLESS is fast and secure. In particular, it outperforms the solution based on additive homomorphic encryption in runtime cost and provides better accuracy than the local differential privacy approach.
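Secure aggregation via masking, as mentioned in the abstract above, is commonly explained with pairwise additive masks that cancel in the sum. The sketch below shows that generic idea only; it is not the FEVERLESS protocol, and the modulus `MOD`, the seeded generator standing in for pairwise agreed secrets, and all function names are assumptions made for illustration.

```python
import random

MOD = 2**32  # hypothetical modulus for masked arithmetic

def pairwise_masks(n_clients, seed=0):
    """Return one mask per client; the masks sum to 0 modulo MOD."""
    rng = random.Random(seed)  # stand-in for secrets agreed between client pairs
    masks = [0] * n_clients
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            r = rng.randrange(MOD)
            masks[i] = (masks[i] + r) % MOD  # client i adds the shared value
            masks[j] = (masks[j] - r) % MOD  # client j subtracts the same value
    return masks

def masked_sum(values, seed=0):
    """Mask each client's value, then aggregate the masked uploads."""
    masks = pairwise_masks(len(values), seed)
    uploads = [(v + m) % MOD for v, m in zip(values, masks)]
    # The aggregator only sees uploads; the masks cancel in the total.
    return uploads, sum(uploads) % MOD

uploads, total = masked_sum([3, 5, 7, 11])
```

Each upload on its own looks random to the aggregator, while the total equals the plain sum of the inputs modulo `MOD`.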
MVOC
A Lighter Multi-Client Verifiable Outsourced Computation for Malicious Lightweight Clients
Gordon et al. systematically studied the Universally Composable (UC) security of Multi-client Verifiable Computation (MVC), in which a set of computationally-weak clients delegate the computation of a general function to an untrusted server based on their private inputs, and proposed a UC-secure scheme ensuring that the protocol remains secure even when arbitrarily composed with other UC-secure instances. However, this scheme imposed a significant computational overhead on clients due to the utilization of fully homomorphic encryption, and the plaintext size scaled linearly with function input size. In this work, we present MVOC, a more efficient UC-secure MVC protocol, that significantly reduces the amortized overhead for clients in both semi-honest and malicious settings, by delegating a larger portion of the computation to the server. We enable clients to verify the garbled circuit before entering the online phase, ensuring security against malicious clients without incurring heavy overhead of compiling a semi-honest protocol into a malicious one. We present the detailed proof and analyze the theoretical complexity of MVOC. Furthermore, we implement our protocol and evaluate the performance, and the results demonstrate that the computation and communication overheads during the input phase can be decreased by at least 95.55% and 87.17%, respectively.
Federated Learning (FL) is a beneficial decentralized learning approach for preserving the privacy of local datasets of distributed agents. However, the distributed property of FL and untrustworthy data introduce vulnerability to backdoor attacks. In this attack scenario, an adversary manipulates its local data with a specific trigger and trains a malicious local model to implant the backdoor. During inference, the global model would misclassify any input carrying the trigger as the attacker-chosen prediction. Most existing backdoor attacks against FL focus on bypassing defense mechanisms, without considering the inspection of model parameters on the server. These attacks are susceptible to detection through dynamic clustering based on model parameter similarity. Besides, current methods provide limited imperceptibility of their trigger in the spatial domain. To address these limitations, we propose a stealthy backdoor attack called "Chironex" against FL with an imperceptible trigger in frequency space to deliver attack effectiveness, stealthiness and robustness against various countermeasures on FL. We first design a frequency trigger function to generate an imperceptible frequency trigger to evade human inspection. Then we fully exploit the attacker's advantage to enhance attack robustness by estimating benign updates and analyzing the impact of the backdoor on model parameters through a task-sensitive neuron searcher. It disguises malicious updates as benign ones by reducing the impact of backdoor neurons that greatly contribute to the backdoor task based on activation value, and encouraging them to update towards benign model parameters trained by the attacker.
We conduct extensive experiments on various image classifiers with real-world datasets to provide empirical evidence that Chironex can evade the most recent robust FL aggregation algorithms, and further achieve a distinctly higher attack success rate than existing attacks, without undermining the utility of the global model.
MUDGUARD
Taming Malicious Majorities in Federated Learning using Privacy-preserving Byzantine-robust Clustering
Byzantine-robust Federated Learning (FL) aims to counter malicious clients and train an accurate global model while maintaining an extremely low attack success rate. Most existing systems, however, are only robust when most of the clients are honest. FLTrust (NDSS '21) and Zeno++ (ICML '20) do not make such an honest majority assumption but can only be applied to scenarios where the server is provided with an auxiliary dataset used to filter malicious updates. FLAME (USENIX '22) and EIFFeL (CCS '22) maintain the semi-honest majority assumption to guarantee robustness and the confidentiality of updates. It is therefore currently impossible to ensure Byzantine robustness and confidentiality of updates without assuming a semi-honest majority. To tackle this problem, we propose a novel Byzantine-robust and privacy-preserving FL system, called MUDGUARD, to capture malicious minority and majority for server and client sides, respectively. Our experimental results demonstrate that the accuracy of MUDGUARD is practically close to the FL baseline using FedAvg without attacks (approximately 0.8% gap on average). Meanwhile, the attack success rate is around 0%-5% even under an adaptive attack tailored to MUDGUARD. We further optimize our design by using binary secret sharing and polynomial transformation, leading to communication overhead and runtime decreases of 67%-89.17% and 66.05%-68.75%, respectively.
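For readers unfamiliar with binary secret sharing, which the abstract above names as one of its optimizations, the following is a generic XOR-sharing sketch. It makes no claim about how MUDGUARD uses the technique; the function names and the 32-bit share length are hypothetical.

```python
import secrets

def share_bits(value, n_shares, bit_len=32):
    """Split value into n_shares XOR shares of bit_len bits each."""
    shares = [secrets.randbits(bit_len) for _ in range(n_shares - 1)]
    last = value
    for s in shares:
        last ^= s  # the final share makes the XOR of all shares equal value
    return shares + [last]

def reconstruct(shares):
    """XOR all shares together to recover the secret."""
    out = 0
    for s in shares:
        out ^= s
    return out

def xor_shared(a_shares, b_shares):
    """XOR of two shared values is computed locally, share by share."""
    return [x ^ y for x, y in zip(a_shares, b_shares)]

a, b = 0b1011, 0b0110
c_shares = xor_shared(share_bits(a, 3), share_bits(b, 3))
```

Any proper subset of the shares is uniformly random, and XOR gates need no communication between share holders, which is one reason bitwise sharing can be cheap for Boolean-style computations.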
HPAKE
Honey Password-authenticated Key Exchange for Fast and Safer Online Authentication
Password-only authentication is one of the most popular security mechanisms for real-world online applications. But it easily suffers from a practical threat: password leakage, caused by external and internal attackers. The external attacker may compromise the password file stored on the authentication server, and the insider may deliberately steal the passwords or inadvertently leak them. So far, there are two main techniques to address the leakage: augmented password-authenticated key exchange (aPAKE) against insiders and the honeyword technique against external attackers. But neither of them can resist both attacks. To fill the gap, we propose the notion of honey PAKE (HPAKE) that allows the authentication server to detect the password leakage and achieve security beyond the traditional bound of aPAKE. Further, we build an HPAKE construction on top of the honeyword mechanism, honey encryption, and OPAQUE, which is a standardized aPAKE. We formally analyze the security of our design, achieving insider resistance and password breach detection. We implement our design and deploy it in a real environment. The experimental results show that our protocol only costs 71.27 ms for one complete run, with 20.67 ms on computation and 50.6 ms on communication. This means our design is secure and practical for real-world applications.