Y. Qiao
While deep learning offers substantial benefits, it faces security challenges stemming from potentially unreliable models and untrustworthy training data. Such vulnerabilities can compromise model functionality through maliciously perturbed inputs or by introducing model Trojans, where adversaries embed triggers in input data to activate harmful behaviors.
Despite significant research on adversarial and backdoor attacks, along with their countermeasures in various deep learning systems, there remains a critical demand for innovative technical solutions to mitigate persistent vulnerabilities and bolster the security and robustness of these systems.
This thesis addresses three key security challenges: (1) low attack robustness against common image transformations and anomalous frequency perturbations in backdoor triggers under centralized learning; (2) anomalous backdoor features and parameters introduced by current attack methods under decentralized learning; and (3) the significant drop in both clean and robust accuracy caused by global image restoration using diffusion models in adversarial purification.
To examine the vulnerabilities of centralized deep learning systems, Chapter 2 focuses on backdoor attacks against CNNs and Transformers from the perspective of a malicious data provider. The thesis uses an evolutionary algorithm to optimize the frequency properties of the designed trigger, maximizing attack effectiveness, robustness against image transformation operations, and stealthiness in dual space under the black-box setting.
To investigate security issues in decentralized scenarios, Chapters 3 and 4 focus on backdoor attacks against federated learning from the perspective of a malicious client. In Chapter 3, we propose a backdoor attack that disguises the adversary's malicious updates as benign at the parameter level through a backdoor neuron constraint and model camouflage. In Chapter 4, we use generative adversarial networks to produce stealthy and flexible triggers that minimize the representation distance between poisoned and benign samples.
To enhance the security of deep learning from a data perspective, the thesis focuses on adversarial purification to improve model robustness against adversarial attacks. In Chapter 5, we identify perturbed image regions through multi-scale superpixel segmentation and occlusion analysis, then use diffusion models to inpaint those regions while maintaining visual consistency.
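The occlusion analysis mentioned for Chapter 5 can be illustrated with a minimal sketch. This is not the thesis's method: it uses a fixed grid of square patches as a stand-in for multi-scale superpixels, and `occlusion_scores`, `score_fn`, and `toy_score` are hypothetical names introduced only for this example. The idea shown is simply that regions whose occlusion changes a model score the most are candidates for being perturbed.

```python
def occlusion_scores(image, score_fn, patch=2, fill=0.0):
    """Return the score drop caused by occluding each patch x patch region.

    image    : 2D list of floats (a grayscale image, assumed for simplicity)
    score_fn : hypothetical callable mapping an image to a scalar score
    """
    base = score_fn(image)
    height, width = len(image), len(image[0])
    drops = {}
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            occluded = [row[:] for row in image]  # copy before masking
            for y in range(top, min(top + patch, height)):
                for x in range(left, min(left + patch, width)):
                    occluded[y][x] = fill
            drops[(top, left)] = base - score_fn(occluded)
    return drops


def toy_score(image):
    """Stand-in for a model score; here just the sum of all pixel values."""
    return sum(sum(row) for row in image)


# Toy image: all zeros except one "perturbed" pixel at row 2, column 3.
toy_image = [[0.0] * 4 for _ in range(4)]
toy_image[2][3] = 1.0
drops = occlusion_scores(toy_image, toy_score, patch=2)
most_sensitive = max(drops, key=drops.get)  # top-left corner of the patch
```

In a real pipeline the score would come from a classifier and the regions from a superpixel algorithm; the flagged regions would then be handed to an inpainting model.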
MeetSafe
Enhancing robustness against white-box adversarial examples
Convolutional neural networks (CNNs) are vulnerable to adversarial attacks in computer vision tasks. Current adversarial detection methods are ineffective against white-box attacks and inefficient when deep CNNs generate high-dimensional hidden features. This study proposes MeetSafe, an effective and scalable adversarial example (AE) detection method against white-box attacks. MeetSafe identifies AEs using critical hidden features rather than the entire feature space. We observe that the Z-scores of hidden features are distributed non-uniformly between clean samples and AEs, and we propose two utility functions to select the features most relevant to AEs. We process the critical hidden features using three feature engineering methods: local outlier factor (LOF), feature squeezing, and whitening, which respectively estimate a feature's density relative to its k nearest neighbors, reduce redundancy, and normalize features. To deal with the curse of dimensionality and smooth statistical fluctuations in high-dimensional features, we propose a local reachability density (LRD) procedure that iteratively selects a bag of engineered features with random cardinality and quantifies their average density with respect to their k nearest neighbors. Finally, MeetSafe fits a Gaussian Mixture Model (GMM) to the processed features and flags a sample as an AE if it is a local outlier, as indicated by a low density under the GMM. Experimental results show that MeetSafe achieves detection accuracy of 74%, 96%, and 79% against adaptive, classic, and white-box attacks, respectively, and is at least 2.3× faster than comparison methods.
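The bagged density idea described above can be sketched as follows. This is an illustration under assumptions, not the MeetSafe implementation: the per-subset density uses the standard LOF definition of local reachability density, the averaging over random feature subsets is one plausible reading of "a bag of engineered features with random cardinality", and the names `knn`, `local_reachability_density`, and `bagged_lrd` are hypothetical.

```python
import math
import random


def knn(points, i, k):
    """Return sorted (distance, index) pairs for the k nearest neighbours of points[i]."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return dists[:k]


def local_reachability_density(points, i, k):
    """Standard LOF-style LRD: inverse of the mean reachability distance to the k-NN."""
    reach = []
    for d, j in knn(points, i, k):
        k_dist_j = knn(points, j, k)[-1][0]  # k-distance of neighbour j
        reach.append(max(d, k_dist_j))       # reachability distance of i from j
    mean_reach = sum(reach) / len(reach)
    return 1.0 / mean_reach if mean_reach > 0 else float("inf")


def bagged_lrd(points, i, k=3, rounds=20, seed=0):
    """Average the LRD of points[i] over random feature subsets of random size."""
    rng = random.Random(seed)
    n_features = len(points[0])
    total = 0.0
    for _ in range(rounds):
        size = rng.randint(1, n_features)             # random cardinality
        subset = rng.sample(range(n_features), size)  # random bag of features
        projected = [[p[f] for f in subset] for p in points]
        total += local_reachability_density(projected, i, k)
    return total / rounds


# Toy data: seven clustered feature vectors plus one distant vector (index 7).
features = [[0.1 * i, 0.1 * ((3 * i) % 7), 0.1 * ((5 * i) % 7)] for i in range(7)]
features.append([5.0, 5.0, 5.0])
inlier_density = bagged_lrd(features, 0)
outlier_density = bagged_lrd(features, 7)  # noticeably lower than inlier_density
```

A low averaged density marks a sample as a candidate outlier. In the pipeline the abstract describes, such scores feed a GMM rather than being thresholded directly.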
Federated Learning (FL) is a decentralized learning approach that preserves the privacy of distributed agents' local datasets. However, the distributed nature of FL and untrustworthy data introduce a vulnerability to backdoor attacks. In this attack scenario, an adversary manipulates its local data with a specific trigger and trains a malicious local model to implant the backdoor. During inference, the global model outputs the attacker-chosen prediction for any input carrying the trigger. Most existing backdoor attacks against FL focus on bypassing defense mechanisms without considering the inspection of model parameters on the server. These attacks are susceptible to detection through dynamic clustering based on model parameter similarity. In addition, current methods offer limited trigger imperceptibility in the spatial domain. To address these limitations, we propose "Chironex", a stealthy backdoor attack against FL that uses an imperceptible trigger in frequency space to deliver attack effectiveness, stealthiness, and robustness against various FL countermeasures. We first design a frequency trigger function that generates an imperceptible frequency trigger to evade human inspection. We then fully exploit the attacker's advantage to enhance attack robustness by estimating benign updates and analyzing the impact of the backdoor on model parameters through a task-sensitive neuron searcher. The searcher disguises malicious updates as benign ones by reducing the impact of backdoor neurons, identified by activation value as those contributing most to the backdoor task, and encouraging them to update towards benign model parameters trained by the attacker.
We conduct extensive experiments on various image classifiers with real-world datasets to provide empirical evidence that Chironex can evade the most recent robust FL aggregation algorithms, and further achieve a distinctly higher attack success rate than existing attacks, without undermining the utility of the global model.
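To make the notion of a frequency-space trigger concrete, here is a minimal sketch that perturbs one coefficient of a 2D discrete cosine transform (DCT). The abstract does not say which transform or coefficients Chironex uses, so the choice of DCT, the coefficient position `(u, v)`, the strength `delta`, and the function names are all assumptions made for illustration.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
        )
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """2D DCT of a square block: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2D DCT: C^T * coeffs * C (valid because C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def add_frequency_trigger(block, u=3, v=4, delta=4.0):
    """Add `delta` to one mid-frequency DCT coefficient (hypothetical trigger)."""
    coeffs = dct2(block)
    coeffs[u][v] += delta
    return idct2(coeffs)


# Toy 8 x 8 grayscale block with pixel values between 100 and 116.
clean = [[float((3 * i + 5 * j) % 17 + 100) for j in range(8)] for i in range(8)]
poisoned = add_frequency_trigger(clean)
max_pixel_change = max(
    abs(p - c) for pr, cr in zip(poisoned, clean) for p, c in zip(pr, cr)
)
```

Because the perturbation is spread over the whole block by the inverse transform, each pixel changes by at most `delta * 2 / n` for `u, v > 0`, which is 1.0 intensity level here. This is why frequency-domain triggers can be hard to see. A practical version would also clip and quantize pixel values, which this sketch omits.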