Security of Visual Neural Networks: Backdoor Attacks and Adversarial Purification
Y. Qiao (TU Delft - Cyber Security)
Inald Lagendijk – Promotor (TU Delft - Cyber Security)
Kaitai Liang – Copromotor (TU Delft - Cyber Security)
Abstract
Deep learning, a prominent branch of machine learning, leverages artificial neural networks to extract complex patterns and hierarchical representations from large datasets. Notably, advanced architectures such as convolutional neural networks (CNNs) and vision transformers (ViTs) have achieved remarkable success in various computer vision applications, particularly in image classification tasks.
While deep learning offers substantial benefits, it faces security challenges stemming from potentially unreliable models and untrustworthy training data. Such vulnerabilities can compromise model functionality through maliciously perturbed inputs or through model Trojans, in which adversaries embed triggers in the input data to activate harmful behaviors.
Despite significant research on adversarial and backdoor attacks, along with their countermeasures in various deep learning systems, there remains a critical demand for innovative technical solutions to mitigate persistent vulnerabilities and bolster the security and robustness of these systems.
This thesis addresses three key security challenges: (1) the low robustness of backdoor triggers to common image transformations and the anomalous frequency perturbations they introduce under centralized learning; (2) the anomalous backdoor features and parameters introduced by current attack methods under decentralized learning; and (3) the significant drop in both clean and robust accuracy caused by global image restoration with diffusion models in adversarial purification.
In examining the vulnerabilities of centralized deep learning systems, Chapter 2 focuses on backdoor attacks against CNNs and vision transformers from the perspective of a malicious data provider. It leverages an evolutionary algorithm to optimize the frequency properties of the designed trigger, maximizing attack effectiveness, robustness to image transformation operations, and stealthiness in dual (spatial and frequency) space under a black-box setting.
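The flavor of this approach can be conveyed with a minimal sketch, under assumptions and not the thesis implementation: a trigger defined in the frequency domain is evolved by a simple (mu + lambda) strategy whose fitness rewards attack success and penalizes visible pixel-space change and frequency-domain energy. Here `attack_success_rate` is a hypothetical stand-in for black-box queries to the victim model, and the image size and additive spectral trigger are illustrative choices.

```python
# Minimal sketch (not the thesis method): evolving the frequency components of a
# backdoor trigger under a dual-space stealthiness penalty.
import numpy as np

RNG = np.random.default_rng(0)
IMG = 32  # assumed image side length (e.g., CIFAR-10 scale)

def apply_trigger(images, freq_trigger):
    """Embed the trigger additively in the frequency domain and return poisoned images."""
    spectra = np.fft.fft2(images, axes=(-2, -1))
    spectra += freq_trigger  # perturb the spectrum of every image
    poisoned = np.real(np.fft.ifft2(spectra, axes=(-2, -1)))
    return np.clip(poisoned, 0.0, 1.0)

def attack_success_rate(poisoned_images):
    """Hypothetical placeholder for querying the black-box victim model."""
    return RNG.random()  # replace with the fraction of poisoned inputs hitting the target label

def fitness(freq_trigger, images):
    poisoned = apply_trigger(images, freq_trigger)
    effectiveness = attack_success_rate(poisoned)
    # Stealthiness terms: visible pixel-space change and frequency-domain energy.
    pixel_dist = np.mean(np.abs(poisoned - images))
    freq_energy = np.mean(np.abs(freq_trigger))
    return effectiveness - 10.0 * pixel_dist - 1.0 * freq_energy

def evolve(images, pop_size=20, generations=50, sigma=0.05):
    """Simple (mu + lambda) evolution of the spectral trigger."""
    population = [RNG.normal(0, sigma, (IMG, IMG)) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda t: fitness(t, images), reverse=True)
        parents = ranked[: pop_size // 4]
        children = [p + RNG.normal(0, sigma, p.shape) for p in parents for _ in range(3)]
        population = parents + children
    return max(population, key=lambda t: fitness(t, images))
```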
In investigating security issues in decentralized scenarios, Chapters 3 and 4 focus on backdoor attacks against federated learning from the perspective of a malicious client. In Chapter 3, we propose a backdoor attack that disguises the adversary's malicious updates as benign at the parameter level through a backdoor-neuron constraint and model camouflage. In Chapter 4, we utilize generative adversarial networks to produce stealthy and flexible triggers that minimize the representation distance between poisoned and benign samples.
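To give a rough sense of the parameter-level disguise, the following sketch (assumptions only, not Chapter 3's exact method) shows a malicious client that optimizes both the clean task and the backdoor objective while adding a drift penalty that keeps its parameters close to the received global model, so the submitted update looks benign; `malicious_local_update` and its arguments are hypothetical names.

```python
# Minimal sketch of a camouflaged federated-learning backdoor update (assumed setup).
import torch
import torch.nn.functional as F

def malicious_local_update(model, global_state, clean_loader, poisoned_loader,
                           target_label, epochs=1, lr=0.01, lam=0.5):
    model.load_state_dict(global_state)
    global_params = [p.detach().clone() for p in model.parameters()]
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for (x_clean, y_clean), (x_poison, _) in zip(clean_loader, poisoned_loader):
            opt.zero_grad()
            clean_loss = F.cross_entropy(model(x_clean), y_clean)
            y_target = torch.full((x_poison.size(0),), target_label, dtype=torch.long)
            backdoor_loss = F.cross_entropy(model(x_poison), y_target)
            # Camouflage term: keep local parameters near the global model so the
            # resulting update has a small, benign-looking norm.
            drift = sum(((p - g) ** 2).sum()
                        for p, g in zip(model.parameters(), global_params))
            loss = clean_loss + backdoor_loss + lam * drift
            loss.backward()
            opt.step()
    # Parameter delta that would be sent back to the server.
    return [p.detach() - g for p, g in zip(model.parameters(), global_params)]
```

The weight `lam` trades off how strongly the backdoor is implanted against how closely the update mimics a benign client's update norm.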
To enhance the security of deep learning from the data perspective, the thesis then turns to adversarial purification to improve model robustness against adversarial attacks. In Chapter 5, we identify perturbed image regions through multi-scale superpixel segmentation and occlusion analysis, and subsequently use diffusion models to inpaint those regions while maintaining visual consistency.
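A minimal sketch of this region-level purification idea follows, under assumptions rather than Chapter 5's exact pipeline: superpixels are scored by how much occluding them changes the classifier's confidence, the most influential ones are merged into a mask, and only that mask is regenerated. `diffusion_inpaint` is a hypothetical placeholder for a diffusion-based inpainting call, and a single superpixel scale is used for brevity.

```python
# Minimal sketch of occlusion-based region localization followed by masked inpainting.
import numpy as np
from skimage.segmentation import slic

def suspicious_mask(image, classify, n_segments=100, top_k=10):
    """image: HxWx3 float array in [0, 1]; classify: image -> class-probability vector."""
    segments = slic(image, n_segments=n_segments, start_label=0)
    base_probs = classify(image)
    base_label = int(np.argmax(base_probs))
    scores = []
    for seg_id in np.unique(segments):
        occluded = image.copy()
        occluded[segments == seg_id] = image.mean(axis=(0, 1))  # mean-color occlusion
        # A large confidence drop suggests this region drives the (possibly adversarial) prediction.
        drop = base_probs[base_label] - classify(occluded)[base_label]
        scores.append((drop, seg_id))
    scores.sort(reverse=True)
    mask = np.zeros(segments.shape, dtype=bool)
    for _, seg_id in scores[:top_k]:
        mask |= segments == seg_id
    return mask

def purify(image, classify, diffusion_inpaint):
    mask = suspicious_mask(image, classify)
    # Only the masked regions are regenerated; the rest of the image is kept, which
    # helps preserve clean accuracy compared with restoring the whole image.
    return diffusion_inpaint(image, mask)
```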