S. Koffas
Please Note
20 records found
1
Let's focus
Focused backdoor attack against federated transfer learning
Federated Transfer Learning (FTL) is the most general form of Federated Learning (FL). In FTL, one party, usually the server, pre-trains a feature extractor on public data. Then, clients collaboratively train a classifier by updating only the classification layers on their private data. This raises doubts about whether local poisoning attacks can effectively backdoor the full model. Unlike in FL, where attackers can shift model attention via poisoned inputs, FTL's fixed feature extractor, set during server pre-training, limits this possibility. In this paper, we investigate this scenario to identify and exploit a vulnerability obtained by combining eXplainable AI (XAI) and dataset distillation. Our proposed attack can be carried out by one of the clients during the FL phase of FTL by identifying the optimal position for the trigger through XAI and encapsulating compressed information of the backdoor class. Due to its behavior, we refer to our approach as a focused backdoor approach (FB-FTL for short) and test its performance by referencing image and text classification scenarios. Our attack is effective against existing defenses for FL, as it achieves an average of 80% attack success rate.
We evaluate our method on five datasets and four popular models. Our results show up to a 100% attack success rate in both white-box and black-box settings (including real-world applications like Vertex AI), revealing a severe vulnerability for tabular data. Our method is shown to surpass previous work like Tabdoor in terms of performance, while remaining stealthy against state-of-the-art defense mechanisms. We evaluate our attack against Spectral Signatures, Neural Cleanse, Beatrix, and Fine-Pruning, all of which fail to defend successfully against it. We also verify that our attack successfully bypasses popular outlier detection mechanisms. ...
We evaluate our method on five datasets and four popular models. Our results show up to a 100% attack success rate in both white-box and black-box settings (including real-world applications like Vertex AI), revealing a severe vulnerability for tabular data. Our method is shown to surpass previous work like Tabdoor in terms of performance, while remaining stealthy against state-of-the-art defense mechanisms. We evaluate our attack against Spectral Signatures, Neural Cleanse, Beatrix, and Fine-Pruning, all of which fail to defend successfully against it. We also verify that our attack successfully bypasses popular outlier detection mechanisms.
We show that poisoning defenses are ineffective if not adjusted specifically for defense against SkipSponge (i.e., they decrease target layer bias values) and that SkipSponge is more effective on GANs and autoencoders than Sponge Poisoning. Additionally, SkipSponge is stealthy, as it does not require significant changes to the victim model’s parameters. Our experiments indicate that SkipSponge can be performed even when an attacker has access to less than 1% of the entire training dataset and reaches up to a 13% energy increase. ...
We show that poisoning defenses are ineffective if not adjusted specifically for defense against SkipSponge (i.e., they decrease target layer bias values) and that SkipSponge is more effective on GANs and autoencoders than Sponge Poisoning. Additionally, SkipSponge is stealthy, as it does not require significant changes to the victim model’s parameters. Our experiments indicate that SkipSponge can be performed even when an attacker has access to less than 1% of the entire training dataset and reaches up to a 13% energy increase.
Large Language Models (LLMs) are susceptible to various attacks but can also improve the security of diverse systems. However, how well do open source LLMs behave as covertext distributions to, e.g., facilitate censorship-resistant communication? In this paper, we explore open-source LLM-based covert channels. We empirically measure the security vs. capacity of two open-source LLM models (Llama-7B and GPT-2) to assess their performance as covert channels. Although our results indicate that such channels are not likely to achieve high practical bitrates, we also show that the chance for an adversary to detect covert communication is low. To ensure our results can be used with the least effort as a general reference, we employ a conceptually simple and concise scheme and only assume public models.
Recently, attackers have targeted machine learning systems, introducing various attacks. The backdoor attack is popular in this field and is usually realized through data poisoning. To the best of our knowledge, we are the first to investigate whether the backdoor attacks remain effective when manifold learning algorithms are applied to the poisoned dataset. We conducted our experiments using two manifold learning techniques (Autoencoder and UMAP) on two benchmark datasets (MNIST and CIFAR10) and two backdoor strategies (clean and dirty label). We performed an array of experiments using different parameters, finding that we could reach an attack success rate of 95% and 75% even after reducing our data to two dimensions using Autoencoders and UMAP, respectively.
ELMs Under Siege
A Study on Backdoor Attacks on Extreme Learning Machines
This paper investigates the effects of backdoor attacks on extreme learning machines. First, we inject the backdoor into the model through data and model poisoning and then examine the pruning technique as a defense to defend against the attack. The core characteristic of extreme learning machines, which makes them interesting for study, is their different structure and learning procedure compared to deep neural networks. These features raise the question of whether they are as vulnerable to backdoor attacks as deep neural networks. Our experiments confirm this assumption and indicate that extreme learning machines can be backdoored with 100% attack success rate. Thus, we believe further study is needed to develop a robust defense technique as a solution to make them less vulnerable. ...
This paper investigates the effects of backdoor attacks on extreme learning machines. First, we inject the backdoor into the model through data and model poisoning and then examine the pruning technique as a defense to defend against the attack. The core characteristic of extreme learning machines, which makes them interesting for study, is their different structure and learning procedure compared to deep neural networks. These features raise the question of whether they are as vulnerable to backdoor attacks as deep neural networks. Our experiments confirm this assumption and indicate that extreme learning machines can be backdoored with 100% attack success rate. Thus, we believe further study is needed to develop a robust defense technique as a solution to make them less vulnerable.
EmoBack
Backdoor Attacks Against Speaker Identification Using Emotional Prosody
Beyond PhantomSponges
Enhancing Sponge Attack on Object Detection Models
Given today's ongoing deployment of deep learning models, ensuring their security against adversarial attacks has become paramount. This paper introduces an enhanced version of the PhantomSponges attack by Shapira et al. The attack exploits the non-maximum suppression (NMS) algorithm in YOLO object detection (OD) models without compromising OD, substantially increasing inference time. Our enhancement focuses on improving the attack's impact on YOLOv5 models by modifying its bounding box area loss term, aiming to directly decrease the intersection over union and, thus, exacerbate the computational load on NMS. Through a parameter study using the Berkeley Deep Drive dataset, we evaluate the enhanced attack's efficacy against various sizes of YOLOv5, demonstrating, under certain circumstances, an improved capability to increase NMS time with a minimal loss in OD accuracy. Furthermore, we propose a novel defense that dynamically resizes input images to mitigate the attack's effectiveness, showcasing a substantial restoration in inference speed and OD accuracy. Our findings show that the enhanced attack could result in a 550% increase in NMS time on the YOLOv5 small configuration. Moreover, our defense's results show a substantial decrease of 90.18% in NMS execution time when applied to an attacked YOLOv5 large model.
BAN
Detecting Backdoors Activated by Adversarial Neuron Noise
Backdoor attacks on deep learning represent a recent threat that has gained significant attention in the research community. Backdoor defenses are mainly based on backdoor inversion, which has been shown to be generic, model-agnostic, and applicable to practical threat scenarios. State-of-the-art backdoor inversion recovers a mask in the feature space to locate prominent backdoor features, where benign and backdoor features can be disentangled. However, it suffers from high computational overhead, and we also find that it overly relies on prominent backdoor features that are highly distinguishable from benign features. To tackle these shortcomings, this paper improves backdoor feature inversion for backdoor detection by incorporating extra neuron activation information. In particular, we adversarially increase the loss of backdoored models with respect to weights to activate the backdoor effect, based on which we can easily differentiate backdoored and clean models. Experimental results demonstrate our defense, BAN, is 1.37× (on CIFAR-10) and 5.11× (on ImageNet200) more efficient with an average 9.99% higher detect success rate than the state-of-the-art defense BTI-DBF. Our code and trained models are publicly available at https://github.com/xiaoyunxxy/ban.
Unveiling the Threat
Investigating Distributed and Centralized Backdoor Attacks in Federated Graph Neural Networks
This article bridges this research gap by investigating two types of backdoor attacks in Federated GNNs: centralized backdoor attack (CBA) and distributed backdoor attack (DBA). Through extensive experiments, we demonstrate that DBA exhibits a higher success rate than CBA across various scenarios. To further explore the characteristics of these backdoor attacks in Federated GNNs, we evaluate their performance under different scenarios, including varying numbers of clients, trigger sizes, poisoning intensities, and trigger densities. Additionally, we explore the resilience of DBA and CBA against two defense mechanisms. Our findings reveal that both defenses cannot eliminate DBA and CBA without affecting the original task. This highlights the necessity of developing tailored defenses to mitigate the novel threat of backdoor attacks in Federated GNNs. ...
This article bridges this research gap by investigating two types of backdoor attacks in Federated GNNs: centralized backdoor attack (CBA) and distributed backdoor attack (DBA). Through extensive experiments, we demonstrate that DBA exhibits a higher success rate than CBA across various scenarios. To further explore the characteristics of these backdoor attacks in Federated GNNs, we evaluate their performance under different scenarios, including varying numbers of clients, trigger sizes, poisoning intensities, and trigger densities. Additionally, we explore the resilience of DBA and CBA against two defense mechanisms. Our findings reveal that both defenses cannot eliminate DBA and CBA without affecting the original task. This highlights the necessity of developing tailored defenses to mitigate the novel threat of backdoor attacks in Federated GNNs.
Backdoor Pony
Evaluating backdoor attacks and defenses in different domains
Outsourced training and crowdsourced datasets lead to a new threat for deep learning models: the backdoor attack. In this attack, the adversary inserts a secret functionality in a model, activated through malicious inputs. Backdoor attacks represent an active research area due to diverse settings where they represent a real threat. Still, there is no framework to evaluate existing attacks and defenses in different domains. Only a few toolboxes have been implemented, but most of them focus on computer vision and are difficult to use. To bridge this gap, we implement Backdoor Pony, a framework for evaluating attacks and defenses in different domains through a user-friendly GUI.
Deep learning found its place in various real-world applications, where many also have security requirements. Unfortunately, as these systems become more pervasive, understanding how they fail becomes more challenging. While there are multiple failure modes in machine learning, one category received significant attention in the last few years-backdoor attacks. Backdoor attacks aim to make a model misclassify some of its inputs to a preset-specific label while other classification results would behave normally. While many works investigate various backdoor attacks and defenses for different domains, no works aim to provide a systematic comparison of backdoor attacks for different scenarios. This work considers backdoor attacks in image, sound, text, and graph domains and provides a comparative analysis of their respective strengths.
Going in Style
Audio Backdoors Through Stylistic Transformations
Graph Neural Networks (GNNs) have achieved promising performance in various real-world applications. Building a powerful GNN model is not a trivial task, as it requires a large amount of training data, powerful computing resources, and human expertise. Moreover, with the development of adversarial attacks, e.g., model stealing attacks, GNNs raise challenges to model authentication. To avoid copyright infringement on GNNs, verifying the ownership of the GNN models is necessary.This paper presents a watermarking framework for GNNs for both graph and node classification tasks. We 1) design two strategies to generate watermarked data for the graph classification task and one for the node classification task, 2) embed the watermark into the host model through training to obtain the watermarked GNN model, and 3) verify the ownership of the suspicious model in a black-box setting. The experiments show that our framework can verify the ownership of GNN models with a very high probability (up to 99%) for both tasks. We also explore our watermarking mechanism against an adaptive attacker with access to partial knowledge of the watermarked data. Finally, we experimentally show that our watermarking approach is robust against a state-of-the-art model extraction technique and four state-of-the-art defenses against backdoor attacks.
We investigate the influence of clock frequency on the success rate of a fault injection attack. In particular, we examine the success rate of voltage and electromagnetic fault attacks for varying clock frequencies. Using three different tests that cover different components of a System-on-Chip, we perform fault injection while its CPU operates at different clock frequencies. Our results show that the attack’s success rate increases with an increase in clock frequency for both voltage and EM fault injection attacks. As the technology advances push the clock frequency further, these results can help assess the impact of fault injection attacks more accurately and develop appropriate countermeasures to address them.
This work explores backdoor attacks for automatic speech recognition systems where we inject inaudible triggers. By doing so, we make the backdoor attack challenging to detect for legitimate users and, consequently, potentially more dangerous. We conduct experiments on two versions of a speech dataset and three neural networks and explore the performance of our attack concerning the duration, position, and type of the trigger. Our results indicate that less than 1% of poisoned data is sufficient to deploy a backdoor attack and reach a 100% attack success rate. We observed that short, non-continuous triggers result in highly successful attacks. Still, since our trigger is inaudible, it can be as long as possible without raising any suspicions making the attack more effective. Finally, we conduct our attack on actual hardware and saw that an adversary could manipulate inference in an Android application by playing the inaudible trigger over the air.
This paper bridges this gap by conducting two types of backdoor attacks in Federated GNNs: centralized backdoor attacks (CBA) and distributed backdoor attacks (DBA). Our experiments show that the DBA attack success rate is higher than CBA in almost all cases. For CBA, the attack success rate of all local triggers is similar to the global trigger, even if the training set of the adversarial party is embedded with the global trigger. To explore the properties of two backdoor attacks in Federated GNNs, we evaluate the attack performance for a different number of clients, trigger sizes, poisoning intensities, and trigger densities. Finally, we explore the robustness of DBA and CBA against two state-of-the-art defenses. We find that both attacks are robust against the investigated defenses, necessitating the need to consider backdoor attacks in Federated GNNs as a novel threat that requires custom defenses. ...
This paper bridges this gap by conducting two types of backdoor attacks in Federated GNNs: centralized backdoor attacks (CBA) and distributed backdoor attacks (DBA). Our experiments show that the DBA attack success rate is higher than CBA in almost all cases. For CBA, the attack success rate of all local triggers is similar to the global trigger, even if the training set of the adversarial party is embedded with the global trigger. To explore the properties of two backdoor attacks in Federated GNNs, we evaluate the attack performance for a different number of clients, trigger sizes, poisoning intensities, and trigger densities. Finally, we explore the robustness of DBA and CBA against two state-of-the-art defenses. We find that both attacks are robust against the investigated defenses, necessitating the need to consider backdoor attacks in Federated GNNs as a novel threat that requires custom defenses.