Deep learning has achieved tremendous success in the past decade. As a result, it is being widely deployed in safety- and security-critical applications such as autonomous driving, malware detection, fingerprint identification, and financial fraud detection. It was recently shown, however, that deep neural networks are susceptible to multiple attacks, which poses a serious security threat to such systems. One such attack is the backdoor attack, in which the model's training is partially or fully outsourced to a malicious party. The adversary designs a secret property, the trigger, and embeds it into a small fraction of the training data; the trained model then associates this property with a target label chosen by the adversary. The backdoored model behaves well on regular inputs, but misclassifies with very high probability whenever the trigger is present. The advent of this attack sparked an arms race in the deep learning community, resulting in numerous backdoor attacks and defenses in recent years. In this thesis, we conduct a systematic evaluation that pushes our knowledge further, aiming at more effective defenses for applications such as image classification, natural language processing, and sound recognition. We show that the trigger's size is positively correlated with the attack success rate in almost all cases. In contrast, the trigger's position is not always connected to the attack success rate and depends on the neural network used. Furthermore, we are the first to experiment with inaudible triggers in sound recognition and show that they can be used for stealthy real-world attacks. Moreover, we show that backdoor attacks can serve as a framework for further understanding how deep neural networks learn.
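To make the poisoning mechanism described above concrete, the following is a minimal sketch of trigger-based data poisoning. All names and parameters here (`poison_dataset`, the patch size, the 5% poisoning rate) are illustrative assumptions, not the specific setup evaluated in this thesis; it simply stamps a small square trigger onto a fraction of the training images and flips their labels to the adversary's target class.

```python
import numpy as np

def poison_dataset(images, labels, target_label,
                   poison_rate=0.05, trigger_value=1.0, trigger_size=3):
    """Stamp a square trigger onto a random fraction of images and relabel them.

    This is an illustrative sketch of classic backdoor poisoning, not the
    exact procedure used in the thesis experiments.
    """
    images = images.copy()
    labels = labels.copy()
    n = len(images)
    # Pick a small random subset of the training set to poison.
    idx = np.random.choice(n, size=int(poison_rate * n), replace=False)
    # Stamp the trigger patch in the bottom-right corner of each chosen image.
    images[idx, -trigger_size:, -trigger_size:] = trigger_value
    # Flip the labels of the poisoned samples to the adversary's target class.
    labels[idx] = target_label
    return images, labels

# Toy usage: 100 grayscale 28x28 images, 10 classes, target class 7.
X = np.zeros((100, 28, 28))
y = np.random.randint(0, 10, size=100)
Xp, yp = poison_dataset(X, y, target_label=7)
```

A model trained on `(Xp, yp)` would learn to associate the corner patch with class 7 while behaving normally on clean inputs, which is exactly the dual behavior that makes backdoors hard to detect.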
%Most of these attacks focus on computer vision, and only a few experiment with natural language processing and sound recognition. Additionally, most of the defenses are empirical and can be bypassed by slightly altering the attack scenario.
Finally, we exploited the properties of global average pooling to build a more effective attack. In particular, we created dynamic backdoors that remain effective even when different trigger positions are used during training and inference, without poisoning more data or altering the poisoning procedure. Our exploration of this layer also showed that a model that generalizes well on new inputs is less likely to be susceptible to backdoor attacks.