Deep learning has achieved great success in several domains, particularly in computer vision, where it enables tasks such as image classification and object recognition. However, one significant challenge in deep learning is data labeling, owing to the high cost and effort required for human annotators to carry out this process manually. Active learning addresses this problem by selecting a smaller subset of the most informative data for labeling, maximizing annotation efficiency. Despite its advantages, active learning introduces new security threats, in particular backdoor attacks, in which adversaries poison part of the training data to alter the model's behavior in the presence of a hidden trigger. Although backdoor attacks have been extensively studied in traditional deep learning settings, their impact on active learning remains largely unexplored.
Here, the vulnerabilities of active learning to backdoor attacks in computer vision models were analyzed, using various configurations, datasets, and deep learning models to evaluate their effectiveness. Backdoor attacks reached attack success rates (ASR) above 95% with just 1% of the data poisoned on simple datasets such as MNIST, particularly when using certainty-based sampling and CNNs, whereas more complex datasets such as CIFAR-10 and models such as ResNet proved more resilient. Furthermore, different attack techniques were explored, including progressive parameter adjustment, sub-trigger division, and clean-label attacks on advanced backdoor triggers such as LIRA and WaNet. The analysis revealed that although global LIRA triggers were the most effective, sub-trigger and progressive poisoning methods offered promising alternatives, especially because they allow poisoning smaller parts of images across training cycles. It also showed that attack success in clean-label scenarios was highly dependent on the number of poisoned samples per cycle, due to the post-query constraint that only already-selected samples can be poisoned, which often limits the impact when the target label appears infrequently. Finally, different poisoning timings were compared: post-query poisoning consistently outperformed pre-query methods in terms of ASR, even at low poison rates, although this approach has limitations in real-world scenarios, where attackers usually do not control which samples are queried. Clean accuracy remained largely unaffected, demonstrating the stealth and hidden threat of backdoor attacks in active learning settings. A minimal sketch of the post-query attack setting is given below.
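To make the post-query attack setting concrete, the following is a minimal, illustrative sketch of one active learning cycle in which an attacker poisons a fraction of the samples the model has just queried. The patch trigger, the least-confidence query criterion, and all function names and parameters are illustrative assumptions for exposition only; they are not the exact triggers, models, or configurations evaluated in this work.

import numpy as np

rng = np.random.default_rng(0)

def apply_patch_trigger(image, patch_value=1.0, patch_size=3):
    """Stamp a small square trigger in the bottom-right corner (a simple patch trigger, for illustration)."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = patch_value
    return poisoned

def least_confident_query(probs, budget):
    """Certainty-based sampling: query the samples the model is least confident about."""
    confidence = probs.max(axis=1)
    return np.argsort(confidence)[:budget]

def post_query_poison(images, labels, queried_idx, poison_rate, target_label):
    """Post-query poisoning: stamp the trigger on a fraction of the just-queried samples
    and flip their labels to the attacker's target class (dirty-label variant)."""
    n_poison = max(1, int(poison_rate * len(queried_idx)))
    chosen = rng.choice(queried_idx, size=n_poison, replace=False)
    for i in chosen:
        images[i] = apply_patch_trigger(images[i])
        labels[i] = target_label
    return chosen

# Toy stand-in for an unlabeled pool of MNIST-sized grayscale images and model predictions.
pool_images = rng.random((1000, 28, 28))
pool_labels = rng.integers(0, 10, size=1000)
model_probs = rng.dirichlet(np.ones(10), size=1000)  # placeholder for the model's class probabilities

queried = least_confident_query(model_probs, budget=100)
poisoned = post_query_poison(pool_images, pool_labels, queried, poison_rate=0.01, target_label=7)
print(f"Queried {len(queried)} samples, poisoned {len(poisoned)} of them.")

In this sketch the attacker only touches samples after they have been selected for labeling, which mirrors the post-query constraint discussed above; a pre-query attacker would instead poison the unlabeled pool beforehand and hope the query strategy selects the poisoned samples.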