Federated Learning Under Attack
Exposing Vulnerabilities Through Data Poisoning Attacks in Computer Networks
Ehsan Nowroozi (University of Greenwich, Bahçeşehir Üniversitesi)
Imran Haider (Bahçeşehir Üniversitesi)
Rahim Taheri (University of Portsmouth)
Mauro Conti (Università degli Studi di Padova, TU Delft - Cyber Security)
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Federated Learning is an approach that enables multiple devices to collectively train a shared model without sharing raw data, thereby preserving data privacy. However, federated learning systems are vulnerable to data-poisoning attacks during the training and updating stages. Three data-poisoning attacks-label flipping, feature poisoning, and VagueGAN-are tested on FL models across one out of ten clients using the CIC and UNSW datasets. For label flipping, we randomly modify labels of benign data; for feature poisoning, we alter highly influential features identified by the Random Forest technique; and for VagueGAN, we generate adversarial examples using Generative Adversarial Networks. Adversarial samples constitute a small portion of each dataset. In this study, we vary the percentages by which adversaries can modify datasets to observe their impact on the Client and Server sides. Experimental findings indicate that label flipping and VagueGAN attacks do not significantly affect server accuracy, as they are easily detectable by the Server. In contrast, feature poisoning attacks subtly undermine model performance while maintaining high accuracy and attack success rates, highlighting their subtlety and effectiveness. Therefore, feature poisoning attacks manipulate the server without causing a significant decrease in model accuracy, underscoring the vulnerability of federated learning systems to such sophisticated attacks. To mitigate these vulnerabilities, we explore a recent defensive approach known as Random Deep Feature Selection, which randomizes server features with varying sizes (e.g., 50 and 400) during training. This strategy has proven highly effective in minimizing the impact of such attacks, particularly on feature poisoning.