Detecting Vulnerabilities of Heterogeneous Federated Learning Systems
J. Huang (TU Delft - Data-Intensive Systems)
D.H.J. Epema – Promotor (TU Delft - Data-Intensive Systems)
Y. Chen – Copromotor (TU Delft - Data-Intensive Systems)
S. Roos – Copromotor (TU Delft - Data-Intensive Systems)
Abstract
Federated learning (FL) has emerged as an important paradigm in distributed machine learning, enabling collaborative model training across decentralized devices while preserving data privacy. FL’s privacy-preserving design – raw data remains on local devices and only model updates are shared – has made it suitable for sensitive domains such as healthcare and finance. However, this decentralized framework introduces fundamental challenges that threaten its reliability and adoption: data heterogeneity, security threats, and privacy leakage create critical vulnerabilities that demand robust solutions.
To study such vulnerabilities, this thesis considers two kinds of parties: clients and servers. Clients act as data owners that perform local computations and share only model parameters, thereby keeping raw data private; yet they introduce vulnerabilities through potentially malicious behavior (e.g., data or model poisoning attacks) or unreliable contributions stemming from poor data quality. The server, in contrast, facilitates model convergence through aggregation but poses inherent privacy risks: it can potentially infer sensitive client information from shared gradients, even without direct data access. Together, these parties create a dual-threat landscape: clients may compromise model performance through adversarial manipulations, while servers may break confidentiality via reconstruction methods....
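The aggregation step described above, and why a single malicious client can matter, can be sketched in a few lines. This is a minimal illustration of FedAvg-style weighted averaging, not the specific scheme studied in the thesis; the function name `fedavg` and the toy parameter vectors are assumptions chosen for the example.

```python
import numpy as np

def fedavg(client_updates, client_sizes):
    """Weighted average of client model parameters (FedAvg-style sketch).

    client_updates: list of flat parameter vectors, one per client.
    client_sizes: local training-set sizes, used as aggregation weights.
    """
    weights = np.array(client_sizes, dtype=float)
    weights /= weights.sum()
    stacked = np.stack(client_updates)            # (n_clients, n_params)
    return np.tensordot(weights, stacked, axes=1)  # (n_params,)

# Hypothetical toy updates: two honest clients, then one poisoned one.
honest = [np.array([1.0, 1.0]), np.array([1.2, 0.8])]
poisoned = honest + [np.array([100.0, -100.0])]

print(fedavg(honest, [10, 10]))        # -> [1.1  0.9], the honest mean
print(fedavg(poisoned, [10, 10, 10]))  # dragged far toward the attacker
```

Because plain averaging weights every update, one outlier shifts the global model arbitrarily far; this is the client-side vulnerability that robust aggregation rules aim to contain.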