BRISTLE: Decentralized Federated Learning in Byzantine, Non-i.i.d. Environments

Master Thesis (2021)
Author(s)

J. Verbraeken (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.A. Pouwelse – Mentor (TU Delft - Data-Intensive Systems)

M.A. Larson – Mentor (TU Delft - Multimedia Computing)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Joost Verbraeken
Publication Year
2021
Language
English
Graduation Date
01-07-2021
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Federated learning (FL) is a type of machine learning where devices locally train a model on their private data.
The devices iteratively send this model to a central server, which combines the models and sends the updated model back to all devices.
Because the data stays on the devices and only the model is transmitted, federated learning is considered a privacy-friendly alternative to regular machine learning, where all data is transmitted over the internet.
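
To make this setting concrete, below is a minimal sketch of one such round, assuming a simple FedAvg-style unweighted average; the Device object and its local_train method are illustrative placeholders, not an API from the thesis.

# Minimal sketch of one centralized FL round with FedAvg-style unweighted
# averaging. Device.local_train is an illustrative placeholder.
import numpy as np

def federated_round(global_weights, devices):
    local_models = []
    for device in devices:
        # Training happens on-device; only the updated weights leave the device.
        local_models.append(device.local_train(np.copy(global_weights)))
    # The central server combines the local models into a new global model.
    return np.mean(local_models, axis=0)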

However, the central server used in typical FL systems not only poses a single point of failure susceptible to crashes or hacks, but may also become a performance bottleneck. These issues are alleviated by decentralized FL (DFL), where the peers communicate model updates with each other instead of with a single server.
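
A minimal sketch of the decentralized variant follows, assuming plain gossip-style averaging of whatever updates a peer has received; the Peer object and received_weights are illustrative names only.

# Minimal sketch of a decentralized round, assuming plain gossip averaging:
# each peer mixes its own weights with the updates received from other peers,
# so no central server is involved. The Peer object is an illustrative name.
import numpy as np

def decentralized_round(peer, received_weights):
    models = [peer.weights] + list(received_weights)
    peer.weights = np.mean(models, axis=0)
    return peer.weights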

Unfortunately, DFL is challenging since (1) the training data possessed by different peers is often non-i.i.d. (i.e., distributed differently between the peers) and (2) malicious, or Byzantine, attackers can share arbitrary model updates with other peers to subvert the training process.

We address these two challenges and present Bristle, middleware between the learning application and the decentralized network layer.
Bristle leverages transfer learning to predetermine and freeze the non-output layers of a neural network, significantly speeding up model training and lowering communication costs.
To securely update the output layer with model updates from other peers, we design a fast distance-based prioritizer and a novel performance-based integrator.
The prioritizer ranks incoming model updates by their distance to the peer's own model and an explore-exploit trade-off, and the integrator then integrates each class of each model update separately, based on its performance on a small set of i.i.d. test samples.
Their combined effect results in high resilience to Byzantine attackers and the ability to handle non-i.i.d. classes.
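
As a rough, non-authoritative approximation of this prioritize-then-integrate pipeline (not the algorithm as specified in the thesis), the sketch below ranks received output layers by distance to the peer's own output layer with a small explore-exploit split, then merges each class separately only when it does not hurt accuracy on the i.i.d. test set; the explore/exploit counts, the Euclidean distance, the averaging rule, and evaluate_class_accuracy are all assumptions.

# Rough approximation of the prioritizer and integrator described above;
# this is NOT Bristle's exact algorithm. All thresholds, distances, and the
# evaluate_class_accuracy callback are illustrative assumptions.
import random
import numpy as np

def prioritize(own_output, received_outputs, n_exploit=8, n_explore=2):
    # Keep the updates closest to the peer's own output layer (exploit)
    # plus a few randomly chosen more distant ones (explore).
    ranked = sorted(received_outputs,
                    key=lambda w: np.linalg.norm(w - own_output))
    exploit, rest = ranked[:n_exploit], ranked[n_exploit:]
    explore = random.sample(rest, min(n_explore, len(rest)))
    return exploit + explore

def integrate(own_output, candidates, evaluate_class_accuracy):
    # Merge each class row of each candidate separately, keeping the merge
    # only if it does not reduce accuracy on the small i.i.d. test set.
    merged = np.copy(own_output)
    for candidate in candidates:
        for cls in range(merged.shape[0]):  # one row of weights per output class
            trial = np.copy(merged)
            trial[cls] = (merged[cls] + candidate[cls]) / 2
            if evaluate_class_accuracy(trial, cls) >= evaluate_class_accuracy(merged, cls):
                merged = trial
    return merged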

We empirically show that Bristle converges to a consistent 95% accuracy in Byzantine environments, outperforming all evaluated baselines. In non-Byzantine environments, Bristle requires 83% fewer iterations to reach 90% accuracy than state-of-the-art methods. When the training classes are non-i.i.d., Bristle achieves 2.3x higher accuracy than the most Byzantine-resilient baselines while reducing communication costs by 90%.

Files

Thesis_Joost_Verbraeken_final_... (pdf)
(pdf | 1.85 MB)
- Embargo expired on 01-09-2021
License info not available