Generative Federated Learning Approaches for Non-IID Data

Enhancing Federated Models with Synthetic Data

Bachelor Thesis (2024)
Author(s)

P.K. Cho (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Swier Garst – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

David M. J. Tax – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

A. Voulimeneas – Graduation committee member (TU Delft - Cyber Security)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
26-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Federated Learning (FL) is a machine learning approach that has gained considerable interest in recent years. FL trains a global model without compromising the privacy of the clients' training datasets: the global model is sent to each client, which learns the weights on its local data, and only the learned weights are propagated back to a central location. However, several challenges still limit the model's performance. One of them is training data that is non-IID, that is, not independent and identically distributed across clients. Most real-world data is non-IID, and this imbalance in data distribution has been shown to significantly degrade the model's performance. To address this issue, we propose a generative federated learning approach that pre-trains the global model on synthetic data created by a generative model that follows the collective distribution of all clients' training datasets. Our research shows that this approach bridges the performance gap between IID and non-IID settings in FL, except in certain extreme non-IID cases.
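
The training loop described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes a toy one-dimensional linear model, size-weighted averaging of client weights (the FedAvg scheme, which the abstract does not name), and a hand-written list standing in for the output of a generative model. All function names and data below are hypothetical.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Train a copy of the model y = w*x + b on one client's local data (plain SGD)."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def fed_avg(client_weights, client_sizes):
    """Average the clients' weights, weighted by local dataset size (assumed FedAvg)."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return tuple(
        sum(cw[i] * n for cw, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    )


def train_federated(clients, init_weights, rounds=200):
    """Each round: send weights to the clients, train locally, average the results."""
    weights = init_weights
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in clients]
        weights = fed_avg(updates, [len(data) for data in clients])
    return weights


# Hypothetical non-IID split: each client only sees part of the input range.
# The underlying relation is y = 2x + 1 for both clients.
client_a = [(x / 4, 2 * (x / 4) + 1) for x in range(0, 5)]  # x in [0, 1]
client_b = [(x / 4, 2 * (x / 4) + 1) for x in range(4, 9)]  # x in [1, 2]

# Stand-in for samples from a generative model that follows the collective
# distribution of all clients (here simply written out by hand).
synthetic = [(x / 4, 2 * (x / 4) + 1) for x in range(0, 9)]

# Pre-train the global model on the synthetic data, then run federated training.
pretrained = local_update((0.0, 0.0), synthetic)
final_weights = train_federated([client_a, client_b], pretrained)
```

In this sketch the raw client data never leaves `local_update`; only the weight tuples are passed to `fed_avg`. Pre-training changes nothing but the starting point handed to `train_federated`, which is the role the abstract gives to the synthetic data.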

Files

Research_paper.pdf
(pdf | 0.613 Mb)
License info not available