Generative Federated Learning Approaches for Non-IID Data
Enhancing Federated Models with Synthetic Data
P.K. Cho (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Swier Garst – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
David M. J. Tax – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
A. Voulimeneas – Graduation committee member (TU Delft - Cyber Security)
Abstract
Federated Learning (FL) is a machine learning approach that has gained considerable interest in recent years. FL trains a global model without compromising the privacy of the clients' training data: the global model is sent to each client, which updates the weights locally, and only the learned weights are propagated back to a central server. However, the approach has limitations, as several challenges hinder model performance. One of these is non-IID (not Independent and Identically Distributed) training data. Most real-world data is non-IID, and this imbalance in data distribution has been shown to significantly degrade model performance. To address this issue, we propose a generative federated learning approach that pre-trains the global model on synthetic data produced by a generative model fitted to the collective distribution of all clients' training datasets. Our results show that this approach bridges the performance gap between IID and non-IID settings in FL, except in certain extreme non-IID cases.
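The training loop described above can be sketched as follows. This is a minimal illustration, not the thesis's actual method: the logistic-regression client model, the two-client non-IID split, and all function names are assumptions made for the example. The key idea shown is that the server initializes the global weights by pre-training on synthetic data drawn from the pooled distribution before federated averaging begins.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, data, labels, lr=0.1, epochs=5):
    """One client's local training: logistic-regression gradient steps
    on its private data. Only the resulting weights leave the client."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-(data @ w)))
        grad = data.T @ (preds - labels) / len(labels)
        w -= lr * grad
    return w

def fed_avg(global_w, client_datasets, rounds=5):
    """Federated averaging: each round, every client trains locally and
    the server averages the returned weights (weighted by dataset size)."""
    w = global_w
    for _ in range(rounds):
        updates = [local_update(w, X, y) for X, y in client_datasets]
        sizes = [len(y) for _, y in client_datasets]
        w = np.average(updates, axis=0, weights=sizes)
    return w

def make_client(label, n=100):
    """Hypothetical non-IID client: holds samples of a single class only."""
    X = rng.normal(loc=2.0 * label - 1.0, scale=1.0, size=(n, 2))
    y = np.full(n, float(label))
    return X, y

clients = [make_client(0), make_client(1)]  # extreme label skew

# Pre-train the global model on synthetic data mimicking the *pooled*
# distribution (here simulated by sampling both classes), as a stand-in
# for samples from a generative model.
X_syn = np.vstack([make_client(0, 50)[0], make_client(1, 50)[0]])
y_syn = np.concatenate([np.zeros(50), np.ones(50)])
w0 = local_update(np.zeros(2), X_syn, y_syn, epochs=20)

w_final = fed_avg(w0, clients)
```

In a real deployment the generative model would be trained under the same privacy constraints and its samples would replace `X_syn`/`y_syn`; the pre-trained `w0` simply gives federated averaging a starting point that already reflects the overall data distribution.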