Analysing the Performance of Generative Models Trained in a Federated Manner

Exploring the Impact of GANs and Variational Auto-Encoders on Decentralized Data

Bachelor Thesis (2024)
Author(s)

A.N. Ojică (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

S.J.F. Garst – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

David M.J. Tax – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Alexios Voulimeneas – Graduation committee member (TU Delft - Cyber Security)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
26-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project, Generative Federated Learning Approaches for Non-IID Data
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Federated learning (FL) is a machine learning approach that trains a model across multiple decentralized devices or servers without sharing local data, which preserves privacy while still making use of decentralized data. A significant challenge in FL is handling non-IID data (data that is not independent and identically distributed across clients), which can degrade model performance. This paper investigates how federated training affects the performance of generative models, including Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), on image and tabular data generation tasks, and compares the results against centralized training. We evaluate the models using several metrics: classifier accuracy on generated images; Earth Mover’s Distance (EMD) for distribution comparison; and resemblance, discriminability, downstream utility, and privacy metrics for tabular data. Experiments on the MNIST and CIFAR-10 datasets for image generation, and on the Adult and Abalone datasets for tabular data generation, show that VAEs perform robustly and consistently across federated and centralized setups, whereas GANs degrade significantly under federated non-IID conditions. The results indicate that VAEs can help address the non-IID data challenge in FL by generating high-quality synthetic data, which improves model generalizability and stability. The framework used to run the experiments in this study is available at https://github.com/alexojica/research-project-experiments.
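
As an illustration of two ideas named in the abstract, the sketch below shows one common way a federated server can combine client models (a size-weighted parameter average in the style of FedAvg) and one simple way to compute Earth Mover's Distance between two one-dimensional samples. The abstract does not state which aggregation rule or EMD implementation the thesis uses, so both are assumptions, and the function names `fed_avg` and `emd_1d` are hypothetical helpers introduced only for this example.

```python
from typing import List, Sequence


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local dataset size.

    Assumption: FedAvg-style aggregation; the thesis may use another rule.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one size per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    # Each coordinate of the global model is the size-weighted mean
    # of that coordinate over all clients.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def emd_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Earth Mover's Distance between two equal-size 1-D samples.

    For equal-size samples with uniform mass, the optimal transport plan
    matches sorted values, so EMD is the mean absolute sorted difference.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Two clients holding 1 and 3 local examples respectively.
global_weights = fed_avg([[0.0, 2.0], [4.0, 6.0]], [1, 3])

# A "real" sample and a "generated" sample shifted by one unit.
distance = emd_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

Here the client with more data pulls the global parameters toward its own, which is one reason skewed (non-IID) client data can distort the aggregated model. The EMD helper only covers the one-dimensional, equal-size case; comparing image or multi-column tabular distributions would need a more general formulation.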

Files

Research_paper.pdf
(pdf | 2.14 MB)
License info not available