Analysing the Performance of Generative Models Trained in a Federated Manner
Exploring the Impact of GANs and Variational Auto-Encoders on Decentralized Data
A.N. Ojică (TU Delft - Electrical Engineering, Mathematics and Computer Science)
S.J.F. Garst – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
David M.J. Tax – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Alexios Voulimeneas – Graduation committee member (TU Delft - Cyber Security)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Federated learning (FL) is a machine learning approach that trains a model across multiple decentralized devices or servers without sharing their local data, which preserves privacy while still making use of that data. A significant challenge in FL is handling non-IID data (data that is not independent and identically distributed across clients), which can adversely affect performance. This paper investigates how federated training affects the performance of generative models, specifically Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), on image and tabular data generation tasks. Our study aims to determine how these models perform when trained in a federated manner compared to centralized training. For image data, we evaluate the models using classifier accuracy on generated images and Earth Mover’s Distance (EMD) for distribution comparison; for tabular data, we use resemblance, discriminability, downstream utility, and privacy metrics. Experiments on the MNIST and CIFAR-10 datasets for image generation, and on the Adult and Abalone datasets for tabular data generation, reveal that VAEs exhibit robust and consistent performance across federated and centralized setups. In contrast, GANs show significant performance degradation under federated non-IID conditions. The results indicate that VAEs can effectively address the non-IID data challenge in FL by generating high-quality synthetic data, thereby enhancing model generalizability and stability. The framework used for executing the experiments in this study can be found at https://github.com/alexojica/research-project-experiments.
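As an illustration of two ideas named in the abstract, the following minimal sketch shows how a server might combine client model parameters and how EMD can be computed between two one-dimensional histograms. It is not taken from the linked repository: the function names fed_avg and emd_1d are hypothetical, sample-size-weighted averaging (FedAvg-style) is an assumed aggregation rule that the abstract does not specify, and the EMD shown covers only histograms over the same unit-spaced bins.

```python
from typing import Dict, List, Sequence


def fed_avg(client_weights: List[Dict[str, List[float]]],
            client_sizes: Sequence[int]) -> Dict[str, List[float]]:
    """Average client parameters, weighting each client by its sample count.

    Assumption: FedAvg-style aggregation; the study may use a different rule.
    """
    total = sum(client_sizes)
    averaged = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


def emd_1d(p: Sequence[float], q: Sequence[float]) -> float:
    """Earth Mover's Distance between two histograms over the same unit-spaced bins.

    In one dimension, EMD equals the summed absolute difference of the two
    cumulative distributions.
    """
    p_total, q_total = sum(p), sum(q)
    cdf_gap = 0.0
    distance = 0.0
    for p_i, q_i in zip(p, q):
        cdf_gap += p_i / p_total - q_i / q_total  # running CDF difference
        distance += abs(cdf_gap)
    return distance


# Two hypothetical clients; the second holds three times as much data.
global_params = fed_avg([{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}], [1, 3])

# Label histograms of two clients with fully disjoint classes (strong label skew).
skew = emd_1d([1, 0, 0], [0, 0, 1])
```

In this toy setting the larger client dominates the averaged parameters, and the EMD between the two label histograms grows with how far apart the clients' classes lie, which gives one simple way to quantify label skew between clients.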