Federated Synthetic Data Generation with Stronger Security Guarantees
Ali Reza Ghavamipour (University Medical Center Groningen)
Fatih Turkmen (University Medical Center Groningen)
Rui Wang (Student TU Delft)
K. Liang (TU Delft - Cyber Security)
Abstract
Synthetic data generation plays a crucial role in many areas where data is scarce and privacy or confidentiality is a significant concern. Generative Adversarial Networks (GANs), arguably one of the most widely used data synthesis techniques, train a model (the generator) to produce realistic data by playing a min-max game against a discriminator model. When multiple organizations are reluctant to share their sensitive data, GAN models can be trained in a federated manner, commonly with the use of differential privacy (DP). To achieve a reasonable level of model utility, however, DP must give up some privacy, which leaves the model vulnerable to various attacks (e.g., membership inference attacks). In this paper, we propose a hybrid solution, PP-FedGAN, for the asynchronous, federated, privacy-preserving training of GAN models that combines the CKKS homomorphic encryption (HE) scheme with differential privacy. The addition of HE results in around 10 seconds of overhead on the client side per round and 115 seconds over the entire training procedure. We also analyze the security of PP-FedGAN under the honest-but-curious security model. Where stronger security guarantees are required, our proposal presents a better alternative to solutions that employ only DP.