Federated Synthetic Data Generation with Stronger Security Guarantees

Conference Paper (2023)

Author(s)

Ali Reza Ghavamipour (University Medical Center Groningen)

Fatih Turkmen (University Medical Center Groningen)

Rui Wang (Student TU Delft)

K. Liang (TU Delft - Cyber Security)

Research Group

Cyber Security

DOI related publication

https://doi.org/10.1145/3589608.3593835

Keywords

Federated learning, Homomorphic encryption, Synthetic data, Differential privacy, GAN

To reference this document use:

https://resolver.tudelft.nl/uuid:2cba0f7e-b1ae-4e69-a0f1-153df866891c

Publication Year

2023

Language

English

Pages (from-to)

31-42

ISBN (electronic)

979-8-4007-0173-3

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Synthetic data generation plays a crucial role in many areas where data is scarce and privacy or confidentiality is a significant concern. Generative Adversarial Networks (GANs), arguably one of the most widely used data synthesis techniques, train a model (the generator) that produces realistic data by playing a min-max game against a discriminator model. When multiple organizations are reluctant to share their sensitive data, GAN models can be trained in a federated manner, commonly with the use of differential privacy (DP). To achieve a reasonable level of model utility, however, DP must trade away some privacy, which leaves the model vulnerable to various attacks (e.g., membership inference attacks). In this paper, we propose PP-FedGAN, a hybrid solution for the asynchronous, federated, privacy-preserving training of GAN models that combines the CKKS homomorphic encryption (HE) scheme with differential privacy. The addition of HE results in around 10 seconds of overhead on the client side per round and 115 seconds over the entire training procedure. We also analyze the security of PP-FedGAN under the honest-but-curious security model. Where stronger security guarantees are required, our proposal presents a better alternative to solutions that employ only DP.
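As a rough illustration of the DP side of such a pipeline, the sketch below clips each client's model update, adds Gaussian noise, and averages the results on the server. It is not PP-FedGAN itself: the function names, the clipping bound, and the noise level are hypothetical, the noise is not calibrated to any privacy budget, and the CKKS encryption step is omitted entirely. The aggregation uses only additions and one scalar division, which is the kind of arithmetic an HE scheme such as CKKS can evaluate on ciphertexts.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale the update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def privatize_update(update, max_norm, noise_std, rng):
    """Clip the update, then add independent Gaussian noise per coordinate.

    noise_std is an illustrative value, not derived from an (epsilon, delta) target.
    """
    clipped = clip_update(update, max_norm)
    return [x + rng.gauss(0.0, noise_std) for x in clipped]


def aggregate(updates):
    """Coordinate-wise mean of the client updates (plaintext stand-in for
    what a server would compute over encrypted updates)."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


# Hypothetical two-parameter updates from three clients.
rng = random.Random(0)
client_updates = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
noisy_updates = [
    privatize_update(u, max_norm=1.0, noise_std=0.1, rng=rng)
    for u in client_updates
]
global_update = aggregate(noisy_updates)
print(global_update)
```

In a hybrid design, each noisy update would be encrypted on the client before upload, so that an honest-but-curious server only ever handles ciphertexts and the aggregate.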

Files

3589608.3593835.pdf

(pdf | 2.25 Mb)