A Survey of Two Open Problems of Privacy-Preserving Federated Learning: Vertically Partitioned Data and Verifiability

Abstract

Federated learning (FL) is a machine learning technique proposed by Google AI in 2016 as a response to data-protection regulations such as the GDPR, which made classical centralized training not only impractical but, in some cases, illegal. Despite its potential, FL has not gained much trust in the community, largely because of its susceptibility to data-privacy attacks. Early solutions to this problem were homomorphic encryption and differential privacy. These techniques, however, do not address other open problems in the field, such as the difficulty of performing FL on vertically partitioned data, ensuring aggregation verifiability, and resilience to user dropout. Fortunately, new strategies have been developed in recent years. This paper provides a comprehensive study of state-of-the-art privacy-preserving techniques aimed at two of these problems: vertical federated learning environments and aggregation verifiability. To this end, we study FedV, SecureBoost, MP-FEDXGB, VerifyNet, and VFL, together with a verifiability approach exploiting bilinear aggregate signatures, analysing their security models, computational complexity and communication overhead, accuracy impact, and benefits and downsides in a comparative manner.