Individual Fairness Guarantees for Neural Networks
Elias Benussi (University of Oxford)
Andrea Patane (University of Oxford)
Matthew Wicker (University of Oxford)
Luca Laurenti (TU Delft - Team Luca Laurenti)
Marta Kwiatkowska (University of Oxford)
Abstract
We consider the problem of certifying the individual fairness (IF) of feed-forward neural networks (NNs). In particular, we work with the ϵ-δ-IF formulation, which, given a NN and a similarity metric learnt from data, requires that the output difference between any pair of ϵ-similar individuals is bounded by a maximum decision tolerance δ ≥ 0. Working with a range of metrics, including the Mahalanobis distance, we propose a method to over-approximate the resulting optimisation problem using piecewise-linear functions that lower and upper bound the NN's non-linearities globally over the input space. We encode this computation as the solution of a Mixed-Integer Linear Programming problem and demonstrate that it can be used to compute IF guarantees on four datasets widely used for fairness benchmarking. We also show how this formulation can be used to encourage fairness at training time by modifying the NN loss, and confirm empirically that our approach yields NNs that are orders of magnitude fairer than those obtained with state-of-the-art methods.
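To make the ϵ-δ-IF condition concrete, the following is a minimal Python sketch that samples pairs of ϵ-similar inputs under a Mahalanobis metric and records the largest output difference found. It is not the certification method described in the abstract: sampling only gives a lower bound on the smallest admissible δ, whereas the MILP formulation is meant to give a guaranteed upper bound. The network `tiny_relu_net`, its weights, the metric matrix `S`, and the helper `max_sampled_gap` are all hypothetical and chosen purely for illustration.

```python
import math
import random


def mahalanobis(x, y, S):
    """Return sqrt((x - y)^T S (x - y)) for a symmetric PSD matrix S."""
    d = [a - b for a, b in zip(x, y)]
    Sd = [sum(S[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return math.sqrt(sum(d[i] * Sd[i] for i in range(len(d))))


def tiny_relu_net(x):
    """Hypothetical 2-2-1 ReLU network with hand-picked weights (assumption)."""
    h1 = max(0.0, 1.0 * x[0] - 0.5 * x[1])
    h2 = max(0.0, 0.3 * x[0] + 0.8 * x[1])
    return 0.6 * h1 - 0.4 * h2


def max_sampled_gap(f, S, eps, n_pairs=2000, seed=0):
    """Largest |f(x) - f(y)| over sampled pairs with d_S(x, y) <= eps.

    This is an empirical lower bound on the smallest delta for which
    eps-delta-IF could hold; it certifies nothing.
    """
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_pairs):
        x = [rng.uniform(0.0, 1.0) for _ in range(2)]
        # Propose y in a box around x, then keep it only if it is eps-similar.
        y = [xi + rng.uniform(-eps, eps) for xi in x]
        if mahalanobis(x, y, S) <= eps:
            worst = max(worst, abs(f(x) - f(y)))
    return worst


# Illustrative metric: differences in the first feature count twice as much.
S = [[4.0, 0.0], [0.0, 1.0]]
eps = 0.1
gap = max_sampled_gap(tiny_relu_net, S, eps)
delta = 0.2
print("largest sampled output gap:", gap)
print("counterexample to eps-delta-IF found:", gap > delta)
```

If the sampled gap exceeds δ, the pair that produced it is a concrete counterexample to ϵ-δ-IF. If it does not, nothing is proved, which is the gap that a sound over-approximation of the optimisation problem is intended to close.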