Consistency and Finite Sample Behavior of Binary Class Probability Estimation

More Info
expand_more

Abstract

We investigate to which extent one can recover class probabilities within the empirical risk minimization (ERM) paradigm. We extend existing results and emphasize the tight relations between empirical risk minimization and class probability estimation. Following previous literature on excess risk bounds and proper scoring rules, we derive a class probability estimator based on empirical risk minimization. We then derive conditions under which this estimator will converge with high probability to the true class probabilities with respect to the L1-norm. One of our core contributions is a novel way to derive finite sample L1-convergence rates of this estimator for different surrogate loss functions. We also study in detail which commonly used loss functions are suitable for this estimation problem and briefly address the setting of model-misspecification.