Active Learning by Discrepancy Minimization

A Comparison of Active Learning Methods Motivated by Generalization Bounds

Abstract

In many practical settings, labeled data is expensive to obtain while unlabeled data is abundant. This is problematic if one wants to train accurate (supervised) predictive models. The main idea behind active learning is that a model can perform better with less labeled data if it is allowed to choose the data from which it learns. Active learning algorithms propose which unlabeled objects should be queried for their labels, with the goal of improving the predictive model as much as possible with as few queries as possible. We study active learners that minimize generalization bounds. We introduce a novel active learning algorithm, motivated by the fact that its generalization bound is tighter than the one used by the state-of-the-art MMD active learning algorithm. Surprisingly, we find that this active learner performs worse in terms of mean squared error. We show that tighter generalization bounds do not guarantee improved active learning performance in general, and we characterize which properties of generalization bounds are relevant for performance instead. Inspired by these insights, we introduce a second novel active learning algorithm, which minimizes a looser generalization bound and compares favorably with the MMD active learner. It also compares favorably with the state-of-the-art TED active learner in the case of model misspecification. Our analysis of which properties of generalization bounds are favorable for active learning is novel and can be applied to construct improved active learning algorithms in the future. It is likely also applicable to the closely related fields of sample bias correction, transfer learning, and domain adaptation.
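
To give a rough flavor of the kind of selection criterion the abstract refers to, the sketch below greedily picks unlabeled pool points whose kernel mean embedding best matches that of the whole pool, i.e. it minimizes a squared MMD term. This is a minimal illustration under assumptions (an RBF kernel, a fixed bandwidth `gamma`, and a simple greedy strategy), not the exact algorithm studied in this work.

```python
# Hypothetical sketch: greedy MMD-style selection of query points.
# Assumptions (not from the paper): RBF kernel, fixed gamma, greedy selection.
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Pairwise RBF kernel matrix between rows of X and rows of Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def greedy_mmd_selection(X_pool, n_queries, gamma=1.0):
    """Greedily pick indices whose kernel mean embedding matches the pool,
    i.e. approximately minimize the squared MMD between selection and pool."""
    K = rbf_kernel(X_pool, X_pool, gamma)   # full kernel matrix over the pool
    pool_mean = K.mean(axis=1)              # mean kernel value of each point vs. pool
    selected = []
    for _ in range(n_queries):
        best_idx, best_score = None, np.inf
        for i in range(len(X_pool)):
            if i in selected:
                continue
            cand = selected + [i]
            Kss = K[np.ix_(cand, cand)]
            # Squared MMD between candidate set and pool (constant pool term dropped).
            score = Kss.mean() - 2.0 * pool_mean[cand].mean()
            if score < best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
    return selected

# Example usage: choose 10 points to query from a random 200-point pool.
rng = np.random.default_rng(0)
X_pool = rng.normal(size=(200, 5))
query_indices = greedy_mmd_selection(X_pool, n_queries=10)
```

The greedy loop is only one possible way to minimize such a bound; the point of the sketch is that the selection objective depends on the unlabeled inputs alone, which is what allows label queries to be chosen before any labels are observed.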