Nuclear discrepancy for single-shot batch active learning

Journal Article (2019)
Author(s)

Tom Viering (TU Delft - Pattern Recognition and Bioinformatics)

Jesse Krijthe (TU Delft - Pattern Recognition and Bioinformatics)

Marco Loog (TU Delft - Pattern Recognition and Bioinformatics)

Research Group
Pattern Recognition and Bioinformatics
Copyright
© 2019 T.J. Viering, J.H. Krijthe, M. Loog
DOI related publication
https://doi.org/10.1007/s10994-019-05817-y
Publication Year
2019
Language
English
Issue number
8-9
Volume number
108
Pages (from-to)
1561-1599
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Active learning algorithms propose which data should be labeled given a pool of unlabeled data. Instead of selecting data to annotate at random, active learning strategies aim to select data so as to obtain a good predictive model with as few labeled samples as possible. Single-shot batch active learners select all samples to be labeled in a single step, before any labels are observed. We study single-shot active learners that minimize generalization bounds to select a representative sample, such as the maximum mean discrepancy (MMD) active learner. We prove that a related bound, the discrepancy, provides a tighter worst-case bound. We study these bounds probabilistically, which inspires us to introduce a novel bound, the nuclear discrepancy (ND). The ND bound is tighter for the expected loss under optimistic probabilistic assumptions. Our experiments show that the MMD active learner performs better than the discrepancy in terms of the mean squared error, indicating that tighter worst-case bounds do not imply better active learning performance. The proposed active learner improves significantly upon the MMD and discrepancy in the realizable setting, and a similar trend is observed in the agnostic setting, showing the benefits of a probabilistic approach to active learning. Our study highlights that the assumptions underlying generalization bounds can be as important as bound tightness when it comes to active learning performance. Code for reproducing our experimental results can be found at https://github.com/tomviering/NuclearDiscrepancy.
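
To make the idea of single-shot batch selection via the MMD concrete, the following is a minimal illustrative sketch, not the authors' implementation (their code is at the GitHub link above). It greedily grows a batch whose empirical distribution has small squared MMD to the full unlabeled pool under an RBF kernel; the function names, the `gamma` kernel width, and the `budget` parameter are assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF (Gaussian) kernel matrix between rows of X and Y.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def mmd_squared(K_pool, K_pool_batch, K_batch):
    # Squared MMD between the empirical pool distribution and the batch.
    return K_pool.mean() - 2 * K_pool_batch.mean() + K_batch.mean()

def greedy_mmd_batch(X_pool, budget, gamma=1.0):
    """Greedily pick `budget` indices from X_pool whose empirical
    distribution has small MMD to the full pool (single-shot batch)."""
    n = X_pool.shape[0]
    K = rbf_kernel(X_pool, X_pool, gamma)  # full pool kernel matrix
    selected, remaining = [], list(range(n))
    for _ in range(budget):
        best_i, best_val = None, np.inf
        for i in remaining:
            cand = selected + [i]
            val = mmd_squared(K, K[:, cand], K[np.ix_(cand, cand)])
            if val < best_val:
                best_i, best_val = i, val
        selected.append(best_i)
        remaining.remove(best_i)
    return selected

# Example: choose 10 representative points from a random pool,
# then send only those indices out for labeling.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
print(greedy_mmd_batch(X, budget=10, gamma=0.5))
```

Because all candidates are chosen before any labels are observed, the criterion depends only on the unlabeled data, which is exactly the single-shot setting studied in the paper; the discrepancy and nuclear discrepancy variants would replace the selection criterion, not this overall structure.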