Sample reusability in importance-weighted active learning

Abstract

Recent advances in importance-weighted active learning solve many of the problems of traditional active learning strategies. But does importance-weighted active learning also produce a reusable sample selection? This thesis explains why reusability can be a problem, how importance-weighted active learning removes some of the barriers to reusability, and which obstacles still remain. With theoretical arguments and practical demonstrations, it argues that universal reusability is impossible: because every active learning strategy must undersample some areas of the sample space, classifiers that depend on samples from those areas will learn more from a random sample selection. Several reusability experiments with importance-weighted active learning show the impact of this problem in practice. The experiments confirm that universal reusability does not exist, although in some cases, on some datasets and with some pairs of classifiers, the selected samples are reusable. Finally, the thesis explores the conditions that could guarantee sample reusability between two classifiers.
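
The following is a minimal, hypothetical sketch of the general mechanism behind importance-weighted active learning referred to above, not the thesis' own algorithm: each point is labelled with some probability p and, if labelled, receives importance weight 1/p, so that weighted statistics over the selection remain (approximately) unbiased estimates of full-sample statistics despite the undersampling. The function and rule names (`iw_active_selection`, `query_prob`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def iw_active_selection(X, y, query_prob):
    """Select samples by biased coin flips and attach importance weights.

    query_prob(x) -> probability in (0, 1] that x gets labelled; a hypothetical
    stand-in for whatever uncertainty-based rule the active learner uses.
    """
    selected, labels, weights = [], [], []
    for x, label in zip(X, y):
        p = query_prob(x)
        if rng.random() < p:          # query the label with probability p
            selected.append(x)
            labels.append(label)
            weights.append(1.0 / p)   # importance weight corrects the sampling bias
    return np.array(selected), np.array(labels), np.array(weights)

# Toy data: one feature, label depends on its sign.
X = rng.normal(size=2000)
y = (X > 0).astype(int)

# Hypothetical query rule: points near the decision boundary are queried more often.
prob = lambda x: max(0.05, 1.0 - min(abs(x), 1.0))

Xs, ys, ws = iw_active_selection(X, y, prob)

# The self-normalized weighted mean approximately recovers the full-sample mean,
# even though the selection heavily undersamples points far from the boundary.
print("full-sample mean label:   ", y.mean())
print("weighted selection mean:  ", np.average(ys, weights=ws))
print("unweighted selection mean:", ys.mean())
```

The unweighted mean of the selected labels is distorted toward the oversampled region near the boundary, while the weighted mean is not; this bias correction is what lets importance-weighted strategies remove some, though not all, of the barriers to reusing the selected samples with a different classifier.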