Sample reusability in importance-weighted active learning

Abstract

Recent advances in importance-weighted active learning solve many of the problems of traditional active learning strategies. But does importance-weighted active learning also produce a reusable sample selection? This thesis explains why reusability can be a problem, how importance-weighted active learning removes some of the barriers to reusability, and which obstacles still remain. With theoretical arguments and practical demonstrations, it argues that universal reusability is impossible: because every active learning strategy must undersample some areas of the sample space, classifiers that depend on samples from those areas will learn more from a random sample selection. Several reusability experiments with importance-weighted active learning show the impact of this problem in practice. The experiments confirm that universal reusability does not exist, although in some cases, on some datasets and with some pairs of classifiers, the selected samples are reusable. Finally, the thesis explores the conditions that could guarantee sample reusability between two classifiers.
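
The following is a minimal, hypothetical sketch of the general mechanism behind importance-weighted active learning referred to above, not the thesis' own algorithm: each point is labelled with some probability p and, if labelled, receives importance weight 1/p, so that weighted statistics over the selection remain (approximately) unbiased estimates of full-sample statistics despite the undersampling. The function and rule names (`iw_active_selection`, `query_prob`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def iw_active_selection(X, y, query_prob):
    """Select samples by biased coin flips and attach importance weights.

    query_prob(x) -> probability in (0, 1] that x gets labelled; a hypothetical
    stand-in for whatever uncertainty-based rule the active learner uses.
    """
    selected, labels, weights = [], [], []
    for x, label in zip(X, y):
        p = query_prob(x)
        if rng.random() < p:          # query the label with probability p
            selected.append(x)
            labels.append(label)
            weights.append(1.0 / p)   # importance weight corrects the sampling bias
    return np.array(selected), np.array(labels), np.array(weights)

# Toy data: one feature, label depends on its sign.
X = rng.normal(size=2000)
y = (X > 0).astype(int)

# Hypothetical query rule: points near the decision boundary are queried more often.
prob = lambda x: max(0.05, 1.0 - min(abs(x), 1.0))

Xs, ys, ws = iw_active_selection(X, y, prob)

# The self-normalized weighted mean approximately recovers the full-sample mean,
# even though the selection heavily undersamples points far from the boundary.
print("full-sample mean label:   ", y.mean())
print("weighted selection mean:  ", np.average(ys, weights=ws))
print("unweighted selection mean:", ys.mean())
```

The unweighted mean of the selected labels is distorted toward the oversampled region near the boundary, while the weighted mean is not; this bias correction is what lets importance-weighted strategies remove some, though not all, of the barriers to reusing the selected samples with a different classifier.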