Evaluating the Effectiveness of Importance Weighting Techniques in Mitigating Sample Selection Bias

Abstract

Importance weighting is a class of domain adaptation techniques for machine learning that aims to correct the distributional discrepancy between the training and test datasets, often caused by sample selection bias. To do so, it frequently uses unlabeled data from the test set. However, this approach has certain drawbacks: it requires retraining for each new test set and fails when the number of test samples is very small. We therefore study the performance of importance weighting techniques when the unlabeled data comes from an underlying domain rather than from one specific test set. We propose an evaluation framework inspired by scenarios traditionally known to pose difficulties for importance weighting and apply it to two popular algorithms, KMM and KLIEP. Our results reveal that both algorithms produce statistically significant classification improvements in most experiments. However, their performance is highly dependent on the characteristics of the dataset and of the sampling bias. In particular, class overlap appears to influence adaptation ability when the conditional probabilities of the source and target domains differ, while the "intensity" of the sampling bias is an important confounding factor when the training set is small.
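
For readers unfamiliar with the technique, the sketch below illustrates the importance-weighting idea using Kernel Mean Matching (KMM) in its standard formulation (Huang et al.): training samples receive weights so that the weighted training distribution matches the unlabeled target sample in an RBF kernel feature space, and the weights are then passed to any classifier that accepts per-sample weights. This is a minimal, generic illustration, not the evaluation code used in this work; the kernel width sigma, the bound B, and the slack eps are illustrative defaults, not tuned values.

```python
# Minimal Kernel Mean Matching (KMM) sketch: estimate importance weights for
# training samples so that the weighted training distribution matches the
# (unlabeled) target sample in an RBF kernel feature space.
# sigma, B and eps below are illustrative defaults, not tuned values.
import numpy as np
from cvxopt import matrix, solvers
from sklearn.metrics.pairwise import rbf_kernel

solvers.options["show_progress"] = False


def kmm_weights(X_train, X_target, sigma=1.0, B=10.0, eps=None):
    n_tr, n_te = len(X_train), len(X_target)
    if eps is None:
        eps = B / np.sqrt(n_tr)          # common heuristic for the slack term
    gamma = 1.0 / (2.0 * sigma ** 2)

    # Quadratic term: kernel matrix over the training set (+ small ridge for PSD).
    K = rbf_kernel(X_train, X_train, gamma=gamma) + 1e-8 * np.eye(n_tr)
    # Linear term: kernel evaluations against the target sample (mean matching).
    kappa = (n_tr / n_te) * rbf_kernel(X_train, X_target, gamma=gamma).sum(axis=1)

    # Constraints 0 <= beta_i <= B and |sum(beta) - n_tr| <= n_tr * eps,
    # written as G beta <= h for the cvxopt QP solver.
    G = np.vstack([np.ones((1, n_tr)), -np.ones((1, n_tr)),
                   np.eye(n_tr), -np.eye(n_tr)])
    h = np.hstack([n_tr * (1 + eps), -n_tr * (1 - eps),
                   B * np.ones(n_tr), np.zeros(n_tr)])

    sol = solvers.qp(matrix(K), matrix(-kappa), matrix(G), matrix(h))
    return np.array(sol["x"]).ravel()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_tr = rng.normal(0.0, 1.0, size=(200, 2))   # biased source sample
    X_te = rng.normal(0.5, 1.0, size=(200, 2))   # shifted (unlabeled) target sample
    w = kmm_weights(X_tr, X_te)
    # e.g. LogisticRegression().fit(X_tr, y_tr, sample_weight=w)
```

The resulting weights up-weight training points that fall in regions the target distribution favours; reusing them with a standard classifier is what allows adaptation without target labels.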