Effects of sampling skewness of the importance-weighted risk estimator on model selection

Conference Paper (2018)

Author(s)

Wouter Kouw (TU Delft - Electrical Engineering, Mathematics and Computer Science, Netherlands eScience Center)

Marco Loog (University of Copenhagen, TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group

Pattern Recognition and Bioinformatics

DOI related publication

https://doi.org/10.1109/ICPR.2018.8546186 (final published version)

To reference this document use

https://resolver.tudelft.nl/uuid:707b7230-1575-41ad-8d17-06985e2e2bc9

Publication Year

2018

Language

English

Pages (from-to)

1468-1473

ISBN (print)

978-1-5386-3789-0

ISBN (electronic)

978-1-5386-3788-3

Event

2018 24th International Conference on Pattern Recognition, ICPR 2018 (2018-08-20 - 2018-08-24), Beijing, China

Abstract

Importance weighting is a popular and well-researched technique for dealing with sample selection bias and covariate shift. It has desirable characteristics such as unbiasedness, consistency and low computational complexity. However, weighting can also have a detrimental effect on an estimator. In this work, we empirically show that the sampling distribution of an importance-weighted estimator can be skewed. In sample selection bias settings with small sample sizes, the importance-weighted risk estimator produces overestimates for data sets in the body of the sampling distribution, i.e., the majority of cases, and large underestimates for data sets in the tail of the sampling distribution. These over- and underestimates of the risk lead to sub-optimal regularization parameters when the estimator is used for importance-weighted validation.
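To make the notion of a skewed sampling distribution concrete, the following is a minimal sketch and not the experiment from the paper. It assumes zero-mean Gaussian source and target distributions with known densities, a toy squared loss, and hypothetical helper names (`gaussian_pdf`, `importance_weighted_risk`, `sample_skewness`, `simulate`). It draws many small source samples, computes the importance-weighted risk estimate on each, and summarizes the resulting sampling distribution. The sign and size of the skew in this toy setup need not match the results reported in the paper.

```python
import numpy as np


def gaussian_pdf(x, sigma):
    """Density of a zero-mean normal distribution with standard deviation sigma."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def importance_weighted_risk(x, loss, sigma_source, sigma_target):
    """Average loss on source samples, reweighted by p_target(x) / p_source(x)."""
    weights = gaussian_pdf(x, sigma_target) / gaussian_pdf(x, sigma_source)
    return np.mean(weights * loss(x))


def sample_skewness(values):
    """Third standardized moment of a sample."""
    centered = values - values.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5


def simulate(n_samples=20, n_repeats=5000, sigma_source=0.9, sigma_target=1.0, seed=0):
    """Collect importance-weighted risk estimates over many small source samples."""
    rng = np.random.default_rng(seed)

    def loss(x):
        # Toy loss (an assumption): its true target risk equals sigma_target ** 2.
        return x ** 2

    estimates = np.empty(n_repeats)
    for r in range(n_repeats):
        x = rng.normal(0.0, sigma_source, size=n_samples)
        estimates[r] = importance_weighted_risk(x, loss, sigma_source, sigma_target)
    return estimates


estimates = simulate()
print("mean of estimates:  ", estimates.mean())
print("median of estimates:", np.median(estimates))
print("sample skewness:    ", sample_skewness(estimates))
```

Comparing the mean with the median shows whether the typical data set yields an estimate above or below the long-run average, and the skewness statistic quantifies the asymmetry. In practice the weights usually have to be estimated, which this sketch sidesteps by using the true densities.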