The Effect of Domain Shift on Learning Curve Extrapolation
M. Soeters (TU Delft - Electrical Engineering, Mathematics and Computer Science)
T.J. Viering – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
C. Yan – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
S. Mukherjee – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Matthijs T.J. Spaan – Graduation committee member (TU Delft - Sequential Decision Making)
Abstract
Domain shift occurs when the distribution of the data a model is trained on differs from the distribution it is tested or deployed on. This can arise when the conditions during training differ slightly from those the model encounters at test time, and it poses a problem for the model's generalizability. Learning curves are widely used in machine learning to predict how much data is needed to train a model. This paper explores how domain shift affects learning curve extrapolation using Learning Curve Prior-Fitted Networks. We examine the effect of domain shift on model performance while comparing individual learners and groups of learners, showing that domain shift is relevant to learning curve extrapolation and has a statistically significant impact on the accuracy of such extrapolations. We also discuss how patterns such as well-behavedness influence this effect of domain shift, while showing that well-behavedness alone is not sufficient to predict it.
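To illustrate the basic task the paper studies, the sketch below fits a simple parametric learning curve to observed errors at small training sizes and extrapolates it to a larger size. The power-law form `err(n) = a * n**(-b)`, the synthetic data, and the fitting procedure are illustrative assumptions only; they are not the Learning Curve Prior-Fitted Networks used in the paper.

```python
import numpy as np

# Hypothetical sketch: extrapolate a power-law learning curve
# err(n) = a * n**(-b) by fitting a line in log-log space.
# The data and functional form are illustrative, not from the paper.

sizes = np.array([50, 100, 200, 400, 800])  # training set sizes
errors = 0.9 * sizes ** -0.4                # synthetic observed errors

# log err = log a - b * log n, so a linear fit recovers a and b
slope, intercept = np.polyfit(np.log(sizes), np.log(errors), 1)
a, b = np.exp(intercept), -slope

# Extrapolate to a training size 10x beyond the observed range
predicted_error = a * 8000 ** (-b)
```

Under domain shift, the errors observed at small training sizes come from a distribution that differs from the test distribution, so an extrapolation fitted this way can be systematically off; quantifying that degradation is the subject of the paper.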