Benchmarking the effectiveness of domain adaptation techniques in mitigating sample selection bias when leveraging the global domain
A.C. TOCIU (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Joana Gonçalves – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Y.I. Tepeli – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Sample selection bias is a widespread cause of distribution shift between training and test sets, which can significantly degrade the generalizability and performance of machine learning models. To mitigate distribution shift, numerous domain adaptation techniques have been developed that adapt the training set to the test set. However, adapting to a specific test set under sample selection bias may prevent the model from generalizing properly across the entire problem domain, and requires re-adaptation whenever the test data changes. We therefore propose a novel adaptation strategy, called global domain adaptation, in which we instead adapt to a larger (global) domain representative of the distribution from which both the training and test sets originate. We introduce a comprehensive benchmark, consisting of synthetic datasets and selection biases as well as complex bioinformatics datasets with intrinsic biases, to investigate the behavior and limitations of domain adaptation techniques when adapting to the global domain. Our benchmark reveals distinct performance patterns across categories of domain adaptation techniques: minimax estimators are very fragile in practice, while deep domain adaptation is less stable despite its increased architectural complexity. Lastly, we find that global domain adaptation is a viable approach for certain techniques, such as importance weighting, whereas semi-supervised techniques tend to perform best for conventional test set adaptation.
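To illustrate the importance-weighting idea mentioned in the abstract, the sketch below reweights a biased training sample toward a broader (global) distribution. It is a minimal toy example under assumed conditions: a one-dimensional setting with known Gaussian densities for both domains (the distributions, parameters, and closed-form density-ratio weights are illustrative assumptions, not the benchmark or method described in this work).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: the global domain is N(0, 1); sample selection bias
# shifts the training distribution to N(1, 1.5). The training sigma is
# chosen larger than the global sigma so the density ratio stays bounded.
global_mu, global_sigma = 0.0, 1.0
train_mu, train_sigma = 1.0, 1.5

x_train = rng.normal(train_mu, train_sigma, size=100_000)

def gauss_pdf(x, mu, sigma):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Importance weights: ratio of the global density to the biased
# training density, evaluated at each training point.
w = gauss_pdf(x_train, global_mu, global_sigma) / gauss_pdf(x_train, train_mu, train_sigma)

# The unweighted mean reflects the biased training domain (near 1),
# while the weighted mean approximates the global mean (near 0).
unweighted_mean = x_train.mean()
weighted_mean = np.average(x_train, weights=w)
```

In a realistic setting the two densities are unknown, so the ratio is typically estimated from data, for example with a probabilistic classifier that discriminates training samples from global-domain samples; the analytic ratio here simply keeps the sketch self-contained.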