Can Invariant Risk Minimization resist the temptation of learning spurious correlations?

Bachelor Thesis (2022)
Author(s)

J.A.E. van Lith (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Rickard Karlsson – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

S.R. Bongers – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

J.H. Krijthe – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Jochem van Lith
Publication Year
2022
Language
English
Graduation Date
24-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Learning algorithms can perform poorly in unseen environments when they learn spurious correlations. This is known as the out-of-domain (OOD) generalization problem. Invariant Risk Minimization (IRM) is a method that attempts to solve this problem by learning invariant relationships. Both motivating examples and counterexamples have been proposed regarding the performance of IRM. This work aims to clarify when the method works well and when it fails by testing its ability to learn invariant relationships. To this end, experiments are run on a synthetic data model that simulates four data distribution shifts: covariate shift (CS), confounder-based shift (CF), anti-causal shift (AC), and hybrid shift (HB). The experiments explore IRM’s behaviour with respect to hetero- and homoskedasticity and to changes in the training environments. We measure the error with respect to the optimal invariant predictor and compare it to that of the non-invariant Empirical Risk Minimization (ERM). The results show that IRM is generally able to learn invariance under the CS and CF shifts, especially when the deviation between the training environments is large. Under the AC and HB shifts, its ability to do so depends strongly on the values of the training environments.
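
To make the contrast between ERM and an invariant predictor concrete, the following is a minimal sketch and not the thesis's data model or code. It assumes a toy anti-causal structure (x1 causes y, y causes x2) with two training environments that differ only in noise variance, and it uses the gradient-based penalty of the IRMv1 formulation as a measure of invariance. The function names `make_env` and `irmv1_penalty`, the variances, and the sample size are all illustrative assumptions.

```python
import numpy as np

def make_env(var, n, rng):
    """Sample one hypothetical environment: x1 -> y -> x2."""
    x1 = rng.normal(0.0, np.sqrt(var), n)
    y = x1 + rng.normal(0.0, np.sqrt(var), n)
    x2 = y + rng.normal(0.0, 1.0, n)  # spurious feature: an effect of y
    return np.column_stack([x1, x2]), y

def irmv1_penalty(w, X, y):
    """Squared gradient of the squared-error risk w.r.t. a dummy scale s at s = 1."""
    f = X @ w
    grad = 2.0 * np.mean((f - y) * f)  # d/ds mean((s * f - y)^2) evaluated at s = 1
    return grad ** 2

rng = np.random.default_rng(0)
# Two training environments with assumed noise variances 0.25 and 4.0.
envs = [make_env(v, 100_000, rng) for v in (0.25, 4.0)]

# ERM: ordinary least squares on the pooled training data.
X_all = np.vstack([X for X, _ in envs])
y_all = np.concatenate([y for _, y in envs])
w_erm, *_ = np.linalg.lstsq(X_all, y_all, rcond=None)

# Invariant predictor for this toy model: use x1 only.
w_inv = np.array([1.0, 0.0])

erm_penalties = [irmv1_penalty(w_erm, X, y) for X, y in envs]
inv_penalties = [irmv1_penalty(w_inv, X, y) for X, y in envs]

print("ERM weights:", w_erm)
print("ERM penalty per environment:", erm_penalties)
print("Invariant penalty per environment:", inv_penalties)
```

In this sketch the pooled least-squares fit puts most of its weight on the spurious feature x2 and has a clearly non-zero penalty in each environment, while the predictor that uses only x1 has a penalty close to zero in both. The gap widens as the two environment variances move further apart, which is in line with the abstract's remark about the deviation between training environments.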

Files

Final_Paper.pdf
(pdf | 1.87 Mb)
License info not available