Can Invariant Risk Minimization resist the temptation of learning spurious correlations?

Bachelor thesis (2022)

Authors

J.A.E. van Lith Electrical Engineering, Mathematics and Computer Science

Contributors

R.K.A. Karlsson Pattern Recognition and Bioinformatics - (supervisor 1)

S.R. Bongers Pattern Recognition and Bioinformatics - (supervisor 1)

J.H. Krijthe Pattern Recognition and Bioinformatics - (supervisor 1)

Faculty

Electrical Engineering, Mathematics and Computer Science

Machine learning, Invariance principle, Generalization

To reference this document use:

http://resolver.tudelft.nl/uuid:60339154-302c-407a-9047-05d9b3b21f57

Published Date

24-06-2022

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Learning algorithms can perform poorly in unseen environments when they learn spurious correlations. This is known as the out-of-domain (OOD) generalization problem. Invariant Risk Minimization (IRM) is a method that attempts to solve this problem by learning invariant relationships. Both motivating examples and counterexamples have been proposed regarding the performance of IRM. This work aims to clarify when the method works well and when it fails by testing its ability to learn invariant relationships. To this end, experiments are run on a synthetic data model that simulates four data distribution shifts: covariate shift (CS), confounder-based shift (CF), anti-causal shift (AC), and hybrid shift (HB). The experiments examine IRM's behaviour with respect to hetero- and homoskedasticity and adaptation of the training environments. We measure the error with respect to the optimal invariant predictor and compare it to that of the non-invariant Empirical Risk Minimization (ERM). The results show that IRM is generally able to learn invariance for the CS and CF shifts, especially when the deviation between the training environments is large. For the AC and HB shifts, this ability depends strongly on the values of the training environments.
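
To make the contrast between ERM and IRM concrete, the sketch below compares a pooled least-squares (ERM) predictor with an invariant predictor on a toy two-environment dataset. It is only an illustration under stated assumptions: the data model (a cause `x1`, a target `y`, and an effect `x2` of `y`) is a common textbook-style example and not the synthetic model used in the thesis, the penalty is the gradient-norm form used in the IRMv1 formulation for a linear predictor with squared loss, and the names `make_environment`, `risk`, and `irm_penalty` are hypothetical helpers introduced here.

```python
import numpy as np

def make_environment(sigma, n, rng):
    """Toy environment (an assumption, not the thesis's data model).

    x1 causes y; x2 is an effect of y. The noise scale sigma differs
    between environments, so the x2-y relationship is not invariant.
    """
    x1 = rng.normal(0.0, sigma, n)
    y = x1 + rng.normal(0.0, sigma, n)
    x2 = y + rng.normal(0.0, 1.0, n)
    return np.column_stack([x1, x2]), y

def risk(w, X, y):
    """Mean squared error of the linear predictor X @ w."""
    return np.mean((X @ w - y) ** 2)

def irm_penalty(w, X, y):
    """Squared gradient of the risk w.r.t. a scalar multiplier s at s = 1.

    For R(s) = mean((s * pred - y)^2), dR/ds at s = 1 equals
    2 * mean((pred - y) * pred).
    """
    pred = X @ w
    grad = 2.0 * np.mean((pred - y) * pred)
    return grad ** 2

rng = np.random.default_rng(0)
envs = [make_environment(sigma, 100_000, rng) for sigma in (0.5, 2.0)]

# ERM: ordinary least squares on the pooled training environments.
X_pool = np.vstack([X for X, _ in envs])
y_pool = np.concatenate([y for _, y in envs])
w_erm, *_ = np.linalg.lstsq(X_pool, y_pool, rcond=None)

# Invariant predictor for this toy model: use only the cause x1.
w_inv = np.array([1.0, 0.0])

penalty_erm = sum(irm_penalty(w_erm, X, y) for X, y in envs)
penalty_inv = sum(irm_penalty(w_inv, X, y) for X, y in envs)

print("ERM weights:", w_erm)
print("IRM penalty (ERM predictor):", penalty_erm)
print("IRM penalty (invariant predictor):", penalty_inv)
```

In this toy setting the pooled ERM solution puts substantial weight on `x2`, and its penalty is clearly non-zero, while the predictor that uses only `x1` has a penalty close to zero in both environments. An IRM objective of this form adds such a penalty, weighted by a regularization constant, to the summed per-environment risks.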

Files

Final_Paper.pdf

(.pdf | 1.87 Mb)