Impact of Dissimilarity Loss on Out of Distribution Generalization

An introduction of a novel approach for mitigating shortcut learning

Bachelor Thesis (2026)

Author(s)

A.C. Cazacu (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.W. Böhmer – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

D.M.J. Tax – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Spurious correlation, Shortcut learning, Distribution shift, Dissimilarity loss, OOD generalization, Shortcut feature, Simplicity bias

To reference this document use

https://resolver.tudelft.nl/uuid:c0f5c900-5c65-483d-b09a-d2a6db8b260c

Publication Year

2026

Language

English

Graduation Date

27-01-2026

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Downloads counter

61

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Deep learning has made neural networks ubiquitous across a wide range of applications. During training, models extract features that are predictive of labels, achieving high accuracy on in-distribution data. However, issues arise when these extracted features, while predictive during training, do not capture the underlying causal features of the data. This reliance on spurious correlations is known as "shortcut learning" and leads to a failure to generalize to unseen data. In this paper, we introduce a novel regularizer, dissimilarity loss, which penalizes excessive similarity between the representations of samples that share the same spurious predictors. This encourages the model to move beyond shortcut features and learn more robust, task-relevant representations. We show that this additional regularization significantly improves out-of-distribution accuracy compared to a baseline, and we discuss its drawbacks. Furthermore, we apply it without spurious feature labels, a regime in which dissimilarity loss remains effective under distribution shift, and we explore directions in which future work could make further improvements.
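
The abstract does not give the exact form of the dissimilarity loss, so the following is only an illustrative sketch of the general idea, not the thesis's formulation. It assumes cosine similarity as the similarity measure, a known spurious group id per sample, and a hypothetical weighting coefficient `weight`. All function names are invented for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def dissimilarity_penalty(representations, spurious_groups):
    """Mean cosine similarity over all pairs that share a spurious group id.

    Assumption: this averaging scheme is one possible reading of
    "penalize excessive similarity"; the thesis may define it differently.
    Returns 0.0 when no two samples share a group.
    """
    total, count = 0.0, 0
    for i, j in combinations(range(len(representations)), 2):
        if spurious_groups[i] == spurious_groups[j]:
            total += cosine_similarity(representations[i], representations[j])
            count += 1
    return total / count if count else 0.0


def regularized_loss(task_loss, representations, spurious_groups, weight=0.1):
    """Task loss plus the weighted penalty; `weight` is a hypothetical hyperparameter."""
    return task_loss + weight * dissimilarity_penalty(representations, spurious_groups)


# Usage: the first two samples share spurious group 0 and have identical
# representations, so they contribute the maximum pairwise similarity.
reps = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
groups = [0, 0, 1]
loss = regularized_loss(task_loss=0.5, representations=reps, spurious_groups=groups)
```

In this sketch, samples with the same spurious attribute are pushed toward distinct representations, while pairs from different groups are left unconstrained. In the label-free regime mentioned in the abstract, the group ids would have to come from some other source, which this sketch does not address.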

Files

RP_Paper_Final.pdf

(pdf | 3.28 MB)

License info not available