Generalization by Visual Attention

Bachelor Thesis (2022)
Author(s)

B.J. Collé (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Wendelin Böhmer – Mentor (TU Delft - Algorithmics)

Casper Bach – Graduation committee member (TU Delft - Programming Languages)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Baptiste Collé
Publication Year
2022
Language
English
Graduation Date
24-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Most deep learning models fail to generalize in production: the data used during training often does not fully reflect the deployed environment, so the test data is out-of-distribution relative to the training data. In this paper, we focus on out-of-distribution performance for image classification. Transformers, a novel neural network architecture compared to the more traditionally used convolutional neural networks (CNNs), have been shown to work well for image classification. We therefore first compare the out-of-distribution capabilities of both models. This is followed by an in-depth investigation of individual architectural components of the transformer and their impact on the generalization capability of the model.
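The transformer components investigated in the thesis are built around self-attention. As background for the abstract, the following is a minimal numpy sketch of scaled dot-product attention over a few "patch" tokens; the function name, toy shapes, and random inputs are illustrative and not taken from the thesis itself:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d)) V, the core transformer operation."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # pairwise token similarities
    scores -= scores.max(axis=-1, keepdims=True)      # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # rows sum to 1
    return weights @ V                                # convex combination of values

# Toy example: 3 patch tokens with 4-dimensional embeddings, used as
# queries, keys, and values at once (i.e. self-attention).
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(X, X, X)
print(out.shape)  # (3, 4): one attended output per input token
```

Because each output row is a convex combination of the value rows, every output coordinate stays within the range spanned by the inputs, which is one way to sanity-check an attention implementation.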
