Generalization by Visual Attention

Bachelor Thesis (2022)
Author(s)

B.J. Collé (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Wendelin Böhmer – Mentor (TU Delft - Algorithmics)

Casper Bach – Graduation committee member (TU Delft - Programming Languages)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Baptiste Collé
Publication Year
2022
Language
English
Graduation Date
24-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Most deep learning models fail to generalize in production: the data used during training often does not fully reflect the deployed environment, so the test data is out-of-distribution relative to the training data. In this paper, we focus on out-of-distribution performance for image classification. Transformers, a novel neural network architecture compared to the more traditionally used convolutional neural networks (CNNs), have been shown to work well for image classification. We therefore first compare the out-of-distribution capabilities of both models. This is followed by an in-depth investigation of individual architectural components of the transformer and their impact on the generalization capability of the model.
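The transformer components investigated in the thesis are built around self-attention. As background for the abstract, the following is a minimal numpy sketch of scaled dot-product attention over a few "patch" tokens; the function name, toy shapes, and random inputs are illustrative and not taken from the thesis itself:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d)) V, the core transformer operation."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # pairwise token similarities
    scores -= scores.max(axis=-1, keepdims=True)      # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # rows sum to 1
    return weights @ V                                # convex combination of values

# Toy example: 3 patch tokens with 4-dimensional embeddings, used as
# queries, keys, and values at once (i.e. self-attention).
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(X, X, X)
print(out.shape)  # (3, 4): one attended output per input token
```

Because each output row is a convex combination of the value rows, every output coordinate stays within the range spanned by the inputs, which is one way to sanity-check an attention implementation.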
