On the Regularization of Convolutional Neural Networks and Transformers under Distribution Shifts

Bachelor Thesis (2022)
Author(s)

L.Z. Assini (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Wendelin Böhmer – Mentor (TU Delft - Algorithmics)

C.B. Poulsen – Graduation committee member (TU Delft - Programming Languages)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Leo Assini
Publication Year
2022
Language
English
Graduation Date
24-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The use of Transformers outside the realm of natural language processing is becoming increasingly prevalent. On image-classification data sets such as CIFAR-100, the Transformer has already been shown to perform on par with the much more established Convolutional Neural Network. This paper investigates the out-of-distribution capabilities of the multi-head attention mechanism through the classification of the MNIST data set with added backgrounds. Additionally, various regularization techniques are applied to further increase generalization. Regularization is shown to be an important tool for improving out-of-distribution accuracy, though it may entail trade-offs in in-distribution settings.
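The thesis itself is not reproduced on this page, but the multi-head attention mechanism the abstract refers to can be sketched in a few lines. The NumPy example below is a generic illustration, not the thesis implementation: all dimensions, weight initializations, and names are invented for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row maximum before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, W_q, W_k, W_v, n_heads):
    """Project X to queries/keys/values, split the model dimension into
    n_heads independent heads, attend per head, and concatenate the results."""
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    outputs = []
    for h in range(n_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        scores = Q[:, s] @ K[:, s].T / np.sqrt(d_head)  # scaled dot-product
        weights = softmax(scores, axis=-1)              # each row sums to 1
        outputs.append(weights @ V[:, s])
    return np.concatenate(outputs, axis=-1)             # shape (seq_len, d_model)

# Toy usage: 16 image patches embedded in 32 dimensions, attended over 4 heads
# (hypothetical sizes; a Vision-Transformer-style setup, not the thesis config).
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 32))
W_q, W_k, W_v = (rng.normal(size=(32, 32)) * 0.1 for _ in range(3))
out = multi_head_attention(X, W_q, W_k, W_v, n_heads=4)
```

The output keeps the input shape, so such a block can be stacked and followed by a classification head; regularization such as dropout or weight decay would be applied around these projections during training.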

Files

Thesis_LeoAssini.pdf
(pdf | 5.18 Mb)
License info not available