Is the batch size affecting the performance of Regression CNNs?

Bachelor Thesis (2021)
Author(s)

J.A.D. Lamon (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

O. Taylan Turan – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

M Loog – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

David M. J. Tax – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Tom Viering – Mentor (TU Delft - Computer Science & Engineering-Teaching Team)

Y. Kato – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Z. Wang – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

K. Hildebrandt – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Julien Lamon
Publication Year
2021
Language
English
Graduation Date
02-07-2021
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

With an expected 8.3 trillion photos stored in 2021 [1], convolutional neural networks (CNNs) are becoming preeminent in the field of image recognition. However, because these deep neural networks (DNNs) are still seen as black boxes, it is hard to fully exploit their capabilities. Hyperparameters must be tuned to obtain a robust CNN that performs its task accurately. This study focuses on the batch size, one of the most important hyperparameters: the number of samples propagated through the network before the weights are updated. We show how the batch size affects the performance of regression CNNs on the following regression tasks: predicting the mean, median, standard deviation (std) and variance of the pixel intensities of a grey-scale MNIST [2] input image. Performance is analyzed by how well the regression CNNs converge for different batch sizes at a fixed learning rate, and by comparing the final mean squared error (MSE) obtained with each batch size. Our findings show that a larger batch size leads to a higher MSE and slower convergence. The best performance was obtained for batch sizes of 8 to 32, with slight differences between the four regression tasks.
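The quantities named in the abstract can be made concrete with a minimal sketch. It is an illustration only, not the thesis code: the function names are hypothetical, the use of population (rather than sample) standard deviation and variance is an assumption, and so is the use of raw, unnormalized pixel intensities. It computes the four regression targets for one grey-scale image, the number of weight updates per epoch implied by a given batch size, and the MSE used to compare results.

```python
import math
import statistics


def regression_targets(image):
    """Return the four scalar targets for one grey-scale image.

    `image` is a list of rows of pixel intensities. Population std and
    variance are an assumption; the thesis may define them differently.
    """
    pixels = [p for row in image for p in row]
    return {
        "mean": statistics.fmean(pixels),
        "median": statistics.median(pixels),
        "std": statistics.pstdev(pixels),
        "variance": statistics.pvariance(pixels),
    }


def updates_per_epoch(num_samples, batch_size):
    """Weight updates in one pass over the data (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


def mean_squared_error(predictions, targets):
    """Average squared difference between predicted and true values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


if __name__ == "__main__":
    tiny_image = [[0, 0], [2, 4]]  # hypothetical 2x2 image, for illustration
    print(regression_targets(tiny_image))
    # Assuming the full 60,000-image MNIST training set is used:
    for batch_size in (8, 32, 1024):
        print(batch_size, updates_per_epoch(60000, batch_size))
```

With a fixed learning rate and a fixed number of epochs, a smaller batch size yields more weight updates per epoch. This is one plausible reason, though not necessarily the explanation given in the thesis, why larger batches might converge more slowly in such a setup.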
