Does Batch Size Affect the Performance of Regression CNNs?


Abstract

With an estimated 8.3 trillion photos stored in 2021 [1], convolutional neural networks (CNNs) are becoming preeminent in the field of image recognition. However, since these deep neural networks (DNNs) are still largely seen as black boxes, it is hard to fully exploit their capabilities, and hyperparameter tuning is required to obtain a robust CNN that performs its task accurately. In this study, we focus on the batch size, one of the most important hyperparameters: the number of samples that are propagated through the network before the weights are updated. We show how the batch size affects the performance of regression CNNs on four regression tasks: predicting the mean, median, standard deviation (std) and variance of the pixel intensities of a grey-scale MNIST [2] input image. We analyze how well the regression CNNs converge for different batch sizes at a fixed learning rate, and we compare the final mean squared error (MSE) obtained for each batch size. Our findings show that a larger batch size leads to a higher MSE and slower convergence. The best performance was obtained for batch sizes from 8 to 32, with slight differences between the four regression tasks.
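To make the setup concrete, the following is a minimal sketch of the kind of experiment described above, assuming a PyTorch implementation. The small CNN architecture, the SGD optimizer with a fixed learning rate of 0.01, the epoch count, and the synthetic stand-in for the MNIST images are illustrative assumptions rather than the authors' exact configuration; only the four regression targets and the role of the batch size follow the abstract.

```python
# Sketch: sweep the batch size for a regression CNN at a fixed learning rate.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for grey-scale MNIST: 1x28x28 images with pixels in [0, 1].
images = torch.rand(1024, 1, 28, 28)

# Regression targets: per-image statistics of the pixel intensities.
flat = images.flatten(start_dim=1)
targets = torch.stack([
    flat.mean(dim=1),            # mean
    flat.median(dim=1).values,   # median
    flat.std(dim=1),             # standard deviation
    flat.var(dim=1),             # variance
], dim=1)

class RegressionCNN(nn.Module):
    """Small CNN mapping a 28x28 image to the four pixel statistics."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * 7 * 7, 4)

    def forward(self, x):
        return self.head(self.features(x).flatten(start_dim=1))

# The batch size controls how many samples are propagated through the
# network before each weight update; the learning rate stays fixed.
for batch_size in (8, 16, 32, 64, 128):
    loader = DataLoader(TensorDataset(images, targets),
                        batch_size=batch_size, shuffle=True)
    model = RegressionCNN()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()
    for epoch in range(5):
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()   # one weight update per batch of `batch_size` samples
            optimizer.step()
    print(f"batch_size={batch_size:4d}  final MSE={loss.item():.6f}")
```

With this structure, each setting of `batch_size` trades the number of weight updates per epoch against the noise in each gradient estimate, which is the effect the study measures via convergence behaviour and final MSE.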