Data Augmentation for Deep Learning-based Gaze Estimation

Bachelor Thesis (2023)
Author(s)

J.W. Dijk (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

L. Du – Mentor (TU Delft - Embedded Systems)

L. A.N. Guohao – Mentor (TU Delft - Embedded Systems)

Xucong Zhang – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 Jorn Dijk
Publication Year
2023
Language
English
Graduation Date
28-06-2023
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This study provides insights into applying different data augmentation techniques to the input data of a convolutional neural network that estimates gaze. Gaze is used in numerous research domains to understand and predict human emotions and actions. Data augmentation comprises techniques that increase the size, variance and quality of training data, and it is widely used to reduce overfitting and increase the accuracy of deep-learning models. This research combines the two fields by first applying individual data augmentations to the task of gaze estimation and then combining the most useful methods to decrease the mean angular error even further. The results show that small geometric transformations, such as translating the image by 15% or flipping the image horizontally 50% of the time, give the most significant reductions in mean angular error. Among the individually applied data augmentation methods, flipping gave the largest improvement: 33% and 35% for the two models compared with the baseline model. The best result was obtained by combining flipping with translation, which gave mean angular errors of 1.396 and 1.389 for the two models. Obtaining these results required a lot of training, which was the main limitation in conducting the experiments.
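
The two augmentations highlighted above and the evaluation metric can be illustrated with a minimal NumPy sketch. It is not the thesis implementation. Several details are assumptions made only for illustration: gaze labels are taken to be (pitch, yaw) pairs in radians, a horizontal flip is assumed to negate the yaw, translation is assumed to draw a shift of up to 15% of the image size and to fill uncovered pixels with zeros while leaving the label unchanged, and the error is reported in degrees. All function names are hypothetical.

```python
import numpy as np


def random_horizontal_flip(image, gaze, rng, p=0.5):
    """Flip the image left-right with probability p.

    Assumption: gaze = (pitch, yaw) in radians, and mirroring the image
    mirrors the horizontal gaze direction, so the yaw is negated.
    """
    if rng.random() < p:
        image = image[:, ::-1].copy()
        gaze = np.array([gaze[0], -gaze[1]])
    return image, gaze


def translate(image, dx, dy):
    """Shift the image by (dx, dy) pixels; uncovered pixels become zero."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out


def random_translate(image, rng, max_fraction=0.15):
    """Shift by a random offset of at most max_fraction of each dimension."""
    h, w = image.shape[:2]
    max_dy, max_dx = int(round(max_fraction * h)), int(round(max_fraction * w))
    dy = int(rng.integers(-max_dy, max_dy + 1))
    dx = int(rng.integers(-max_dx, max_dx + 1))
    return translate(image, dx, dy)


def pitch_yaw_to_vector(gaze):
    """Convert (pitch, yaw) angles to unit 3D gaze vectors (assumed convention)."""
    pitch, yaw = gaze[..., 0], gaze[..., 1]
    x = -np.cos(pitch) * np.sin(yaw)
    y = -np.sin(pitch)
    z = -np.cos(pitch) * np.cos(yaw)
    return np.stack([x, y, z], axis=-1)


def mean_angular_error_deg(pred, target):
    """Mean angle, in degrees, between predicted and true gaze directions."""
    a = pitch_yaw_to_vector(np.asarray(pred, dtype=float))
    b = pitch_yaw_to_vector(np.asarray(target, dtype=float))
    cos = np.clip(np.sum(a * b, axis=-1), -1.0, 1.0)  # both are unit vectors
    return float(np.degrees(np.arccos(cos)).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((36, 60))      # hypothetical grayscale eye patch
    gaze = np.array([0.05, -0.10])    # hypothetical (pitch, yaw) label
    image, gaze = random_horizontal_flip(image, gaze, rng)
    image = random_translate(image, rng)
    print(image.shape, gaze)
```

Combining the two methods, as in the best-performing configuration reported above, amounts to applying both functions in sequence to each training sample. Whether the label needs adjusting after a translation depends on how the images are normalized, which this sketch does not model.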
