Data Augmentation for Deep Learning-based Gaze Estimation

Bachelor Thesis (2023)

Author(s)

J.W. Dijk (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

L. Du – Mentor (TU Delft - Embedded Systems)

Guohao Guohao – Mentor (TU Delft - Embedded Systems)

Xucong Zhang – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Data augmentation, Convolutional neural network, Gaze estimation

To reference this document use:

https://resolver.tudelft.nl/uuid:63ed04d9-8fcf-4df8-9c10-a0d491408a83

Publication Year

2023

Language

English

Graduation Date

28-06-2023

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This study provides insight into applying different data augmentation techniques to the input data of a convolutional neural network that estimates gaze. Gaze is used in numerous research domains to understand and predict human emotions and actions. Data augmentation comprises techniques that increase the size, variance and quality of training data, and it is widely used to reduce overfitting and increase the accuracy of deep learning models. This research combines the two fields by first applying individual data augmentations to the task of gaze estimation and then combining the most useful methods to decrease the mean angular error even further. The results show that small geometric transformations, such as translating the image by 15% or flipping the image horizontally 50% of the time, give the most significant reductions in mean angular error. Among the individually applied data augmentation methods, flipping achieved the largest improvement, 33% and 35% for the two models compared with the baseline model. The best result is obtained by combining flipping with translation, which yields a mean angular error of 1.396 and 1.389 for the two models. Obtaining the results requires a lot of training, which was the main limitation in conducting the experiments.
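The two augmentations and the metric named in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation. The function names, the representation of gaze as a 3D direction vector, the sign flip of the horizontal gaze component under mirroring, the zero padding after translation, the unchanged label under translation, and the use of degrees for the error are all assumptions made for illustration.

```python
import numpy as np


def horizontal_flip(image, gaze, rng, p=0.5):
    """Mirror the image left-right with probability p.

    Assumption: gaze is a 3D direction (x, y, z), so mirroring the image
    requires negating the horizontal (x) component of the label.
    """
    if rng.random() < p:
        image = image[:, ::-1].copy()
        gaze = gaze * np.array([-1.0, 1.0, 1.0])
    return image, gaze


def random_translate(image, rng, max_fraction=0.15):
    """Shift the image by up to max_fraction of its height and width.

    Assumption: uncovered pixels are filled with zeros and the gaze label
    is left unchanged.
    """
    h, w = image.shape[:2]
    max_dy, max_dx = int(max_fraction * h), int(max_fraction * w)
    dy = int(rng.integers(-max_dy, max_dy + 1))
    dx = int(rng.integers(-max_dx, max_dx + 1))
    out = np.zeros_like(image)
    # Copy the overlapping region between the source and the shifted frame.
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out


def mean_angular_error(pred, true):
    """Mean angle (in degrees, an assumption) between rows of pred and true."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    true = true / np.linalg.norm(true, axis=1, keepdims=True)
    cos = np.clip(np.sum(pred * true, axis=1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())


# Usage: combine flipping and translation on one training sample.
rng = np.random.default_rng(0)
sample_image = np.ones((20, 20), dtype=np.float32)
sample_gaze = np.array([0.2, -0.1, -1.0])
aug_image, aug_gaze = horizontal_flip(sample_image, sample_gaze, rng)
aug_image = random_translate(aug_image, rng)
```

In a training pipeline these functions would typically be applied on the fly to each sample, so that the network sees a different variant in every epoch.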
