Repetition counting in videos using deep learning

Master thesis (2019)

Authors

D. Batheja (Electrical Engineering, Mathematics and Computer Science)

Contributors

J.C. van Gemert (mentor), Pattern Recognition and Bioinformatics

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

http://resolver.tudelft.nl/uuid:19ae2a31-5c22-4da0-9627-352a7e66a6b1

Published Date

27-08-2019

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This work tackles the problem of repetition counting in videos using modern deep learning techniques. The aim is to build an end-to-end trainable model that estimates the number of repetitions without manual feature selection. Existing models perform well on stationary videos, but realistic videos are rarely perfectly static. A series of intermediate experiments is performed to arrive at an end-to-end trainable pipeline. The techniques explored include the University of California, Riverside's Matrix Profile, bi-directional recurrent neural networks, and convolutional neural network architectures that employ dilation. The experiments use a variety of videos from the Qualcomm and University of Amsterdam (QUVA) repetition dataset and the YouTube Segments (YTSegments) dataset, both of which contain many non-static videos of real-life scenarios such as people exercising or chopping vegetables. A proprietary aircraft inspection dataset, which contains repetitions of spinning engine blades, is also used. The proposed model obtains a lower Mean Absolute Error than the existing deep learning architectures. It is able to estimate repetitions on a variety of videos in real time without manual intervention. On the task of repetition estimation, about 60% of the frames of unseen test videos are labelled correctly with the number of repetitions so far.
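
The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names and sample values are hypothetical, and the error is assumed to be the plain absolute difference between true and predicted counts per video. The abstract does not state whether the thesis normalises the error by the true count.

```python
from typing import Sequence


def mean_absolute_error(true_counts: Sequence[int],
                        predicted_counts: Sequence[int]) -> float:
    """Average absolute difference between true and predicted per-video counts.

    Assumption: unnormalised error; the thesis may define it differently.
    """
    if len(true_counts) == 0 or len(true_counts) != len(predicted_counts):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_counts, predicted_counts))
    return total / len(true_counts)


def frame_accuracy(true_labels: Sequence[int],
                   predicted_labels: Sequence[int]) -> float:
    """Fraction of frames whose predicted 'repetitions so far' label is correct."""
    if len(true_labels) == 0 or len(true_labels) != len(predicted_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical values, for illustration only.
video_mae = mean_absolute_error([4, 10, 7], [5, 8, 7])
frame_acc = frame_accuracy([0, 0, 1, 1, 2], [0, 1, 1, 1, 2])
```

Under these assumptions, a lower `video_mae` means predicted counts lie closer to the true counts, while `frame_acc` corresponds to the share of correctly labelled frames reported at the end of the abstract.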

Files

MscDhruvBatheja.pdf

(.pdf | 8.78 Mb)