Repetition counting in videos using deep learning

Abstract

This work tackles the problem of repetition counting in videos using modern deep learning techniques. The aim is to build an end-to-end trainable model that estimates the number of repetitions without manual intervention in the feature selection process. Existing models perform well on stationary videos, but realistic videos are rarely perfectly static. A series of intermediate experiments is performed to arrive at an end-to-end trainable pipeline. The techniques explored include the University of California, Riverside's Matrix Profile, bi-directional recurrent neural networks, and convolutional neural network architectures that employ dilation. The experiments use a variety of videos from the Qualcomm and University of Amsterdam (QUVA) repetition dataset and the YouTube Segments (YTSegments) dataset, both of which contain many non-static videos of real-life scenarios such as people exercising and chopping vegetables. A proprietary aircraft inspection dataset, which contains repetitions of spinning engine blades, is also used. The proposed model obtains a lower Mean Absolute Error than the existing deep learning architectures and is able to estimate repetitions on a variety of videos in real time without manual intervention. On the task of repetition estimation, it correctly labels about 60% of frames (each frame labelled with the number of repetitions so far) on unseen test videos.
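The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes that the Mean Absolute Error is the plain, unnormalised average of absolute count differences over videos, and that frame accuracy is the fraction of frames whose predicted running count equals the labelled one. The abstract does not specify either definition, and the function names and sample values below are hypothetical.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Plain MAE between predicted and true repetition counts, one pair per video.

    Assumption: no normalisation by the true count is applied.
    """
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def frame_accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of frames whose predicted running count matches the label.

    Assumption: each frame is labelled with the number of repetitions so far.
    """
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical final counts for three videos.
pred_counts = [4, 10, 7]
true_counts = [5, 10, 9]
mae = mean_absolute_error(pred_counts, true_counts)  # (1 + 0 + 2) / 3

# Hypothetical running counts for one ten-frame video.
pred_frames = [0, 0, 1, 1, 2, 2, 2, 3, 3, 3]
true_frames = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3]
acc = frame_accuracy(pred_frames, true_frames)  # 8 of 10 frames agree

print(f"MAE: {mae:.2f}, frame accuracy: {acc:.0%}")
```

Under these assumptions the first measure scores only the final count per video, while the second penalises every frame on which the running count is early or late.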