Evaluation of Video Summarization using DSNet and Action Localization Datasets
D.H.E. Groenewegen (TU Delft - Electrical Engineering, Mathematics and Computer Science)
O. Strafforello – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
S. Khademi – Graduation committee member (TU Delft - History, Form & Aesthetics)
T. Höllt – Coach (TU Delft - Computer Graphics and Visualisation)
GitHub repository (slightly modified version of the original repository)
https://github.com/DaanG96/breakfastDSNet
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
In this paper, the DSNet framework for automatic video summarization is evaluated on action localization datasets. A problem facing video summarization with deep learning techniques is that the datasets can be subjective, depending on the preferences of the human annotators, which introduces noise in the labeling. This paper looks at the anchor-based and anchor-free approaches introduced by the DSNet framework. More specifically, it evaluates, in experiments with different hyper-parameters, whether these approaches gain performance when action localization datasets are used in place of video summarization datasets. The results show the increase in accuracy when using action localization datasets. Moreover, the paper compares the anchor-based and anchor-free approaches to see whether they still have comparable performance with this method.