Efficient Temporal Action Localization model development practices

A review and analysis of models and a guide of best methods


Abstract

Temporal Action Localization (TAL) is an important problem in computer vision, with applications in video surveillance and recommendation, healthcare, entertainment, and human-computer interaction. Because TAL is inherently data-heavy, its progress has been bound by the availability of computing power, which has slowed its pace of innovation. This work aims to accelerate the development of TAL models by briefly reviewing the state of the art in TAL and by providing extensive measurements of the data and compute efficiency of the latest models. By studying how TAL models perform in limited-data and limited-compute settings, we find that using less data than is available often helps to iterate on a model quickly, while in some cases TAL performance is constrained by the limited amount of available data. Finally, we provide general guidelines that form a simple framework for efficient TAL model development.