Annotating, Understanding, and Predicting Long-term Video Memorability

Conference Paper (2018)

Author(s)

Romain Cohendet (Technicolor, France)

Karthik Yadati (Multimedia Computing)

Ngoc Q.K. Duong (Technicolor, France)

Claire Hélène Demarty (Technicolor, France)

Long-term memory; Attributes; Scene understanding; Global video features; Measurement protocol; Video memorability

DOI related publication

https://doi.org/10.1145/3206025.3206056 (final published version)

To reference this document use

https://resolver.tudelft.nl/uuid:7b6d9ca8-ee41-4214-9f79-02c498ef5e61

More Info

Publication Year

2018

Language

English

Pages (from-to)

178-186

ISBN (print)

978-1-4503-5046-4

Event

8th ACM International Conference on Multimedia Retrieval, ICMR 2018 (2018-06-11 - 2018-06-14), Yokohama, Japan

Abstract

Memorability can be regarded as a useful metric of video importance that helps in choosing between competing videos. Research on the computational understanding of video memorability is, however, in its early stages. No dataset is available for modelling purposes, and the few previous attempts provided protocols for collecting video memorability data that would be difficult to generalize. Furthermore, the computational features needed to build a robust memorability predictor remain largely undiscovered. In this article, we propose a new protocol to collect long-term video memorability annotations. We measure the memory performance of 104 participants from weeks to years after memorization to build a dataset of 660 videos for video memorability prediction. This dataset is made available to the research community. We then analyze the collected data to better understand video memorability, in particular the effects of response time, duration of memory retention, and repetition of visualization. Finally, we investigate the use of various types of audio and visual features and build a computational model for video memorability prediction. We conclude that high-level visual semantics help to better predict the memorability of videos.