Dialog detection in narrative video by shot and face analysis


Abstract

The proliferation of captured personal and broadcast content in consumer archives calls for convenient access to stored audiovisual content. Intuitive retrieval and navigation solutions, however, require a semantic level that cannot be reached by generic multimedia content analysis alone. Fusing such analysis with film grammar rules can significantly boost reliability. This paper describes the fusion of low-level content analysis cues, including face parameters and inter-shot similarities, to segment commercial content into film-grammar-rule-based entities and subsequently classify those sequences as so-called shot reverse shots, i.e., dialog sequences. Moreover, mid-level cues specific to shot reverse shots are analyzed, augmenting the shot reverse shot information with dialog-specific descriptions.
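As a rough illustration of how inter-shot similarity and face cues might be fused to flag shot reverse shot candidates, the sketch below looks for an alternating A-B-A-B pattern among consecutive shots. It is not the method of the paper, which the abstract does not specify. The per-shot color histograms, the histogram-intersection similarity, the threshold values, and the function names `histogram_intersection` and `find_shot_reverse_shots` are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1] for two histograms that each sum to 1 (assumed cue)."""
    return sum(min(x, y) for x, y in zip(a, b))


def find_shot_reverse_shots(
    histograms: List[Sequence[float]],
    has_face: List[bool],
    sim_threshold: float = 0.7,  # assumed value, not taken from the paper
    min_shots: int = 4,          # assumed minimum length of an A-B-A-B run
) -> List[Tuple[int, int]]:
    """Return (first_shot, last_shot) index pairs of alternating shot runs.

    Shot i continues a run when it resembles shot i+2, differs from
    shot i+1, and a face was detected in all three shots.
    """
    runs: List[Tuple[int, int]] = []
    start, end = None, -1
    for i in range(len(histograms) - 2):
        same = histogram_intersection(histograms[i], histograms[i + 2]) >= sim_threshold
        diff = histogram_intersection(histograms[i], histograms[i + 1]) < sim_threshold
        faces = has_face[i] and has_face[i + 1] and has_face[i + 2]
        if same and diff and faces:
            if start is None:
                start = i
            end = i + 2
        else:
            # The alternation broke: keep the run only if it is long enough.
            if start is not None and end - start + 1 >= min_shots:
                runs.append((start, end))
            start = None
    if start is not None and end - start + 1 >= min_shots:
        runs.append((start, end))
    return runs


if __name__ == "__main__":
    A, B, C = [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]
    shots = [C, A, B, A, B, A, C]
    faces = [False, True, True, True, True, True, False]
    print(find_shot_reverse_shots(shots, faces))
```

In this toy input, shots 1 to 5 alternate between two visually distinct settings that each contain a face, so they form one candidate dialog sequence. A real system would derive the similarity and face cues from video frames and would likely use richer face parameters than a boolean flag.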
