Multimodal Conversational Events Estimation in Complex Social Scenes
Litian Li (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Abstract
Conversational events, such as speaking turns, backchannels, topic changes, and laughter, are central to the structure of multiparty interaction and play a key role in shaping its dynamics. However, detecting such events in real-world social settings remains challenging due to perceptual ambiguity, visual occlusion, signal noise, and limitations in acquiring high-quality audio data. This work addresses these challenges by focusing on spontaneous interactions in socially complex and privacy-sensitive environments, exploring multimodal, nonverbal cues that do not rely on audio. The goal is to develop a novel modeling approach for group context awareness to infer conversational events and support social scene understanding under real-world constraints.