Towards automatic estimation of conversation floors within F-formations

Conference Paper (2019)
Authors

C.A. Raman (TU Delft - Pattern Recognition and Bioinformatics)

H.S. Hung (TU Delft - Pattern Recognition and Bioinformatics)

Research Group
Pattern Recognition and Bioinformatics
Copyright
© 2019 C.A. Raman, H.S. Hung
To reference this document use:
https://doi.org/10.1109/ACIIW.2019.8925065
Publication Year
2019
Language
English
Pages (from-to)
175-181
ISBN (print)
978-1-7281-3892-3
ISBN (electronic)
978-1-7281-3891-6
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The detection of free-standing conversing groups has received significant attention in recent years. In the absence of a formal definition, most studies operationalize the notion of a conversation group through either a spatial or a temporal lens. Spatially, the most commonly used representation is the F-formation, defined by social scientists as the configuration in which people arrange themselves to sustain an interaction. However, the use of this representation is often accompanied by the simplifying assumption that a single conversation occurs within an F-formation. Temporally, various categories have been used to organize conversational units; these include, among others, turn, topic, and floor. Some of these concepts are hard to define objectively by themselves. The present work constitutes an initial exploration into unifying these perspectives by primarily posing the question: can we use the observation of simultaneous speaker turns to infer whether multiple conversation floors exist within an F-formation? We motivate a metric for the existence of distinct conversation floors based on simultaneous speaker turns, and provide an analysis using this metric to characterize conversations across F-formations of varying cardinality. We contribute two key findings: firstly, at the average speaking turn duration of about two seconds for humans, there is evidence for the existence of multiple floors within an F-formation; and secondly, an increase in the cardinality of an F-formation correlates with a decrease in the duration of simultaneous speaking turns.
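
To make the idea of counting simultaneous speaker turns concrete, the following is a minimal Python sketch. It is not the metric defined in the paper; it only illustrates one plausible reading of the abstract. It assumes each participant in an F-formation has a binary per-frame speaking status, treats a run of speaking frames of at least `min_len` frames as a turn, and reports the largest number of participants who are inside such a turn at the same frame. The function names, the `min_len` parameter, and the toy data are all hypothetical.

```python
from typing import List, Tuple


def speaking_turns(status: List[int], min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame pairs, end exclusive, for every run of
    speaking frames (value 1) that lasts at least min_len frames."""
    turns = []
    start = None
    for i, s in enumerate(status):
        if s and start is None:
            start = i  # a run of speaking frames begins
        elif not s and start is not None:
            if i - start >= min_len:
                turns.append((start, i))
            start = None  # the run has ended
    # Close a run that continues to the final frame.
    if start is not None and len(status) - start >= min_len:
        turns.append((start, len(status)))
    return turns


def max_simultaneous_turns(statuses: List[List[int]], min_len: int) -> int:
    """Largest number of participants who are, at the same frame, each
    inside a turn of at least min_len frames (hypothetical indicator)."""
    if not statuses:
        return 0
    counts = [0] * len(statuses[0])
    for status in statuses:
        for start, end in speaking_turns(status, min_len):
            for t in range(start, end):
                counts[t] += 1
    return max(counts, default=0)


# Toy data (invented for illustration): three participants, eight frames.
a = [1, 1, 1, 1, 0, 0, 0, 0]
b = [0, 0, 1, 1, 1, 1, 0, 0]
c = [0, 0, 0, 1, 0, 0, 0, 0]  # a one-frame utterance, e.g. a backchannel
print(max_simultaneous_turns([a, b, c], min_len=3))
```

Under this reading, raising `min_len` filters out short overlaps such as backchannels, so a value of two or more that survives a threshold near a typical turn length would be the kind of observation the abstract associates with multiple floors. How frames map to seconds depends on the sampling rate of the speaking-status annotations, which is not stated here.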

Files

8925065.pdf
(pdf | 4.05 Mb)
Embargo expired on 08-04-2022
License info not available