Detecting F-formations & Roles in Crowded Social Scenes with Wearables
Combining Proxemics & Dynamics using LSTMs
Alessio Rosatelli (University of Perugia)
E. Gedik (TU Delft - Pattern Recognition and Bioinformatics)
Hayley Hung (TU Delft - Pattern Recognition and Bioinformatics)
Abstract
In this paper, we investigate the use of proxemics and dynamics for automatically identifying conversing groups, or so-called F-formations. More formally, we aim to automatically identify whether wearable sensor data coming from two people is indicative of F-formation membership. We also explore the problem of jointly detecting membership and more descriptive information about the pair, namely the roles they take in the conversation (i.e. speaker or listener). We jointly model the concepts of proxemics and dynamics using binary proximity and acceleration obtained from a single wearable sensor per person. We test our approaches on the publicly available MatchNMingle dataset, which was collected during real-life mingling events. We find that fusing these two modalities performs significantly better than using either one alone, providing an AUC of 0.975 when data from 30-second windows are used. Furthermore, our investigation into role detection shows that each role pair requires a different time resolution for accurate detection.
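The abstract does not specify the exact network architecture, so the following is only a minimal sketch of the kind of pairwise, two-modality LSTM fusion it describes. It assumes a late-fusion design in PyTorch; the class name `PairwiseFFormationLSTM`, the hidden size, the feature dimensions, and the 20 Hz sampling rate used in the usage example are all illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class PairwiseFFormationLSTM(nn.Module):
    """Late-fusion LSTM classifier for pairwise F-formation membership.

    Each input covers one pair of people over a fixed time window
    (e.g. 30 s): an acceleration stream (dynamics) and a binary
    proximity stream (proxemics). Layer sizes are illustrative only.
    """

    def __init__(self, accel_dim=6, prox_dim=1, hidden=64):
        super().__init__()
        # One LSTM branch per modality.
        self.accel_lstm = nn.LSTM(accel_dim, hidden, batch_first=True)
        self.prox_lstm = nn.LSTM(prox_dim, hidden, batch_first=True)
        # Fusion: concatenate the final hidden states of both branches.
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # logit for "same F-formation"
        )

    def forward(self, accel, prox):
        # accel: (batch, T, accel_dim), prox: (batch, T, prox_dim)
        _, (h_accel, _) = self.accel_lstm(accel)
        _, (h_prox, _) = self.prox_lstm(prox)
        fused = torch.cat([h_accel[-1], h_prox[-1]], dim=-1)
        return self.classifier(fused).squeeze(-1)


# Example: a batch of 8 pairs, 30-second windows at an assumed 20 Hz (600 frames).
model = PairwiseFFormationLSTM()
accel = torch.randn(8, 600, 6)                    # triaxial acceleration of both people, stacked
prox = torch.randint(0, 2, (8, 600, 1)).float()   # binary proximity between the pair
probs = torch.sigmoid(model(accel, prox))         # membership probability per pair
```

The same pairwise formulation could be extended to the joint membership-and-role task mentioned in the abstract by widening the final linear layer to one output per role-pair class; that extension is likewise an assumption rather than the authors' reported setup.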