Multimodal Joint Head Orientation Estimation in Interacting Groups via Proxemics and Interaction Dynamics

Journal Article (2021)

Author(s)

S. Tan (TU Delft - Pattern Recognition and Bioinformatics)

David M. J. Tax (TU Delft - Pattern Recognition and Bioinformatics)

HS Hung (TU Delft - Pattern Recognition and Bioinformatics)

Research Group

Pattern Recognition and Bioinformatics

DOI related publication

https://doi.org/10.1145/3448122

Keywords

Head orientation estimation; Interaction dynamics; Scene understanding

To reference this document use:

https://resolver.tudelft.nl/uuid:7ee7c53f-648b-4ef3-baaa-46010a567f4c

Publication Year

2021

Language

English

Issue number

1

Volume number

5

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Human head orientation estimation has been of interest because head orientation serves as a cue to directed social attention. Most existing approaches rely on visual and high-fidelity sensor inputs and on deep learning strategies that do not consider the social context of unstructured and crowded mingling scenarios. We show that alternative inputs, such as speaking status, body location, orientation, and acceleration, contribute to head orientation estimation. These inputs are especially useful in crowded, in-the-wild settings where visual features are either uninformative due to occlusions or prohibitive to acquire due to physical space limitations and concerns about ecological validity. We argue that head orientation estimation in such social settings needs to account for the physically evolving interaction space formed by all the individuals in the group. To this end, we propose an LSTM-based head orientation estimation method that combines the hidden representations of the group members. Our framework jointly predicts the head orientations of all group members and is applicable to groups of different sizes. We explain the contribution of different modalities to model performance in head orientation estimation. The proposed model outperforms baseline methods that do not explicitly consider the group context, and it generalizes to an unseen dataset from a different social event.

Files

3448122.pdf

(pdf | 7.28 Mb)