The Data Barrier to Lightweight Drinking Detection
An Analysis of the Viability of Skeleton-Only Models on In-the-Wild Social Data
J.D. Tijssens (TU Delft - Electrical Engineering, Mathematics and Computer Science)
L. Li – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
S. Tan – Graduation committee member (TU Delft - Interactive Intelligence)
H.S. Hung – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
Julián Urbano – Graduation committee member (TU Delft - Multimedia Computing)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
This research addresses the challenge of deploying real-time drinking gesture detection in messy, "in-the-wild" environments. We propose and evaluate two computationally inexpensive systems: one using a Random Forest classifier, the other a one-dimensional convolutional neural network (1D-CNN) classifier. Both are trained on 2D skeleton data, or on features derived solely from that data. Our method is designed to handle sparse labels and significant occlusion, and is tested on the Conflab social interaction dataset. This study reports on the performance of this lightweight, video-based approach, providing a benchmark for its applicability in real-world health and human-computer interaction applications where privacy and computational efficiency are important factors. Although we were unable to create a robust and reliable classifier (F1 scores of 0.07 and 0.03, respectively), this work shows that there is potential for future work to succeed (ROC-AUC scores of 0.63 and 0.55, respectively) and provides critical insights into pitfalls to avoid when designing similar systems.
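The gap between the reported F1 and ROC-AUC values is typical of sparsely labelled data. The sketch below is not taken from the thesis and is purely illustrative: it uses synthetic scores, and the 2% positive rate, the 0.5 shift between the class score distributions, and the decision threshold of 0 are all assumptions chosen for the example. It shows how a classifier that ranks positives somewhat above negatives (ROC-AUC above 0.5) can still have a very low F1 score when positives are rare.

```python
import random


def f1_score(y_true, y_pred):
    """F1 of the positive class from binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This is the pairwise (Mann-Whitney) form of
    ROC-AUC; it is quadratic in the sample size but fine for a small demo.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


random.seed(0)

# Assumption: 5000 windows, of which 2% contain a drinking gesture.
n_pos, n_neg = 100, 4900
y_true = [1] * n_pos + [0] * n_neg

# Assumption: a weak classifier whose scores for positive windows are
# shifted up by 0.5 standard deviations relative to negative windows.
scores = [random.gauss(0.5, 1.0) for _ in range(n_pos)] + [
    random.gauss(0.0, 1.0) for _ in range(n_neg)
]

# Assumption: a fixed decision threshold of 0 on the raw score.
y_pred = [1 if s > 0.0 else 0 for s in scores]

auc = roc_auc(y_true, scores)
f1 = f1_score(y_true, y_pred)
print(f"ROC-AUC: {auc:.2f}")
print(f"F1:      {f1:.2f}")
```

ROC-AUC depends only on how the scores rank positives relative to negatives, whereas F1 depends on the chosen threshold and is dragged down by the many false positives that a rare positive class produces. A modest ranking signal is therefore compatible with a near-zero F1.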