Anticipating daily human actions
comparing pipelines for long-term skeleton-based prediction in real-world scenarios
Junhan Wen (Honda Research Institute Japan, Wako, Saitama, TU Delft - Electrical Engineering, Mathematics and Computer Science)
Xucong Zhang (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Jouh Yeong Chew (Honda Research Institute Japan, Wako, Saitama)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Human action anticipation remains a key challenge in achieving efficient human-robot interaction because of the difficulty of learning higher-level abstractions. This work compares three action anticipation pipelines to provide guidance for future work. Two of the pipelines adopt a top-down approach: they recognize the current action and then anticipate future actions using either traditional machine learning models or Large Language Models (LLMs). The third pipeline follows a bottom-up strategy: it first forecasts future motion and then infers actions from the forecast. Our results show that the top-down pipelines achieve higher accuracy and robustness, demonstrating the advantage of abstract reasoning over direct motion-based inference.
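The difference between the two strategies is mainly the order in which abstraction and prediction happen. The following is a minimal sketch of that ordering only, not the authors' implementation: every function name, the toy action labels, the threshold rule, the transition table, and the linear extrapolation are hypothetical stand-ins for the recognition, anticipation, and forecasting models described in the abstract.

```python
from typing import Callable, Dict, List, Sequence

Frame = List[float]        # flattened joint coordinates for one time step
Window = Sequence[Frame]   # a sequence of skeleton frames


def top_down(window: Window,
             recognize: Callable[[Window], str],
             anticipate: Callable[[str], str]) -> str:
    """Recognize the current action first, then anticipate the next action."""
    current_action = recognize(window)
    return anticipate(current_action)


def bottom_up(window: Window,
              forecast: Callable[[Window], Window],
              infer: Callable[[Window], str]) -> str:
    """Forecast future motion first, then infer the action from that motion."""
    future_motion = forecast(window)
    return infer(future_motion)


# --- Hypothetical toy components, for illustration only ---

def toy_recognize(window: Window) -> str:
    # Toy rule: label the window by the mean of the first coordinate.
    mean_value = sum(frame[0] for frame in window) / len(window)
    return "reach" if mean_value > 0.5 else "idle"


# Stand-in for the anticipation model (a classical ML model or an LLM in the paper).
TOY_TRANSITIONS: Dict[str, str] = {"idle": "reach", "reach": "grasp", "grasp": "idle"}


def toy_anticipate(action: str) -> str:
    return TOY_TRANSITIONS[action]


def toy_forecast(window: Window, horizon: int = 3) -> Window:
    # Stand-in for a motion forecaster: linear extrapolation from the last two frames.
    last, prev = window[-1], window[-2]
    step = [a - b for a, b in zip(last, prev)]
    return [[x + (k + 1) * d for x, d in zip(last, step)] for k in range(horizon)]


observed = [[0.1], [0.2], [0.3], [0.4]]
print("top-down:", top_down(observed, toy_recognize, toy_anticipate))
print("bottom-up:", bottom_up(observed, toy_forecast, toy_recognize))
```

In this sketch the top-down path reasons over a discrete action label, whereas the bottom-up path must first produce a full motion sequence, so any forecasting error carries over into the inferred action.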