Action Recognition From Variable Viewpoints

Towards a safer living environment for elderly

Abstract

The Dutch Central Bureau of Statistics expects the elderly population to grow from 2.4 million in 2012 to 4.7 million in 2041, putting intense pressure on healthcare budgets. As this population continues to age, that pressure will only increase in the near future. The current focus is therefore on prevention: reducing incidents and allowing people to live safely in their own homes for longer, which greatly reduces healthcare costs. One approach to this goal is the development of an autonomous service robot, capable of assisting elderly persons in their daily activities and of recognizing dangerous actions or situations. Although humans are seemingly capable of effortless action recognition, artificial systems employing human action recognition algorithms still struggle to do so. This thesis proposes a new approach: Kinect RGB-D skeletal data is captured in a spatio-temporal pattern that makes use of 3D motion history. Using a dimensionality reduction technique called orthogonal class learning, a novel representation is generated, called the motion history spatio-temporal pattern (MH-STP). This preserves the spatial and temporal information of the skeleton and compresses it into a compact feature. Using an action graph model for classification, real-time recognition can be achieved by modeling the probabilities of consecutive observed poses. This thesis aims to provide a novel method for successful, robust and reliable action recognition in an online setting. The system was tested on the Microsoft Research Action 3D dataset as well as on a novel dataset (12 actions, 9 subjects, 3 different angles, 3 instances per angle) recorded specifically for this thesis. Although the Microsoft dataset does not contain high-quality skeleton data, promising initial results were obtained (existing methods 68%-90%, proposed method 83%). More extensive testing of the method's parameters on the novel dataset led to a much better understanding of the MH-STP representation in both offline and online action recognition. This resulted in an excellent offline recognition rate on the novel dataset in the more challenging cross-subject setting (proposed method 95%, existing methods up to 89%). In the online setting, the general approach was also tested using a novel two-stage classifier that adds a secondary classification step on top of the action recognition framework, allowing (re)classification when the action graph result is not confident enough or fails to classify an observed test sequence. A very good recognition rate of 92% was observed in a 50%-training cross-subject test. Although this result was well above expectations, inter-action variance remains a challenging factor, depending on the MH-STP grid size and the number of available samples.
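
To make the pipeline described above more concrete, the sketch below illustrates the general idea in Python: skeleton frames are accumulated into a decaying 3D motion-history grid, the resulting descriptors are reduced and clustered into key poses, and an action is scored by the transition probabilities of consecutive poses. This is a minimal illustration only, not the thesis implementation: all names (build_motion_history_grid, clip_to_pose_sequence, score_action_graph) are hypothetical, PCA and k-means are stand-ins for the thesis's orthogonal class learning, and the bigram-style scoring is a simplified approximation of the action graph model.

```python
# Illustrative sketch only; names and parameters are assumptions, and PCA/KMeans
# stand in for the orthogonal class learning used in the thesis.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans


def build_motion_history_grid(frames, grid_size=8, decay=0.9):
    """Accumulate 3D joint positions over time into a decaying occupancy grid.

    frames: sequence of (n_joints, 3) arrays of normalised joint coordinates in [0, 1).
    Returns a flattened grid_size**3 descriptor (an MH-STP-style feature).
    """
    grid = np.zeros((grid_size, grid_size, grid_size))
    for joints in frames:
        grid *= decay                                   # older motion fades out
        idx = np.clip((joints * grid_size).astype(int), 0, grid_size - 1)
        grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0     # stamp the current pose
    return grid.ravel()


def fit_pose_model(train_descriptors, n_dims=32, n_poses=20):
    """Reduce descriptors (PCA as a stand-in) and cluster them into key poses."""
    pca = PCA(n_components=n_dims).fit(train_descriptors)
    kmeans = KMeans(n_clusters=n_poses, n_init=10).fit(pca.transform(train_descriptors))
    return pca, kmeans


def clip_to_pose_sequence(frames, pca, kmeans, window=10, grid_size=8):
    """Slide a window over a clip and map each window's descriptor to its key pose."""
    descriptors = [build_motion_history_grid(frames[t:t + window], grid_size)
                   for t in range(0, len(frames) - window + 1)]
    return kmeans.predict(pca.transform(np.asarray(descriptors)))


def score_action_graph(pose_sequence, transition_probs):
    """Log-probability of a pose sequence under one action's transition matrix."""
    logp = 0.0
    for prev, cur in zip(pose_sequence[:-1], pose_sequence[1:]):
        logp += np.log(transition_probs[prev, cur] + 1e-8)
    return logp
```

In this simplified picture, an unseen clip would be classified by computing its pose sequence and picking the action whose transition matrix yields the highest score; the online two-stage classifier described in the abstract would then re-classify sequences for which this score is not confident enough.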