Towards Real-Time Human Pose Estimation For Mobile Device

Abstract

Human pose estimation, the challenging computer vision task of locating human body joints, has a wide range of applications, including pedestrian tracking for autonomous cars, baby monitoring, video surveillance, human action recognition, virtual reality, gaming, and gait analysis. Most research on pose estimation models has focused on improving accuracy, which also increases model complexity. The resulting models demand devices with high computational power for real-world deployment. Although much of this research estimates human pose from monocular camera images, the complexity of the models makes them impractical to deploy on edge and embedded devices such as mobile phones, even though these devices have built-in cameras. This narrows the range of applications in which human pose estimation can be used. To address this issue, this research focuses on improving the performance of a baseline human pose estimation architecture by reducing the model size (number of parameters), and thereby its inference time, without a significant loss in accuracy. To that end, a structured Bayesian compression algorithm is used, and the network is compressed by engineering the model according to the uncertainty of its parameters. The results show that the Bayesian compression method reduces the model size by around 65 percent with only a small drop in model accuracy. In addition, a comparison of the original baseline and the compressed model on an Android device shows that inference time is reduced by around 50 percent, owing to the smaller number of operations in the compressed model architecture.
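
To illustrate the idea of compressing a network according to parameter uncertainty, the following is a minimal Python sketch, not the method used in this research. It assumes each output unit of a layer has a learned posterior mean and variance, and it drops units whose noise-to-signal ratio (log of variance over squared mean) exceeds a threshold, a criterion common in variational-dropout-style pruning. The function names, the toy values, and the threshold of 0.0 are all hypothetical.

```python
import math

def log_alpha(mean, variance, eps=1e-8):
    """Log noise-to-signal ratio, log(variance / mean^2), for one unit."""
    return math.log(variance + eps) - math.log(mean * mean + eps)

def keep_mask(means, variances, threshold=0.0):
    """True for units that are certain enough to keep (hypothetical threshold)."""
    return [log_alpha(m, v) < threshold for m, v in zip(means, variances)]

def prune_rows(weights, mask):
    """Structured pruning: drop whole output units (rows), not single weights."""
    return [row for row, keep in zip(weights, mask) if keep]

def count_params(weights):
    """Total number of weights in a layer stored as a list of rows."""
    return sum(len(row) for row in weights)

# Toy layer: 4 output units, 3 inputs each (illustrative values only).
weights = [
    [0.8, -0.5, 0.3],
    [0.01, 0.02, -0.01],
    [-1.1, 0.9, 0.4],
    [0.05, -0.03, 0.02],
]
# Assumed per-unit posterior statistics (not taken from the source).
unit_means = [0.9, 0.01, -1.2, 0.05]
unit_variances = [0.01, 0.5, 0.02, 0.3]

mask = keep_mask(unit_means, unit_variances)
pruned = prune_rows(weights, mask)
reduction = 1.0 - count_params(pruned) / count_params(weights)
print(mask, count_params(weights), count_params(pruned), reduction)
```

Because whole units are removed, the pruned layer is a smaller dense layer rather than a sparse one, so both the parameter count and the number of operations at inference time go down. In a real network, the input dimension of the following layer would also have to shrink to match.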