3D Human Pose Estimation

Using a Top-View Depth Camera

Master Thesis (2020)

Author(s)

P.P. Mody (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

K. Hildebrandt – Mentor (TU Delft - Computer Graphics and Visualisation)

Fei Zuo – Mentor (Philips Research)

Esther van der Heide – Mentor (Philips Research)

Hayley Hung – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

E. Eisemann – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty

Electrical Engineering, Mathematics and Computer Science

Copyright

Deep Learning, Computer Vision, Convolutional Neural Network, 3D Human Pose Estimation

To reference this document use:

https://resolver.tudelft.nl/uuid:c04a6da7-a2c0-4b53-b2e0-2bd42151d4cc

More Info

Publication Year

2020

Language

English

Graduation Date

19-05-2020

Awarding Institution

Delft University of Technology

Programme

Computer Science

Abstract

The onset of delirium, a disturbance in a patient's mental activity, can potentially be detected by understanding the activities that take place within an Intensive Care Unit (ICU) room. Such activities can be extracted by estimating human pose from a visual capture of the scene. This work uses a top-view depth camera in an ICU room to estimate the pose of the non-patient stakeholders. The top view leads to self-occlusion of body joints and thus makes estimating the complete human pose challenging. In addition, the presence of multiple persons in the room poses a second challenge, as detected body joints need to be parsed into individual poses. To address these challenges, a 3D point cloud is extracted from the top-view depth image and passed through a 3D Convolutional Neural Network (CNN). This baseline method estimates both body joints and body parts and eventually outputs the human pose of multiple persons. Because human pose estimation has a highly structured output, the baseline method can benefit from additional spatial context to improve the quality of the output poses. The proposed techniques either increase the receptive field, perform feature extraction at multiple scales, or change the order of data processing. An increase in F1-score for the proposed methods highlights the importance of additional spatial context as a crucial tool for improving the performance of pose estimation models.
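The step from a top-view depth image to an input for a 3D CNN can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes a pinhole camera model, and the intrinsics (`fx`, `fy`, `cx`, `cy`), the grid `origin`, `voxel_size`, and `shape`, as well as both function names, are hypothetical values chosen for illustration. The abstract does not state the camera parameters or the voxel grid actually used.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_point_cloud(depth: Sequence[Sequence[float]],
                         fx: float, fy: float,
                         cx: float, cy: float) -> List[Point]:
    """Back-project a depth image into camera-frame 3D points.

    Assumes a pinhole model; depth[v][u] is the distance along the
    optical axis, and non-positive values mark missing measurements.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels without a valid depth reading
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def voxelize(points: Sequence[Point],
             origin: Point,
             voxel_size: float,
             shape: Tuple[int, int, int]) -> List[List[List[int]]]:
    """Build a binary occupancy grid indexed as grid[ix][iy][iz].

    Points that fall outside the grid are dropped. The occupancy grid
    is an assumed input encoding, one common choice for 3D CNNs.
    """
    nx, ny, nz = shape
    grid = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for point in points:
        idx = [math.floor((c - o) / voxel_size)
               for c, o in zip(point, origin)]
        if all(0 <= i < n for i, n in zip(idx, shape)):
            grid[idx[0]][idx[1]][idx[2]] = 1
    return grid
```

In such a pipeline, the occupancy grid would be what the 3D CNN consumes. The sketch uses only the standard library to stay self-contained; an array library would be the natural choice for real image sizes.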

Files

PrerakMody_MSc_Thesis_3D_Human... (pdf)

(pdf | 17 MB)

- Embargo expired on 19-05-2022

License info not available