Enriching Diversity of Synthetic Images for Person Detection

Master Thesis (2022)
Author(s)

C. POLYA RAMESH (TU Delft - Mechanical Engineering)

Contributor(s)

Holger Caesar – Graduation committee member (TU Delft - Intelligent Vehicles)

Lu Zhang – Mentor (Koninklijke Philips N.V.)

J.F.P. Kooij – Coach (TU Delft - Intelligent Vehicles)

Faculty
Mechanical Engineering
Copyright
© 2022 CHINMAY POLYA RAMESH
Publication Year
2022
Language
English
Graduation Date
20-10-2022
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering | Vehicle Engineering | Cognitive Robotics
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Camera-based patient monitoring is being rapidly adopted in the healthcare sector, with the recent COVID-19 pandemic acting as a catalyst. It offers round-the-clock monitoring of patients through installed cameras, either in clinical units (e.g. ICUs, ORs) or at their homes. These systems are powered by Computer Vision-based algorithms that pick up critical physiological data, patient activity, sleep patterns, etc., enabling real-time, pre-emptive care. In this work, we develop a person detector to deploy in such scenarios. These algorithms require huge quantities of training data, which are often scarce in the healthcare field due to stringent privacy norms. Therefore, finding ways to enrich clinical data becomes necessary. An alternative currently popular in the Computer Vision community is to train on synthetic data created using 3D modeling software pipelines. However, such pipelines are often limited in data diversity and data balance, because every desired variation needs to be provided explicitly. In this thesis, we propose a data augmentation method for enriching diversity in synthetic data without using any additional external data or software. In particular, we introduce a pose augmentation technique, which synthesizes new human characters in poses unseen in the original dataset using Pose-Warp GAN. Additionally, a new metric is proposed to assess diversity in human pose datasets. The proposed augmentation method is evaluated using YOLOv3. We show that our pose augmentation technique significantly improves person detection performance compared to traditional data augmentation, especially in low-data regimes.
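
The abstract does not define the proposed diversity metric, so the following is only an illustrative sketch of how diversity in a set of 2D human poses could be quantified, not the metric used in the thesis. It assumes each pose is a list of (x, y) joint coordinates with a fixed joint order, removes translation and scale, and averages the pairwise joint distances. All function names here are hypothetical.

```python
import math
from itertools import combinations


def normalize_pose(keypoints):
    """Center a pose on its centroid and scale it to unit RMS radius."""
    n = len(keypoints)
    cx = sum(x for x, _ in keypoints) / n
    cy = sum(y for _, y in keypoints) / n
    centered = [(x - cx, y - cy) for x, y in keypoints]
    scale = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    if scale == 0:
        # Degenerate pose (all joints coincide): nothing to rescale.
        return centered
    return [(x / scale, y / scale) for x, y in centered]


def pose_distance(a, b):
    """Mean Euclidean distance between corresponding joints of two poses."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def pose_diversity(poses):
    """Mean pairwise distance between normalized poses (assumed measure).

    Returns 0.0 when fewer than two poses are given.
    """
    normalized = [normalize_pose(p) for p in poses]
    pairs = list(combinations(normalized, 2))
    if not pairs:
        return 0.0
    return sum(pose_distance(a, b) for a, b in pairs) / len(pairs)
```

Under this assumed measure, a dataset containing only shifted or rescaled copies of one pose scores close to zero, while adding a genuinely different pose raises the score. That is the kind of behaviour one would want when comparing a dataset before and after pose augmentation.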
