Extrinsic Camera Calibration using Human-pose Estimations and Automatic Re-identification

Master Thesis (2022)
Author(s)

W.J. Tempelaar (TU Delft - Mechanical Engineering)

Contributor(s)

Julian Kooij – Mentor (TU Delft - Intelligent Vehicles)

Marco Hennipman – Mentor (Siemens Mobility)

Faculty
Mechanical Engineering
Copyright
© 2022 Willem Jan Tempelaar
Publication Year
2022
Language
English
Graduation Date
29-08-2022
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering
Sponsors
Siemens Mobility
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Multi-pedestrian tracking in camera networks has gained enormous interest in the industry because of its applicability in travel-flow analysis, autonomous driving, and surveillance. Essential to tracking in camera networks is camera calibration and, in particular, extrinsic camera calibration. Extrinsic camera calibration concerns the 3D position and orientation of the cameras in the camera network. Current methods for extrinsic calibration require trained operators to place calibration objects in view of all cameras in the network, which is impractical and hinders tracking in camera networks.
In this thesis, an automatic calibration model is proposed that estimates the extrinsic camera parameters from the image data of all cameras in a network. The proposed method uses deep learning algorithms for feature extraction and matching. The feature-extraction step is a human-pose estimator that extracts human key points such as joints, eyes, and feet. The matching step is a person re-identification (re-ID) algorithm that uses an affinity-based feature extractor.
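The matching step can be sketched as follows. This is a minimal illustration, assuming re-ID embeddings are compared by cosine similarity and matched greedily with an illustrative threshold; the thesis's exact matching procedure may differ.

```python
import numpy as np

def match_identities(emb_a, emb_b, threshold=0.7):
    """Greedily match re-ID embeddings between two camera views.

    emb_a: (N, D) embeddings of detections in camera A
    emb_b: (M, D) embeddings of detections in camera B
    Returns (i, j) index pairs whose cosine similarity exceeds the
    threshold (an illustrative value, not taken from the thesis).
    """
    # Normalise rows so a plain dot product equals cosine similarity.
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T

    pairs, used_b = [], set()
    # Visit rows in order of their best similarity, most confident first.
    for i in np.argsort(-sim.max(axis=1)):
        j = int(np.argmax(sim[i]))
        if sim[i, j] >= threshold and j not in used_b:
            pairs.append((int(i), j))
            used_b.add(j)
    return pairs
```

Matched identity pairs like these provide the cross-view correspondences from which relative camera poses can then be estimated.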
Two camera-network datasets have been used to evaluate the proposed model: a four-camera, fully overlapping dataset (SALSA) and a more challenging seven-camera, partially overlapping dataset (WildTrack). The calibration accuracy on both datasets is obtained by comparing the calibrated extrinsic camera parameters with the ground-truth values. On WildTrack, the model could calibrate three of the seven cameras, whereas it achieved a near-perfect calibration on SALSA: a root-mean-squared error of 0.02 meters in translation and 0.0 radians in orientation, compared to the ground-truth values.
The final product of this research is an automated extrinsic camera calibration model that eases camera calibration in camera networks. The proposed model could not calibrate all cameras in both datasets, but it provides a baseline upon which future research can build.
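The reported error metric can be computed along the following lines, assuming each camera's extrinsics are given as a rotation matrix R and a translation vector t; this is a sketch of one common convention, not necessarily the thesis's exact definition.

```python
import numpy as np

def extrinsic_errors(R_est, t_est, R_gt, t_gt):
    """Translation RMSE (same units as t) and orientation error in
    radians between estimated and ground-truth camera extrinsics."""
    t_est, t_gt = np.asarray(t_est, float), np.asarray(t_gt, float)
    t_rmse = np.sqrt(np.mean((t_est - t_gt) ** 2))

    # The relative rotation R_est^T @ R_gt maps the estimated frame onto
    # the ground-truth frame; its rotation angle is the orientation error.
    R_rel = np.asarray(R_est).T @ np.asarray(R_gt)
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    angle = float(np.arccos(cos_angle))
    return t_rmse, angle
```

With a perfectly estimated orientation and a uniform 0.02 m offset per axis, this metric would report 0.02 m translation RMSE and 0.0 rad orientation error.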
