Automatic Camera Pose Estimation by Key-Point Matching of Reference Objects


In this paper, we aim to design an automatic camera pose estimation pipeline for clinical spaces such as catheterization laboratories. Our proposed pipeline exploits Scaled-YOLOv4 to detect fixed objects in the scene. We adopt the self-supervised key-point detector SuperPoint in combination with SuperGlue, a key-point matching technique based on graph neural networks, to match key-points in input images with annotated reference points. The reference points are chosen on fixed objects in the scene, such as the corners of door posts or windows. The resulting correspondences between 2D image coordinates and 3D scene coordinates are passed to the Perspective-n-Point algorithm to estimate the pose of each camera. Unlike other camera pose estimation methods, the proposed pipeline requires neither the construction of a 3D point-cloud model of the scene nor the placement of a polyhedral calibration object in the scene before each calibration. Using videos from real procedures, we show that the pipeline estimates the camera pose with high accuracy.