Evaluating 6D pose estimation accuracy with synthetic data

A comparative analysis of RGB-based and RGB-Depth-based images in a chess piece picking task with a robotic arm

Abstract

This study investigates how synthetic training data affects the accuracy of 6D pose estimation from RGB images compared with RGB-Depth-based methods. It also examines how this performance varies across different types of small chess pieces during a picking task with a robotic arm. The methodology involves 3D scanning the chess pieces and generating a dataset of 61,910 synthetic images with diverse domain randomization. Using this synthetic dataset, six 6D pose estimation models were trained, one for each type of chess piece. The models were validated on a real-world 6D pose estimation dataset, and the results were compared with those obtained on a synthetic dataset. The results show that synthetic data can be used to bridge the visual simulation-to-reality gap. However, performance on synthetic data is superior to performance on real-world data, which implies that results obtained in a synthetic environment cannot be directly projected onto real-world scenarios. In addition, accuracy in both object detection and 6D pose estimation decreased noticeably with increasing camera distance for real-world images, primarily because the chess pieces appear smaller in the images. Notably, every chess piece showed improvement after depth refinement, with accuracy closely linked to the performance of the depth camera. A picking experiment was also conducted: models relying solely on RGB data achieved a successful grasp rate of approximately 52%, while RGB-Depth-based methods reached around 70%. These findings show that chess pieces can be picked successfully both with and without depth refinement, underscoring the feasibility of bridging the visual simulation-to-reality gap.
Additionally, the study suggests several avenues for future research, including comparing synthetic data with real-world data, further exploring the training process, and introducing domain randomization in the picking experiment, such as background changes or different distractors. Furthermore, it suggests investigating diverse approaches to improve depth accuracy.
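
The abstract links the drop in accuracy at larger camera distances to the reduced size of the chess pieces in the image. A minimal sketch of that relationship, assuming an ideal pinhole camera, is given below. The focal length (600 px) and piece height (0.05 m) are hypothetical illustration values, not measurements from the study, and the function name is likewise an assumption.

```python
def projected_height_px(object_height_m: float,
                        distance_m: float,
                        focal_length_px: float) -> float:
    """Approximate image height (in pixels) of an object under a pinhole model.

    Assumes the object is roughly fronto-parallel to the image plane, so
    height_px = focal_length_px * object_height_m / distance_m.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return focal_length_px * object_height_m / distance_m


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    focal_length_px = 600.0
    piece_height_m = 0.05

    for distance_m in (0.5, 1.0, 1.5):
        height = projected_height_px(piece_height_m, distance_m, focal_length_px)
        print(f"distance {distance_m:.1f} m -> about {height:.1f} px tall")
```

Under this model, doubling the camera distance halves the projected height of a piece, so the detector and pose estimator have correspondingly fewer pixels to work with.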