Deep Learning Based Image Segmentation of RGB-D Data in Warehouse Automation

Abstract

The ability to locate specific objects within images is an essential step in many computer vision based engineering applications. Image segmentation is the task of dividing an image into "segments" that are uniform and homogeneous with respect to some characteristic, for example grey tone or texture, as in Haralick et al. This thesis performs image segmentation using a Deep Learning (DL) approach in the area of warehouse automation, focusing specifically on an order picking use case of Vanderlande Industries (VI). In the literature, DL algorithms for image segmentation are generally split into two main classes: algorithms for RGB images and algorithms for RGB-D images. RGB stands for the Red, Green, and Blue values of a pixel; RGB-D adds a Depth value to these. The depth value differs from the RGB values in that it does not give a colour intensity, but rather the physical distance between the camera and the object being captured. This thesis addresses whether the introduction of depth data results in substantially better performance than the use of RGB-only images, based on a dataset provided by VI. It also investigates the maximum allowed deviation along the X-axis in the registration of the depth data to the RGB images. Two networks from the literature were investigated and implemented in MATLAB for this purpose: the SegNet architecture proposed by Badrinarayanan et al. and the FuseNet architecture proposed by Hazirbas et al. Through experiments we found that, for this use case, the introduction of complementary depth data leads to an improvement over the use of RGB-only images. We also found that, for this use case, the maximum allowed deviation along the X-axis in the registration of the depth data to the RGB images is approximately 1.67 millimetres.
The results in this thesis seem to indicate that investing in acquiring an additional depth band does have a positive effect on the accuracy of image segmentation for order picking in warehouse automation.
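
To make the RGB-D representation and the notion of a registration deviation along the X-axis concrete, the following is a minimal Python sketch and not the MATLAB implementation used in the thesis. The function names `stack_rgbd` and `shift_depth_x`, the use of metres for depth, and the fill value `0.0` for pixels without a depth reading are all assumptions made for illustration. The shift is expressed in whole pixels; converting a deviation in millimetres to pixels depends on the camera setup, which the abstract does not specify.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBD = Tuple[int, int, int, float]


def stack_rgbd(rgb: List[List[RGB]], depth: List[List[float]]) -> List[List[RGBD]]:
    """Attach a depth value to every RGB pixel, giving a four-channel image."""
    if len(rgb) != len(depth) or any(len(r) != len(d) for r, d in zip(rgb, depth)):
        raise ValueError("RGB image and depth map must have the same size")
    return [
        [(*pixel, d) for pixel, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb, depth)
    ]


def shift_depth_x(depth: List[List[float]], shift_px: int, fill: float = 0.0) -> List[List[float]]:
    """Shift a depth map horizontally by shift_px pixels (positive = right).

    Pixels that receive no value after the shift are set to `fill`
    (an assumed marker for "no depth reading").
    """
    width = len(depth[0])
    shifted = []
    for row in depth:
        new_row = [fill] * width
        for x, value in enumerate(row):
            target = x + shift_px
            if 0 <= target < width:
                new_row[target] = value
        shifted.append(new_row)
    return shifted


# Hypothetical 2x3 image: colour values in 0..255, depth assumed to be in metres.
rgb = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(10, 10, 10), (20, 20, 20), (30, 30, 30)],
]
depth = [
    [0.5, 0.6, 0.7],
    [0.8, 0.9, 1.0],
]

aligned = stack_rgbd(rgb, depth)                       # correctly registered RGB-D
misaligned = stack_rgbd(rgb, shift_depth_x(depth, 1))  # depth shifted one pixel in X
```

In the misaligned image each colour pixel is paired with the depth of its left-hand neighbour. A segmentation network that consumes both modalities would see this kind of mismatch when the depth data is registered to the RGB image with a deviation along the X-axis.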