Deep Learning Based Image Segmentation of RGB-D Data in Warehouse Automation

Master Thesis (2019)
Author(s)

I. El Doori (TU Delft - Mechanical Engineering)

Contributor(s)

Raf Van de Plas – Mentor (TU Delft - Team Raf Van de Plas)

J. Kober – Graduation committee member (TU Delft - Learning & Autonomous Control)

Tope Agbana – Graduation committee member (TU Delft - Team Raf Van de Plas)

Faculty
Mechanical Engineering
Copyright
© 2019 Isa El Doori
More Info
Publication Year
2019
Language
English
Graduation Date
08-07-2019
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering | Systems and Control
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The ability to locate specific objects within images is an essential step in many computer-vision-based engineering applications. Image segmentation is the task of dividing an image into "segments" that are uniform and homogeneous with respect to some characteristic, for example grey tone or texture, as in Haralick et al. This thesis seeks to perform image segmentation using a Deep Learning (DL) approach in the area of warehouse automation, focusing specifically on an order-picking use case of Vanderlande Industries (VI). In the literature, DL algorithms for image segmentation are generally split into two main classes: algorithms for RGB images and algorithms for RGB-D images. RGB stands for the Red, Green, and Blue values of a pixel; RGB-D adds a Depth value to these. Unlike the RGB values, the depth value does not encode a colour intensity but the physical distance between the camera and the object it is capturing. This thesis addresses whether introducing depth data yields substantially better performance than using RGB-only images, based on a dataset provided by VI. It also investigates the maximum allowed deviation along the X-axis in the registration of the depth data to the RGB images. Two networks from the literature were investigated and implemented in MATLAB for this purpose: the SegNet architecture proposed by Badrinarayanan et al. and the FuseNet architecture proposed by Hazirbas et al. The experiments show that, for this use case, introducing complementary depth data improves on the use of RGB-only images, and that the maximum allowed deviation along the X-axis in the registration of the depth data to the RGB images is approximately 1.67 millimetres.
The results of this thesis suggest that investing in an additional depth band has a positive effect on the accuracy of image segmentation for order picking in warehouse automation.
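
To make the RGB versus RGB-D distinction and the idea of a registration deviation along the X-axis concrete, the following is a minimal illustrative sketch in plain Python. It is not the MATLAB implementation used in the thesis. The function names `stack_rgbd` and `shift_depth_x`, the toy image values, the use of metres for depth, and the zero fill value are all assumptions made for illustration. The shift is expressed in whole pixels; converting the reported 1.67 millimetres into pixels would depend on the camera setup, which the abstract does not specify.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBD = Tuple[int, int, int, float]


def stack_rgbd(rgb: List[List[RGB]], depth: List[List[float]]) -> List[List[RGBD]]:
    """Attach a depth value to every RGB pixel, giving (R, G, B, D) tuples."""
    same_rows = len(rgb) == len(depth)
    if not same_rows or any(len(r) != len(d) for r, d in zip(rgb, depth)):
        raise ValueError("RGB image and depth map must have the same shape")
    return [
        [(*pixel, d) for pixel, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb, depth)
    ]


def shift_depth_x(depth: List[List[float]], shift_px: int,
                  fill: float = 0.0) -> List[List[float]]:
    """Shift a depth map by shift_px columns along X (positive = right).

    Columns that move in from outside the image are padded with `fill`
    (an assumed placeholder for "no depth measurement").
    """
    width = len(depth[0])
    shifted = []
    for row in depth:
        if shift_px >= 0:
            k = min(shift_px, width)
            shifted.append([fill] * k + row[: width - k])
        else:
            k = min(-shift_px, width)
            shifted.append(row[k:] + [fill] * k)
    return shifted


# Toy 2x3 image: colour values are intensities, depth values are distances
# (assumed here to be in metres).
rgb = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(10, 10, 10), (20, 20, 20), (30, 30, 30)],
]
depth = [
    [0.50, 0.52, 0.54],
    [0.60, 0.62, 0.64],
]

rgbd = stack_rgbd(rgb, depth)                             # correctly registered
misregistered = stack_rgbd(rgb, shift_depth_x(depth, 1))  # depth off by one pixel
```

In the misregistered image each colour pixel is paired with the depth of its left neighbour. This is the kind of deviation whose tolerable size the thesis investigates.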

Files

License info not available