Depth-aware Instance Segmentation with a Discriminative Loss Function

Master Thesis (2018)

Author(s)

Z. Wang (TU Delft - Mechanical Engineering)

Contributor(s)

Ewoud Pool – Mentor

J.F.P. Kooij – Mentor

D. M. Gavrila – Graduation committee member

Faculty

Mechanical Engineering

Copyright

Deep Learning, Computer Vision, Intelligent Vehicles, Instance segmentation

To reference this document use:

https://resolver.tudelft.nl/uuid:02bd3582-3304-4595-baa6-c6fcca755418

More Info

Publication Year

2018

Language

English

Graduation Date

28-08-2018

Awarding Institution

Delft University of Technology

Programme

Mechanical Engineering | Vehicle Engineering | Cognitive Robotics

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This work explores whether incorporating depth information into a deep neural network can improve the accuracy of RGB instance segmentation. The baseline is semantic instance segmentation with a discriminative loss function. The baseline proposes a novel discriminative loss function with which the semantic network learns an n-D embedding for all pixels belonging to instances. Embeddings of the same instance are attracted to their own center, while the centers of different instances repel each other. Two margins bound the attraction and repulsion, namely the in-margin and the out-margin. A post-processing step (clustering) is required to infer instance indices from the embeddings; its key parameter is the bandwidth, the threshold used for clustering. The contributions of this thesis are several new methods for incorporating depth information into the baseline. One simple method, named scaling, adds scaled depth directly to the RGB embeddings. Through theoretical analysis and experiments, this work also proposes that depth pixels can be encoded into 1-D embeddings with the same discriminative loss function and then combined with the RGB embeddings. The combination methods explored are fusion and concatenation. Additionally, two depth pre-processing methods are proposed: replication and coloring. In the experiments, both scaling and fusion lead to significant improvements over the baseline, while concatenation contributes more for classes that share many similarities.
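The attraction and repulsion terms and the clustering step described in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the thesis: the function names `discriminative_loss` and `cluster`, the default margin values, the squared-hinge form with a factor of 2 on the out-margin (borrowed from the commonly cited formulation of the discriminative loss), and the greedy thresholding used in place of the actual clustering procedure are all assumptions made for illustration.

```python
import numpy as np


def discriminative_loss(emb, labels, in_margin=0.5, out_margin=1.5):
    """Return (pull, push) terms for pixel embeddings.

    emb:    (N, D) array, one D-dimensional embedding per pixel.
    labels: (N,) array of ground-truth instance ids.
    Margin defaults are illustrative assumptions, not values from the thesis.
    """
    ids = np.unique(labels)
    # Center (mean embedding) of every instance, shape (C, D).
    centers = np.stack([emb[labels == i].mean(axis=0) for i in ids])

    # Attraction: pixels farther than in_margin from their own center are penalized.
    pull = 0.0
    for center, i in zip(centers, ids):
        dist = np.linalg.norm(emb[labels == i] - center, axis=1)
        pull += np.mean(np.maximum(0.0, dist - in_margin) ** 2)
    pull /= len(ids)

    # Repulsion: pairs of centers closer than 2 * out_margin are penalized.
    push = 0.0
    if len(ids) > 1:
        diff = centers[:, None, :] - centers[None, :, :]
        center_dist = np.linalg.norm(diff, axis=2)
        hinge = np.maximum(0.0, 2.0 * out_margin - center_dist) ** 2
        off_diagonal = ~np.eye(len(ids), dtype=bool)
        push = hinge[off_diagonal].mean()
    return pull, push


def cluster(emb, bandwidth):
    """Greedy threshold clustering (an assumed stand-in for the real procedure).

    Each still-unlabeled pixel seeds a new instance and claims every other
    unlabeled pixel whose embedding lies within `bandwidth` of it.
    """
    labels = -np.ones(len(emb), dtype=int)
    next_id = 0
    for i in range(len(emb)):
        if labels[i] >= 0:
            continue
        near = (np.linalg.norm(emb - emb[i], axis=1) < bandwidth) & (labels < 0)
        labels[near] = next_id
        next_id += 1
    return labels


# Two tight, well-separated groups of embeddings incur no loss.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
gt = np.array([0, 0, 1, 1])
pull, push = discriminative_loss(emb, gt)
pred = cluster(emb, bandwidth=1.0)
```

In this sketch the margins and the bandwidth interact: once the loss is near zero, embeddings of one instance lie within the in-margin of their center and centers are well separated, so a bandwidth between those two scales can separate the instances.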

Files

Thesis_final_version.pdf

(pdf | 34.3 Mb)

License info not available