EV-Mask-RCNN: Instance Segmentation in Event-based Videos

Abstract

Instance segmentation on data from Dynamic Vision Sensors (DVS) is a computer vision task that must be addressed to advance research on this type of input. This paper aims to show that deep learning techniques can solve instance segmentation on DVS data. To this end, event-based data is transformed into RGB-D images and passed to a high-performing model. The model chosen for this work is Mask R-CNN, altered to handle depth images, because of its high performance on frame-based data. The N-MNIST dataset provides the event-based input, and the transformation of this input is presented in this study. The masks are generated with the help of the MNIST dataset, and heuristics are used to place them at the correct positions. The results are promising and comparable to results reported in the literature for the task of semantic segmentation.
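
To make the idea of turning an event stream into an RGB-D image concrete, the following is a minimal sketch of one possible encoding. It is an illustrative assumption and not necessarily the transformation used in this work: the function name `events_to_rgbd`, the `(x, y, t, p)` column layout, the channel assignment (positive counts, negative counts, an empty third channel, and the latest normalised timestamp as a pseudo-depth channel), and the default 34 x 34 frame size are all hypothetical choices made for the example.

```python
import numpy as np


def events_to_rgbd(events, height=34, width=34):
    """Accumulate events into a 4-channel (RGB-D style) frame.

    Assumed input layout (hypothetical): an (N, 4) array with columns
    x, y, t, p, where p is 1 for positive and 0 for negative polarity.
    """
    events = np.asarray(events, dtype=np.float64)
    if events.shape[0] == 0:
        return np.zeros((height, width, 4), dtype=np.float32)

    x = events[:, 0].astype(np.int64)
    y = events[:, 1].astype(np.int64)
    t = events[:, 2]
    p = events[:, 3].astype(np.int64)

    # Per-pixel event counts, one map per polarity.
    pos = np.zeros((height, width), dtype=np.float32)
    neg = np.zeros((height, width), dtype=np.float32)
    np.add.at(pos, (y[p == 1], x[p == 1]), 1.0)
    np.add.at(neg, (y[p == 0], x[p == 0]), 1.0)

    # Pseudo-depth: latest timestamp seen at each pixel, scaled to [0, 1].
    t_span = t.max() - t.min()
    t_norm = (t - t.min()) / t_span if t_span > 0 else np.ones_like(t)
    depth = np.zeros((height, width), dtype=np.float32)
    np.maximum.at(depth, (y, x), t_norm.astype(np.float32))

    # Scale both count maps by the same peak so polarities stay comparable.
    peak = max(pos.max(), neg.max(), 1.0)
    empty = np.zeros_like(pos)  # third colour channel left unused here
    return np.stack([pos / peak, neg / peak, empty, depth], axis=-1)


if __name__ == "__main__":
    demo = np.array([[0, 0, 0.0, 1], [0, 0, 1.0, 1], [1, 2, 2.0, 0]])
    print(events_to_rgbd(demo).shape)
```

Under these assumptions, the first three channels could be fed to the colour branch of a frame-based model and the fourth to its depth input; other encodings (for example, fixed time windows or signed polarity sums) would be equally plausible.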