Semantic Segmentation using Deep Neural Networks for MAVs

Master Thesis (2022)
Author(s)

T.V. Tran (TU Delft - Aerospace Engineering)

Contributor(s)

Guido C.H.E. de Croon – Mentor (TU Delft - Control & Simulation)

Yingfu Xu – Mentor (TU Delft - Control & Simulation)

Christophe De Wagter – Graduation committee member (TU Delft - Control & Simulation)

Jan van Gemert – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Aerospace Engineering
Copyright
© 2022 Tommy Tran
Publication Year
2022
Language
English
Graduation Date
19-01-2022
Awarding Institution
Delft University of Technology
Programme
Aerospace Engineering | Control & Simulation
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Semantic segmentation methods have been developed for and applied to single images for object segmentation. However, for robotic applications such as high-speed, agile Micro Air Vehicles (MAVs) in Autonomous Drone Racing (ADR), it is worthwhile to also exploit temporal information, because the frames of a video sequence are correlated over time. In this work, we evaluate the performance of state-of-the-art methods for video semantic segmentation, such as Recurrent Neural Networks (RNNs), 3D Convolutional Neural Networks (CNNs), and optical flow, in terms of accuracy and inference speed on three datasets with different camera motion configurations. The results show that an RNN with convolutional operators outperforms all other methods. Compared with the single-frame-based semantic segmentation method, it achieves a performance boost of 10.8% on the KITTI (MOTS) dataset with 3 degrees of freedom (DoF) motion and a small improvement of 0.6% on the CyberZoo dataset with 6 DoF motion. The inference speed was measured on the CyberZoo dataset, reaching 321 fps on an NVIDIA GeForce RTX 2060 GPU and 30 fps on an NVIDIA Jetson TX2 mobile computer.
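
To put the reported inference speeds in per-frame terms, the following minimal Python sketch converts a throughput in fps to a mean per-frame latency and shows one generic way such a throughput could be timed. It is not the thesis's benchmarking code: the names `fps_to_latency_ms` and `measure_fps`, the `warmup` parameter, and the stand-in `infer` callable are all hypothetical and chosen for illustration only.

```python
import time


def fps_to_latency_ms(fps):
    """Convert a throughput in frames per second to mean latency per frame in ms."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def measure_fps(infer, frames, warmup=5):
    """Return the mean throughput (frames per second) of infer over frames.

    `infer` is a hypothetical stand-in for a segmentation model's forward pass.
    A few warm-up calls are made first so one-off start-up costs are not timed.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one frame")
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    # Guard against a zero reading from the clock for extremely fast callables.
    elapsed = max(time.perf_counter() - start, 1e-12)
    return len(frames) / elapsed


if __name__ == "__main__":
    # Throughputs reported in the abstract for the CyberZoo dataset.
    for device, fps in [("RTX 2060", 321), ("Jetson TX2", 30)]:
        print(f"{device}: {fps} fps is about {fps_to_latency_ms(fps):.1f} ms per frame")
```

Under this conversion, 321 fps corresponds to roughly 3.1 ms per frame and 30 fps to roughly 33.3 ms per frame. When timing a model that runs on a GPU, the device would additionally need to be synchronized before the clock is read, since GPU work is typically launched asynchronously.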

Files

Final_Thesis_Tran.pdf
(pdf | 40.4 Mb)
License info not available