Including traffic light recognition in general object detection with YOLOv2


Abstract

An in-vehicle camera can serve many functions that are essential for ADAS or an autonomous driving mode. First, it can be used to detect general objects, for example cars, cyclists, or pedestrians. Second, it can be used for traffic light recognition, i.e. localizing traffic lights and recognizing their state. No existing method performs general object detection and traffic light recognition at the same time; this work therefore proposes methods to combine the two tasks. The novel approach presented here is to include traffic light recognition in a general object detection framework. The single-shot object detector YOLOv2 is used as the base detector, COCO as the general object class dataset, and LISA as the traffic light dataset. Two methods for combined detection are proposed: adaptive combined training and YOLOv2++. For adaptive combined training, YOLOv2 is trained on both datasets with the network unchanged and the loss function adapted to optimize training on both datasets. For YOLOv2++, the feature extractor of YOLOv2 pre-trained on COCO is reused, and a small sub-network is trained on these features to recognize LISA traffic light states. The best performing method is adaptive combined training, which at an IOU of 0.5 reaches an AUC of 24.02% for binary and 21.23% for multi-class classification; at an IOU of 0.1 this increases to 56.74% for binary and 41.87% for multi-class classification. The performance of the adaptive combined detector is 20% lower than the baseline of a detector detecting only LISA traffic light states and 5% lower than the baseline of a detector detecting only COCO classes; however, detecting classes from both datasets is almost twice as fast as separate detection with a different network for each dataset.
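To make the YOLOv2++ idea concrete, the sketch below (not part of the original abstract) shows in PyTorch how a frozen, COCO pre-trained feature extractor could be combined with a small trainable sub-network for LISA traffic light states. The backbone here is a placeholder with random weights, not the actual Darknet-19 layers or trained weights; layer sizes, state count, and module names are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class FrozenYOLOv2Backbone(nn.Module):
    """Hypothetical stand-in for the COCO pre-trained YOLOv2 feature extractor.

    In the actual method the layers and weights would come from the trained
    YOLOv2 model; here a tiny random stack is used so the sketch runs.
    """
    def __init__(self, out_channels=1024):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.1),
            nn.Conv2d(32, out_channels, 3, stride=2, padding=1), nn.LeakyReLU(0.1),
        )
        for p in self.parameters():
            p.requires_grad = False  # feature extractor stays frozen

    def forward(self, x):
        return self.features(x)

class TrafficLightHead(nn.Module):
    """Small sub-network trained only on LISA traffic light states.

    Only a per-cell state score map is sketched; the full detector would also
    predict boxes and objectness in the usual YOLOv2 output layout.
    """
    def __init__(self, in_channels=1024, num_states=3):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.LeakyReLU(0.1),
            nn.Conv2d(256, num_states, 1),
        )

    def forward(self, feats):
        return self.head(feats)

backbone = FrozenYOLOv2Backbone()
head = TrafficLightHead()
x = torch.randn(1, 3, 416, 416)        # standard YOLOv2 input resolution
with torch.no_grad():
    feats = backbone(x)                # shared features from the frozen backbone
scores = head(feats)                   # only the head's parameters would be trained
print(scores.shape)
```

The design point this illustrates is that general object detection and traffic light state recognition share one feature extraction pass, which is why combined detection is reported as almost twice as fast as running two separate networks.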