Including traffic light recognition in general object detection with YOLOv2


Abstract

An in-vehicle camera can serve many functions that are essential for ADAS or an autonomous driving mode. First, it can be used to detect general objects, for example cars, cyclists, or pedestrians. Second, it can be used for traffic light recognition, i.e. localizing traffic lights and recognizing their state. No existing method performs general object detection and traffic light recognition at the same time; this work therefore proposes methods to combine the two tasks. The novel approach presented here is to include traffic light recognition in a general object detection framework. The single-shot object detector YOLOv2 is used as the base detector, COCO as the general object class dataset, and LISA as the traffic light dataset. Two methods for combined detection are proposed: adaptive combined training and YOLOv2++. For adaptive combined training, YOLOv2 is trained on both datasets with the network unchanged and the loss function adapted to optimize training on both datasets. For YOLOv2++, the feature extractor of YOLOv2 pre-trained on COCO is reused, and a small sub-network is trained on these features to recognize LISA traffic light states. The best performing method is adaptive combined training, which at an IOU of 0.5 reaches an AUC of 24.02% for binary and 21.23% for multi-class classification; at an IOU of 0.1 this increases to 56.74% for binary and 41.87% for multi-class classification. The performance of the adaptive combined detector is 20% lower than the baseline of a detector detecting only LISA traffic light states and 5% lower than the baseline of a detector detecting only COCO classes; however, detecting classes from both datasets is almost twice as fast as separate detection with a different network for each dataset.
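To make the YOLOv2++ idea concrete, the sketch below (not part of the original abstract) shows in PyTorch how a frozen, COCO pre-trained feature extractor could be combined with a small trainable sub-network for LISA traffic light states. The backbone here is a placeholder with random weights, not the actual Darknet-19 layers or trained weights; layer sizes, state count, and module names are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class FrozenYOLOv2Backbone(nn.Module):
    """Hypothetical stand-in for the COCO pre-trained YOLOv2 feature extractor.

    In the actual method the layers and weights would come from the trained
    YOLOv2 model; here a tiny random stack is used so the sketch runs.
    """
    def __init__(self, out_channels=1024):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.1),
            nn.Conv2d(32, out_channels, 3, stride=2, padding=1), nn.LeakyReLU(0.1),
        )
        for p in self.parameters():
            p.requires_grad = False  # feature extractor stays frozen

    def forward(self, x):
        return self.features(x)

class TrafficLightHead(nn.Module):
    """Small sub-network trained only on LISA traffic light states.

    Only a per-cell state score map is sketched; the full detector would also
    predict boxes and objectness in the usual YOLOv2 output layout.
    """
    def __init__(self, in_channels=1024, num_states=3):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.LeakyReLU(0.1),
            nn.Conv2d(256, num_states, 1),
        )

    def forward(self, feats):
        return self.head(feats)

backbone = FrozenYOLOv2Backbone()
head = TrafficLightHead()
x = torch.randn(1, 3, 416, 416)        # standard YOLOv2 input resolution
with torch.no_grad():
    feats = backbone(x)                # shared features from the frozen backbone
scores = head(feats)                   # only the head's parameters would be trained
print(scores.shape)
```

The design point this illustrates is that general object detection and traffic light state recognition share one feature extraction pass, which is why combined detection is reported as almost twice as fast as running two separate networks.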