Multi-Sensor Human Detection on an Intelligent Security Robot

Abstract

This thesis combines visual data with distance measurements from a Laser Range Finder (LRF) to detect the presence of humans. The LRF data is used to find regions of interest, reducing the computational load of the visual analysis. Deep learning with convolutional neural networks has achieved strong results on visual recognition tasks in recent years, and this framework is used to recognise humans in the visual data. A laser–camera calibration, performed with the aid of checkerboard patterns, translates the laser-based distance measurements into pixel coordinates in the visual data, so that the LRF data can be used for region proposals. Region-proposal experiments are carried out both by clustering the LRF data (passive detection) and by deep learning (active detection). Convolutional Neural Network (CNN) experiments compare different solver types (AdaGrad/AdaDelta), data types (RGB/HSV/Grayscale) and network shapes (ConvPool/ConvConv), and serve to show the feasibility of deep learning for the human detection task. These experiments are extended with three datasets from a new location, which demonstrate the strength of the learned network as well as the feasibility of fine-tuning a pre-trained network to a new location with new data.
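The laser–camera calibration described above yields a rigid transform from the laser frame to the camera frame, which can then be combined with the camera intrinsics to map each range reading to a pixel coordinate. A minimal sketch of that projection step is shown below; the intrinsic matrix `K`, rotation `R` and translation `t` are made-up placeholder values, not the calibration results from the thesis.

```python
import numpy as np

# Hypothetical calibration outputs (the thesis estimates these from
# checkerboard patterns; the numbers here are illustrative only).
K = np.array([[525.0,   0.0, 320.0],   # pinhole camera intrinsics
              [  0.0, 525.0, 240.0],
              [  0.0,   0.0,   1.0]])
# Rotation mapping the laser frame (x forward, y left, z up) to the
# camera frame (z forward, x right, y down).
R = np.array([[0.0, -1.0,  0.0],
              [0.0,  0.0, -1.0],
              [1.0,  0.0,  0.0]])
t = np.array([0.0, 0.10, 0.0])         # laser-to-camera offset in metres

def lrf_to_pixels(ranges, angles):
    """Project planar LRF readings (range, bearing) to pixel coordinates."""
    # Laser points in the laser frame: the scan plane has z = 0.
    pts_laser = np.stack([ranges * np.cos(angles),
                          ranges * np.sin(angles),
                          np.zeros_like(ranges)], axis=1)
    # Rigid transform into the camera frame, then the pinhole projection.
    pts_cam = pts_laser @ R.T + t
    uv = pts_cam @ K.T
    return uv[:, :2] / uv[:, 2:3]      # divide by depth to get (u, v)
```

A reading straight ahead of the laser then lands near the horizontal centre of the image, which is the behaviour the region-proposal stage relies on.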
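For the passive-detection path, the clustering of LRF data into candidate regions can be illustrated with a simple jump-distance segmentation: consecutive range readings are grouped until the range jumps by more than a threshold, suggesting an object boundary. This is a generic sketch of the idea, not necessarily the exact clustering method used in the thesis.

```python
def cluster_scan(ranges, jump=0.3):
    """Split an LRF scan into segments of consecutive beam indices.

    A new segment starts whenever adjacent readings differ by more than
    `jump` metres (a common jump-distance heuristic; the threshold value
    here is an assumption).
    """
    clusters, current = [], [0]
    for i in range(1, len(ranges)):
        if abs(ranges[i] - ranges[i - 1]) > jump:
            clusters.append(current)   # close the current segment
            current = []
        current.append(i)
    clusters.append(current)
    return clusters
```

Each returned segment can then be projected into the image (via the laser–camera calibration) to crop a region of interest for the CNN, so the network only has to classify a handful of candidate windows instead of the full frame.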