Deep end-to-end 3D person detection from Camera and Lidar

Conference Paper (2019)

Author(s)

M. Roth (TU Delft - Intelligent Vehicles, Daimler AG)

Dominik Jargot (Student TU Delft)

Dariu Gavrila (TU Delft - Intelligent Vehicles)

Research Group

Intelligent Vehicles

DOI related publication

https://doi.org/10.1109/ITSC.2019.8917366

To reference this document use:

https://resolver.tudelft.nl/uuid:83f7a017-a713-4009-9505-f758f58c07e1

Publication Year

2019

Language

English

Abstract

We present a method for 3D person detection from camera images and lidar point clouds in automotive scenes. The method comprises a deep neural network that estimates the 3D location and extent of persons present in the scene. 3D anchor proposals are refined in two stages: a region proposal network and a subsequent detection network. For both input modalities, high-level feature representations are learned from raw sensor data instead of being manually designed. To that end, we use Voxel Feature Encoders [1] to obtain point cloud features instead of the widely used projection-based point cloud representations, thus allowing the network to learn to predict the location and extent of persons in an end-to-end manner. Experiments on the validation set of the KITTI 3D object detection benchmark [2] show that the proposed method outperforms state-of-the-art methods with an average precision (AP) of 47.06% on moderate difficulty.
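As a rough illustration of the point cloud input described above, the following sketch groups raw lidar points into voxels and augments each point with its offset from the voxel centroid, which is the kind of per-voxel input a Voxel Feature Encoder [1] consumes. It is not the authors' implementation: the function names `voxelize` and `augment`, the voxel size, and the per-voxel point limit are assumptions chosen for illustration, and points beyond the limit are simply dropped rather than randomly sampled.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size=(0.2, 0.2, 0.4), max_points_per_voxel=35):
    """Group (x, y, z, reflectance) points by the voxel they fall into.

    voxel_size and max_points_per_voxel are illustrative assumptions,
    not values taken from the paper.
    """
    voxels = defaultdict(list)
    for x, y, z, r in points:
        # Integer grid index of the voxel containing this point.
        key = (
            floor(x / voxel_size[0]),
            floor(y / voxel_size[1]),
            floor(z / voxel_size[2]),
        )
        # Simplification: keep the first points seen, drop the rest.
        if len(voxels[key]) < max_points_per_voxel:
            voxels[key].append((x, y, z, r))
    return dict(voxels)


def augment(voxel_points):
    """Append each point's offset from the voxel centroid to its features."""
    n = len(voxel_points)
    cx = sum(p[0] for p in voxel_points) / n
    cy = sum(p[1] for p in voxel_points) / n
    cz = sum(p[2] for p in voxel_points) / n
    # Each point becomes a 7-tuple: raw features plus centroid offsets.
    return [(x, y, z, r, x - cx, y - cy, z - cz) for x, y, z, r in voxel_points]
```

In a full network, the augmented per-point features of each voxel would be passed through learned layers and pooled into one feature vector per voxel; that learned part is omitted here.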

Files

Roth2019itsc_lidar_person_dete... (pdf)

(pdf | 3.05 Mb)

License info not available