Low Complexity Feature Point Detection and Tracking using CUDA

Abstract

High-speed feature point detection and tracking is a demanding requirement of many real-time computer vision applications. In existing work, commonly used feature point detection algorithms such as Harris and KLT (Kanade-Lucas-Tomasi) and the Pyramidal-KLT feature tracking algorithm were redesigned to improve performance by reducing algorithmic complexity, resulting in the Low Complexity Corner detector (LOCOCO) and the Robust Low Complexity Feature Tracking (RLCT) algorithm. To attain further speedup, this report proposes implementing these low-complexity detection and tracking algorithms on the massively parallel architecture of modern graphics processing units (GPUs) using the Compute Unified Device Architecture (CUDA). In the computing domain, semiconductor scaling limits and the associated power and thermal challenges, combined with the difficulty of exploiting greater levels of instruction-level parallelism, are driving a paradigm shift from single-core to many-core processors and massively multi-processing platforms. High performance is now available on single-chip commodity GPUs. Moreover, GPUs are no longer limited to graphics applications but are emerging as usable general-purpose computing devices. Advances in such platforms allow many computationally intensive problems that were previously solvable only on supercomputing systems to be computed on desktop systems at a reduced price and with lower power requirements. The arrival of this new generation of low-cost, high-performance computing platforms presents numerous opportunities and challenges. In this report, we present the use of such high-performance many-core GPU platforms to obtain speedup by mapping general-purpose computations to massively parallel architectures. We observe that, when properly executed, GPU adaptation of algorithms can result in significant savings in computation time.
For an image size of 960x960 pixels, the low-complexity corner detector and the robust low-complexity feature tracking algorithm are 16 and 25 times faster, respectively, on a GeForce GTX 280 GPU than the corresponding implementations on a 2.66 GHz Intel Core 2 Duo CPU with 2 GB of RAM.
