High Performance Histograms on SIMT and SIMD Architectures

Abstract

Using the histogram procedure as a case study, this work examines the factors that determine performance when computing in parallel on SIMD and SIMT devices. Modern graphics processing units (GPUs) support SIMT, in which multiple threads execute the same instruction, whereas central processing units (CPUs) use SIMD, in which one instruction operates on multiple operands. As part of this work, a cross-technology framework is developed that allows a single-source histogram implementation to be tested on multiple devices, providing insight into the performance of various API-hardware configurations. It is shown that under high contention, the implementation of atomic operations has a strong influence on performance. This work provides guidelines for choosing between devices based on image features and hardware specifications.
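To make the contention effect concrete, the following is a minimal CPU-side sketch in C++. It is not the framework developed in this work, and every name in it (`kBins`, `run_chunked`, `histogram_shared_atomics`, `histogram_privatized`) is hypothetical. It assumes 8-bit pixel values and 256 bins. The first function lets all threads update one shared histogram with atomic increments, so an image dominated by a single value makes every thread compete for the same bin. The second uses per-thread private histograms merged at the end, a common mitigation that is assumed here for comparison and is not taken from the abstract.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

constexpr std::size_t kBins = 256;  // assumption: 8-bit pixel values
using Histogram = std::array<std::uint64_t, kBins>;

// Split [0, n) into contiguous chunks and run worker(t, begin, end) on each thread.
template <typename Worker>
void run_chunked(std::size_t n, unsigned num_threads, Worker worker) {
    assert(num_threads > 0);
    const std::size_t chunk = (n + num_threads - 1) / num_threads;
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        threads.emplace_back(worker, t, begin, end);
    }
    for (auto& th : threads) th.join();
}

// Strategy 1: one shared histogram, updated with atomic increments.
// Pixels with equal values make threads contend for the same bin.
Histogram histogram_shared_atomics(const std::vector<std::uint8_t>& pixels,
                                   unsigned num_threads) {
    std::array<std::atomic<std::uint64_t>, kBins> bins;
    for (auto& b : bins) b.store(0, std::memory_order_relaxed);

    run_chunked(pixels.size(), num_threads,
                [&](unsigned, std::size_t begin, std::size_t end) {
                    for (std::size_t i = begin; i < end; ++i)
                        bins[pixels[i]].fetch_add(1, std::memory_order_relaxed);
                });

    Histogram result{};
    for (std::size_t k = 0; k < kBins; ++k)
        result[k] = bins[k].load(std::memory_order_relaxed);
    return result;
}

// Strategy 2 (assumed mitigation): each thread fills a private histogram,
// and the partial results are summed once all threads have finished.
Histogram histogram_privatized(const std::vector<std::uint8_t>& pixels,
                               unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<Histogram> partial(num_threads, Histogram{});

    run_chunked(pixels.size(), num_threads,
                [&](unsigned t, std::size_t begin, std::size_t end) {
                    Histogram local{};
                    for (std::size_t i = begin; i < end; ++i) ++local[pixels[i]];
                    partial[t] = local;  // each thread writes only its own slot
                });

    Histogram result{};
    for (const auto& p : partial)
        for (std::size_t k = 0; k < kBins; ++k) result[k] += p[k];
    return result;
}
```

Both functions return the same counts. They differ only in how often threads touch shared memory, which is the kind of difference that image content (few distinct values versus many) can amplify. Actual timings depend on the device and on how its atomic operations are implemented, so none are claimed here.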

Files