15 records found
Evaluating vector data type usage in OpenCL kernels
Balancing CPU-GPU collaborative high-order CFD simulations on the Tianhe-1A supercomputer
Test-driving Intel Xeon Phi
Performance traps in OpenCL for CPUs
Sesame: A user-transparent optimizing framework for many-core processors
Quantifying the performance impacts of using local memory for many-core processors
An application-centric evaluation of OpenCL on multi-core CPUs
ELMO: A user-friendly API to enable local memory in OpenCL kernels
Parallelizing a high-order CFD software for 3D, multi-block, structural grids on the TianHe-1A supercomputer
Parallelizing a high-order CFD software for 3D, multi-block, structural grids on the TianHe-1A supercomputer
Performance Gaps between OpenMP and OpenCL for Multi-core CPUs
Accelerating cost aggregation for real-time stereo matching
OpenCL vs. OpenMP: A Programmability Debate
Source-to-Source Vectorization for OpenCL Kernels
Maximize Performance on GPUs Using the Rake-based Optimization: A Case Study