15 records found
Evaluating vector data type usage in OpenCL kernels
Test-driving Intel Xeon Phi
Balancing CPU-GPU collaborative high-order CFD simulations on the Tianhe-1A supercomputer
Quantifying the performance impacts of using local memory for many-core processors
Sesame: A user-transparent optimizing framework for many-core processors
Parallelizing a high-order CFD software for 3D, multi-block, structural grids on the TianHe-1A supercomputer
An application-centric evaluation of OpenCL on multi-core CPUs
Parallelizing a high-order CFD software for 3D, multi-block, structural grids on the TianHe-1A supercomputer
Performance traps in OpenCL for CPUs
ELMO: A user-friendly API to enable local memory in OpenCL kernels
OpenCL vs. OpenMP: A Programmability Debate
Source-to-Source Vectorization for OpenCL Kernels
Accelerating cost aggregation for real-time stereo matching
Performance Gaps between OpenMP and OpenCL for Multi-core CPUs
Maximize Performance on GPUs Using the Rake-based Optimization: A Case Study