Scalable parallel programming applied to H.264/AVC decoding
A predictor-based power-saving policy for DRAM memories
Nexus: hardware support for task-based programming
An Instruction to Accelerate Software Caches
Composable Local Memory Organisation for Streaming Applications on Embedded MPSoCs
Protective redundancy overhead reduction using instruction vulnerability factor
Evaluation of parallel H.264 decoding strategies for the Cell Broadband Engine
Extending the Cell SPE with energy efficient branch prediction
Instruction precomputation with memoization for fault detection
A multidimensional software cache for scratchpad-based systems
The SARC architecture
A case for hardware task management support for the StarSS programming
Performance evaluation of macroblock-level parallelization of H.264 decoding on a cc-NUMA multiprocessor architecture
SIMD architectural enhancements to improve the performance of the 2D discrete wavelet transform
Scalar processing overhead on SIMD-only architectures
Performance improvement of multimedia kernels by alleviating overhead instructions on SIMD devices
Intra-vector SIMD instructions for core specialization
Parallel H.264 decoding on an embedded multicore processor
Specialization of the Cell SPE for media applications
Scalability of macroblock-level parallelism for H.264 decoding