112 records found
A predictor-based power-saving policy for DRAM memories
Scalable parallel programming applied to H.264/AVC decoding
Nexus: hardware support for task-based programming
Composable Local Memory Organisation for Streaming Applications on Embedded MPSoCs
An Instruction to Accelerate Software Caches
Instruction precomputation with memoization for fault detection
Extending the Cell SPE with energy efficient branch prediction
A multidimensional software cache for scratchpad-based systems
A case for hardware task management support for the StarSS programming model
Evaluation of parallel H.264 decoding strategies for the Cell Broadband Engine
The SARC architecture
Protective redundancy overhead reduction using instruction vulnerability factor
Specialization of the Cell SPE for media applications
SIMD architectural enhancements to improve the performance of the 2D discrete wavelet transform
Intra-vector SIMD instructions for core specialization
Performance evaluation of macroblock-level parallelization of H.264 decoding on a cc-NUMA multiprocessor architecture
Scalar processing overhead on SIMD-only architectures
Parallel H.264 decoding on an embedded multicore processor
Performance improvement of multimedia kernels by alleviating overhead instructions on SIMD devices
An efficient software cache for H.264 motion compensation