112 records found
Scalable parallel programming applied to H.264/AVC decoding
A predictor-based power-saving policy for DRAM memories
Nexus: hardware support for task-based programming
An Instruction to Accelerate Software Caches
Composable Local Memory Organisation for Streaming Applications on Embedded MPSoCs
Instruction precomputation with memoization for fault detection
Evaluation of parallel H.264 decoding strategies for the Cell Broadband Engine
The SARC architecture
Protective redundancy overhead reduction using instruction vulnerability factor
A case for hardware task management support for the StarSS programming model
A multidimensional software cache for scratchpad-based systems
Extending the Cell SPE with energy efficient branch prediction
Energy efficient branch prediction on the Cell SPE
Limiting the number of dirty cache lines
SIMD architectural enhancements to improve the performance of the 2D discrete wavelet transform
Scalability of macroblock-level parallelism for H.264 decoding
Intra-vector SIMD instructions for core specialization
Parallel H.264 decoding on an embedded multicore processor
Scalar processing overhead on SIMD-only architectures
Performance evaluation of macroblock-level parallelization of H.264 decoding on a cc-NUMA multiprocessor architecture