Improving Near-Memory Processing
Automatic Scratchpad Memory Exploitation via Static Analysis for a Computation-Near-Memory Processor
J.G. van Doorn (TU Delft - Electrical Engineering, Mathematics and Computer Science)
JSSM Wong – Mentor (TU Delft - Computer Engineering)
Taha Shahroodi – Mentor (TU Delft - Computer Engineering)
S.S. Chakraborty – Graduation committee member (TU Delft - Programming Languages)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
The increasing demand for data-intensive applications such as artificial intelligence and big data analytics is hitting the limitations of traditional computing architectures. Near-memory processing architectures, like UPMEM's Data Processing Units (DPUs), offer a promising solution by integrating computation with memory, reducing data movement and energy consumption. However, UPMEM's scratchpad-centric design poses critical programming challenges: efficient memory management requires explicit programmer intervention, which increases program complexity and limits portability.
This thesis investigates a compiler-driven approach to automatically exploit scratchpad memory on UPMEM's DPUs, aiming to simplify programming and achieve performance comparable to hand-optimized code. A novel compilation pipeline is proposed that analyses loops in DMA-unaware C programs and optimizes them into efficient, DMA-aware machine code. The design leverages static analysis, including alias analysis and symbolic analysis, to insert DMA instructions efficiently.
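As a rough illustration of the kind of loop transformation described above, the following C sketch contrasts a DMA-unaware loop with a hand-written tiled version that stages data through a small scratchpad buffer. It is not the thesis's pipeline or its generated code. The names `dma_read`, `sum_naive`, `sum_tiled`, and `TILE_ELEMS` are hypothetical, and the DMA transfer is simulated with `memcpy` so the example runs on an ordinary host rather than on a DPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tile size (in elements) of the scratchpad buffer. */
#define TILE_ELEMS 64

/* Hypothetical stand-in for a DMA transfer from main memory into the
 * scratchpad. On a host machine it is simply a memcpy. */
static void dma_read(const uint32_t *src, uint32_t *dst, size_t nbytes) {
    /* The transfer must fit in the scratchpad tile. */
    assert(nbytes <= TILE_ELEMS * sizeof(uint32_t));
    memcpy(dst, src, nbytes);
}

/* DMA-unaware input: every access conceptually goes to main memory. */
uint32_t sum_naive(const uint32_t *data, size_t n) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += data[i];
    }
    return sum;
}

/* DMA-aware shape: the loop is split into tiles, each tile is copied
 * into the scratchpad once, and the inner loop reads only the tile. */
uint32_t sum_tiled(const uint32_t *data, size_t n) {
    uint32_t tile[TILE_ELEMS]; /* simulated scratchpad buffer */
    uint32_t sum = 0;
    for (size_t base = 0; base < n; base += TILE_ELEMS) {
        /* The last tile may be shorter than TILE_ELEMS. */
        size_t count = (n - base < TILE_ELEMS) ? (n - base) : TILE_ELEMS;
        dma_read(data + base, tile, count * sizeof(uint32_t));
        for (size_t i = 0; i < count; i++) {
            sum += tile[i];
        }
    }
    return sum;
}
```

Both functions return the same result for any input. The tiled version only makes explicit which accesses could be served from the scratchpad after one bulk transfer per tile. A real DPU implementation would also have to respect the hardware's transfer size and alignment constraints, which this sketch ignores.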
To evaluate the compilation pipeline, this thesis introduces the Processing-in-Memory Compiler Benchmarks, which are based on the Processing-in-Memory Benchmarks proposed by the SAFARI Research Group. Experimental results demonstrate significant improvements, achieving an average of 75% of the runtime of hand-optimized programs and sometimes even exceeding the runtimes of the hand-optimized programs.
By automating scratchpad memory management, programmers can focus on high-level functionality while maintaining system performance and compatibility. Future research directions include extending optimizations beyond loops, improving global memory management, and extending the compiler benchmarks with application-based benchmarks.
Files
File under embargo until 31-12-2025