Improving Near-Memory Processing

Automatic Scratchpad Memory Exploitation via Static Analysis for a Computation-Near-Memory Processor

Master Thesis (2024)
Author(s)

J.G. van Doorn (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

JSSM Wong – Mentor (TU Delft - Computer Engineering)

Taha Shahroodi – Mentor (TU Delft - Computer Engineering)

S.S. Chakraborty – Graduation committee member (TU Delft - Programming Languages)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
06-12-2024
Awarding Institution
Delft University of Technology
Programme
Electrical Engineering | Embedded Systems
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The growing demand for data-intensive applications such as artificial intelligence and big data analytics is exposing the limitations of traditional computing architectures. Near-memory processing architectures, like UPMEM's Data Processing Units (DPUs), offer a promising solution by integrating computation with memory, reducing data movement and energy consumption. However, UPMEM's scratchpad-centric design poses critical programming challenges: efficient memory management requires explicit programmer intervention, which increases program complexity and limits portability.

This thesis investigates a compiler-driven approach to automatically exploit scratchpad memory on UPMEM's DPUs, aiming to simplify programming and achieve performance comparable to hand-optimized code. A novel compilation pipeline is proposed that analyses loops in DMA-unaware C programs and optimizes them into efficient, DMA-aware machine code. The design leverages static analysis, including alias analysis and symbolic analysis, to insert DMA instructions efficiently.
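To make the idea of turning a DMA-unaware loop into a DMA-aware one concrete, the following is a minimal sketch in plain C, not the thesis's actual pipeline or output. It simulates the large main memory with an ordinary array and a DMA transfer with `memcpy`; the names `main_mem`, `dma_read`, `TILE_ELEMS`, `sum_naive`, and `sum_tiled` are hypothetical and chosen only for illustration. The first function accesses main memory directly, as a DMA-unaware program would; the second stages fixed-size tiles into a small scratchpad buffer before computing on them, which is the general shape of the transformation described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAIN_ELEMS 1024 /* size of the simulated main memory (assumption) */
#define TILE_ELEMS 64   /* elements staged per simulated DMA transfer (assumption) */

/* Stand-in for the large, slow memory next to the processor. */
static int32_t main_mem[MAIN_ELEMS];

/* Stand-in for a DMA transfer into the scratchpad. On real hardware this
 * would be an SDK-provided transfer; here it is only a memcpy. */
void dma_read(const int32_t *src, int32_t *dst, size_t count) {
    memcpy(dst, src, count * sizeof *dst);
}

/* Fill the simulated main memory with 0, 1, 2, ... */
void fill_main_mem(void) {
    for (size_t i = 0; i < MAIN_ELEMS; i++) {
        main_mem[i] = (int32_t)i;
    }
}

/* DMA-unaware form: the loop indexes main memory directly. */
int64_t sum_naive(size_t n) {
    assert(n <= MAIN_ELEMS);
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += main_mem[i];
    }
    return acc;
}

/* DMA-aware form: copy one tile at a time into a scratchpad buffer,
 * then run the original loop body on the local copy. */
int64_t sum_tiled(size_t n) {
    assert(n <= MAIN_ELEMS);
    int32_t scratch[TILE_ELEMS]; /* simulated scratchpad buffer */
    int64_t acc = 0;
    for (size_t base = 0; base < n; base += TILE_ELEMS) {
        /* The last tile may be shorter than TILE_ELEMS. */
        size_t count = (n - base < TILE_ELEMS) ? (n - base) : TILE_ELEMS;
        dma_read(&main_mem[base], scratch, count);
        for (size_t j = 0; j < count; j++) {
            acc += scratch[j];
        }
    }
    return acc;
}
```

Both functions compute the same sum; only the memory access pattern differs. In a real compiler, deciding that such a rewrite is safe is where analyses like alias analysis and symbolic analysis of the loop bounds would come in; this sketch simply assumes the array is read-only inside the loop.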

To evaluate the compilation pipeline, this thesis introduces the Processing-in-Memory Compiler Benchmarks, which are based on the Processing-in-Memory Benchmarks from the SAFARI Research Group. Experimental results demonstrate significant improvements: the automatically optimized programs achieve an average of 75% of the runtime of hand-optimized programs and sometimes even exceed the runtimes of the hand-optimized programs.

By automating scratchpad memory management, programmers can focus on high-level functionality while maintaining system performance and compatibility. Future research directions include extending optimizations beyond loops, improving global memory management, and extending the compiler benchmarks with application-based benchmarks.

Files

License info not available

File under embargo until 31-12-2025