Hardware-Aware Quantization for Accurate Memristor-Based Neural Networks

Conference Paper (2025)

Author(s)

S.S. Diware (TU Delft - Computer Engineering)

Mohammad Amin Yaldagard (TU Delft - Computer Engineering)

R.K. Bishnoi (TU Delft - Computer Engineering)

Research Group

Computer Engineering

DOI related publication

https://doi.org/10.1145/3676536.3698023

RRAM, Quantization, Memristors, Computation-In-Memory (CIM), Conductance variation, Deep neural networks (DNNs), Fixed-point arithmetic, Non-ideality, Processing-In-Memory (PIM)

To reference this document use:

https://resolver.tudelft.nl/uuid:295d0f77-06de-4cdf-800a-0cf9c3cff095

Publication Year

2025

Language

English

ISBN (electronic)

979-8-4007-1077-3

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Memristor-based Computation-In-Memory (CIM) has emerged as a compelling paradigm for designing energy-efficient neural network hardware. However, memristors suffer from conductance variation, which introduces computational errors in CIM hardware and degrades inference accuracy. In this paper, we present a hardware-aware quantization methodology that mitigates the impact of conductance variation on CIM-based neural networks by exploiting the inherent characteristics of fixed-point arithmetic in CIM hardware. By tuning the bit-precision of weights, we align the conductance variation-induced errors with the lower-order output bits, which reduces their numerical impact on the fixed-point output. We further decrease the residual errors by selectively discarding bits that carry little information and high error. This leads to error-free computations and high inference accuracy. Our proposed methodology achieves 5.6× as many correct operations per unit energy as the conventional approach, while incurring very low hardware overheads.
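
The fixed-point intuition in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the bit-slicing scheme (2 bits per memristor cell), the single-level error model, and all function and variable names are assumptions made for illustration only. It shows that a one-level conductance error in a low-order weight slice perturbs the fixed-point output far less than the same error in a high-order slice, and that discarding low-order output bits can remove the small error entirely, provided the error does not carry across the cut.

```python
def split_into_slices(weight, bits_per_cell, n_cells):
    """Split an unsigned integer weight into per-cell slices, least significant first."""
    mask = (1 << bits_per_cell) - 1
    return [(weight >> (i * bits_per_cell)) & mask for i in range(n_cells)]


def dot_from_slices(sliced_weights, inputs, bits_per_cell):
    """Compute a dot product by shift-and-add over per-slice partial sums."""
    n_cells = len(sliced_weights[0])
    total = 0
    for i in range(n_cells):
        partial = sum(s[i] * x for s, x in zip(sliced_weights, inputs))
        total += partial << (i * bits_per_cell)
    return total


def drop_low_bits(value, n_bits):
    """Discard the n_bits least significant bits of a fixed-point output."""
    return value >> n_bits


BITS_PER_CELL = 2          # assumed cell resolution (hypothetical)
N_CELLS = 2                # 4-bit weights stored across two cells
weights = [9, 6, 13]       # illustrative 4-bit weights
inputs = [1, 2, 3]         # illustrative integer inputs

slices = [split_into_slices(w, BITS_PER_CELL, N_CELLS) for w in weights]
ideal = dot_from_slices(slices, inputs, BITS_PER_CELL)

# Model conductance variation as a +1 level shift in one cell of the first weight.
low_err_slices = [list(s) for s in slices]
low_err_slices[0][0] += 1   # error in the low-order slice
high_err_slices = [list(s) for s in slices]
high_err_slices[0][1] += 1  # same error in the high-order slice

low_err = dot_from_slices(low_err_slices, inputs, BITS_PER_CELL)
high_err = dot_from_slices(high_err_slices, inputs, BITS_PER_CELL)

print("ideal output:          ", ideal)
print("low-order slice error: ", low_err - ideal)
print("high-order slice error:", high_err - ideal)
print("after dropping 2 bits: ", drop_low_bits(ideal, 2), drop_low_bits(low_err, 2))
```

In this example the low-order error changes the output by 1 while the high-order error changes it by 4, and dropping the two lowest output bits makes the low-order-error result match the ideal one. In general, truncation hides an error only when it stays below the discarded bit boundary, which is why the abstract pairs bit discarding with precision tuning.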

Files

3676536.3698023.pdf

(pdf | 1.74 MB)