Hardware-Aware Quantization for Accurate Memristor-Based Neural Networks

Conference Paper (2025)
Author(s)

S.S. Diware (TU Delft - Computer Engineering)

Mohammad Amin Yaldagard (TU Delft - Computer Engineering)

R.K. Bishnoi (TU Delft - Computer Engineering)

Research Group
Computer Engineering
DOI related publication
https://doi.org/10.1145/3676536.3698023
Publication Year
2025
Language
English
ISBN (electronic)
979-8-4007-1077-3
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Memristor-based Computation-In-Memory (CIM) has emerged as a compelling paradigm for designing energy-efficient neural network hardware. However, memristors suffer from conductance variation, which introduces computational errors in CIM hardware and degrades inference accuracy. In this paper, we present a hardware-aware quantization method that mitigates the impact of conductance variation on CIM-based neural networks. The method exploits the inherent characteristics of fixed-point arithmetic in CIM hardware. By tuning the bit-precision of weights, we align the conductance variation-induced errors with lower-order output bits, which reduces their numerical impact on the fixed-point output. We further decrease the residual errors by selectively discarding bits with low information and high error. This leads to error-free computations and high inference accuracy. Our proposed methodology achieves 5.6× as many correct operations per unit energy as the conventional approach, while incurring very low hardware overheads.
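
The intuition behind discarding low-order output bits can be illustrated with a toy simulation. The sketch below is not the authors' method and does not model their weight bit-precision tuning. It assumes a hypothetical setup: 4-bit unsigned inputs and weights, Gaussian noise added to each weight as a stand-in for conductance variation, and an ADC modeled as rounding to the nearest integer. The names `noisy_dot` and `match_rate` and all parameter values are illustrative assumptions.

```python
import random


def noisy_dot(inputs, weights, sigma, rng):
    """Dot product with each weight perturbed by Gaussian noise (assumed
    stand-in for conductance variation), then rounded to an integer
    (assumed ADC model)."""
    analog = sum(x * (w + rng.gauss(0.0, sigma)) for x, w in zip(inputs, weights))
    return int(round(analog))


def match_rate(dropped_bits, trials=2000, n=16, sigma=0.05, seed=0):
    """Fraction of trials in which the noisy output equals the ideal output
    after discarding `dropped_bits` low-order bits from both."""
    rng = random.Random(seed)  # same seed gives the same trials for every call
    matches = 0
    for _ in range(trials):
        inputs = [rng.randrange(0, 16) for _ in range(n)]   # 4-bit inputs
        weights = [rng.randrange(0, 16) for _ in range(n)]  # 4-bit weights
        ideal = sum(x * w for x, w in zip(inputs, weights))
        noisy = noisy_dot(inputs, weights, sigma, rng)
        # Right shift discards the low-order bits where small errors live.
        if (ideal >> dropped_bits) == (noisy >> dropped_bits):
            matches += 1
    return matches / trials


if __name__ == "__main__":
    for k in (0, 2, 4):
        print(f"dropped bits = {k}: match rate = {match_rate(k):.3f}")
```

Because every call replays the same trials, the match rate cannot decrease as more bits are dropped: outputs that agree at full precision still agree after truncation. Some mismatches remain when an error pushes the output across a truncation boundary, which is one reason a simple shift like this is only a rough analogy for the selective bit discarding described in the abstract.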