Hardware-Aware Quantization for Accurate Memristor-Based Neural Networks

Conference Paper (2025)
Author(s)

Sumit Diware (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Mohammad Amin Yaldagard (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Rajendra Bishnoi (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group
Computer Engineering
DOI related publication
https://doi.org/10.1145/3676536.3698023 (final published version)
Publication Year
2025
Language
English
Article number
24
ISBN (electronic)
979-8-4007-1077-3
Event
43rd IEEE/ACM International Conference on Computer-Aided Design (2024-10-27 - 2024-10-31), New York, United States
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Memristor-based Computation-In-Memory (CIM) has emerged as a compelling paradigm for designing energy-efficient neural network hardware. However, memristors suffer from conductance variation, which introduces computational errors in CIM hardware and degrades inference accuracy. In this paper, we present a hardware-aware quantization methodology that mitigates the impact of conductance variation on CIM-based neural networks by exploiting the inherent characteristics of fixed-point arithmetic in CIM hardware. By tuning the bit-precision of the weights, we align the conductance variation-induced errors with the lower-order output bits, which reduces their numerical impact on the fixed-point output. We further decrease the residual errors by selectively discarding bits that carry little information and high error. This leads to error-free computations and high inference accuracy. Our proposed methodology achieves 5.6× as many correct operations per unit energy as the conventional approach, while incurring very low hardware overhead.
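The intuition that variation-induced errors concentrate in the lower-order bits of a fixed-point output can be illustrated with a small toy simulation. The sketch below is not the authors' method: the additive Gaussian noise model, the 4-bit operand widths, the vector length of 32, and the names `noisy_dot` and `mismatch_rate` are all assumptions chosen for illustration. It compares an ideal integer dot product with a noisy one and counts how often the two disagree after a given number of low-order output bits is discarded.

```python
import random

def noisy_dot(weights, inputs, sigma, rng):
    """Integer dot product in which each stored weight level is perturbed by
    Gaussian noise (an assumed stand-in for conductance variation) and the
    analog sum is digitized by rounding."""
    analog = sum((w + rng.gauss(0.0, sigma)) * x for w, x in zip(weights, inputs))
    return round(analog)

def mismatch_rate(dropped_bits, trials=2000, sigma=0.05, seed=0):
    """Fraction of trials in which the noisy output differs from the ideal
    output after discarding `dropped_bits` low-order bits from both."""
    rng = random.Random(seed)  # same seed, so every call sees the same samples
    mismatches = 0
    for _ in range(trials):
        weights = [rng.randrange(16) for _ in range(32)]  # hypothetical 4-bit weights
        inputs = [rng.randrange(16) for _ in range(32)]   # hypothetical 4-bit inputs
        ideal = sum(w * x for w, x in zip(weights, inputs))
        noisy = noisy_dot(weights, inputs, sigma, rng)
        if (ideal >> dropped_bits) != (noisy >> dropped_bits):
            mismatches += 1
    return mismatches / trials

# Mismatch rate versus number of discarded low-order output bits.
rates = {bits: mismatch_rate(bits) for bits in (0, 2, 4, 6)}
for bits, rate in rates.items():
    print(f"dropped bits = {bits}: mismatch rate = {rate:.3f}")
```

Because every call draws the same samples, the mismatch rate cannot increase as more bits are discarded: two outputs that already agree still agree after truncation. The trade-off, which this toy model does not quantify, is that each discarded bit also removes output precision, which is why the abstract speaks of discarding only bits with low information and high error.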