R.V. Joshi

8 records found

Recent advances in Resistive RAM (RRAM) based Computation-In-Memory (CIM) architectures highlight significant potential for accelerating data-intensive computing tasks. However, non-idealities in RRAM devices, such as variability, result in small sensing margins that can significantly affect the computational efficiency. This issue becomes even more pronounced when dealing with complex multi-operand logic operations. This paper introduces a circuit-level scheme for CIM-based multi-operand XOR logic operations, leveraging a Voltage-To-Time converter (VTC) to perform multi-phased XORs in a single clock cycle. In this approach, we exploit bitline capacitances for voltage-based sensing during computation, generating an output voltage that is linearly proportional to the operand values. This voltage is then converted into the desired logic output using the VTC design. Furthermore, low-power techniques are employed in the deployment of sense amplifiers, such as regulating power consumption during operation and disabling the amplifiers once the decision is made. Simulation results for a post-layout extracted 512x512 (256Kb) RRAM-based CIM array show that up to 16-operand XOR operation can be accurately and reliably performed as opposed to a maximum of three operands supported by state-of-the-art solutions, while offering up to 49× better figure-of-merit combining energy-efficiency and throughput. ...
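As a purely behavioural illustration of the multi-operand XOR described in the abstract above (not the paper's circuit), the following sketch assumes the bitline voltage falls by a fixed step for every operand that stores a 1, recovers the count of ones from that voltage, and takes its parity. `V_PRECHARGE`, `V_STEP`, and all function names are hypothetical values and names chosen for illustration; the real design resolves the voltage with a Voltage-To-Time converter rather than arithmetic.

```python
from functools import reduce
from operator import xor

V_PRECHARGE = 1.0  # hypothetical precharged bitline voltage (V)
V_STEP = 0.05      # hypothetical voltage drop per operand storing a 1 (V)


def bitline_voltage(bits):
    """Idealized bitline voltage: linear in the number of 1-operands."""
    return V_PRECHARGE - V_STEP * sum(bits)


def xor_from_voltage(voltage):
    """Recover the count of ones from the voltage and return its parity."""
    ones = round((V_PRECHARGE - voltage) / V_STEP)
    return ones % 2


def multi_operand_xor(bits):
    """Multi-operand XOR modelled as parity of the sensed count of ones."""
    return xor_from_voltage(bitline_voltage(bits))


# 16 operands, matching the largest operand count mentioned in the abstract
operands = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1]
assert multi_operand_xor(operands) == reduce(xor, operands)
```

The sketch also hints at why sensing margin matters: with 16 operands the sensing circuit must distinguish 17 voltage levels, so device variability comparable to `V_STEP` would flip the result.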
Computation-in-memory (CIM) using memristors can facilitate data processing within the memory itself, leading to higher energy efficiency than conventional von Neumann architectures. This makes CIM well-suited for data-intensive applications like neural networks. However, a large number of read operations can induce an undesired resistance change in the memristor, known as read-disturb. As memristor resistances represent the neural network weights in CIM hardware, read-disturb causes an unintended change in the network’s weights that leads to poor accuracy. In this paper, we propose a methodology for read-disturb detection and mitigation in CIM-based neural networks. We first analyze the key insights regarding the read-disturb phenomenon. We then introduce a mechanism to dynamically detect the occurrence of read-disturb in CIM-based neural networks. In response to such detections, we develop a method that adapts the sensing conditions of CIM hardware to provide error-free operation even in the presence of read-disturb. Simulation results show that our proposed methodology achieves up to 2× accuracy and up to 2× correct operations per unit energy compared to conventional CIM architectures. ...
Conference paper (2024) - Asmae El Arrassi, Mohammad Amin Yaldagard, Xingjian Tao, Taha Shahroodi, Fouwad Mir, Yashvardhan Biyani, Manil Dev Gomony, Anteneh Gebregiorgis, Rajiv Joshi, Said Hamdioui
Binary Neural Networks (BNNs) have demonstrated significant advantages in reducing computation and memory costs, all while maintaining acceptable accuracy on various image detection tasks. Thus, BNNs have the potential to support practical cognitive tasks on resource-constrained platforms, such as edge computing devices. To realize this, SRAM-based digital Computation-in-Memory (CIM) has gained growing attention as it overcomes analog CIM architecture bottlenecks such as limited computing accuracy due to process variation, non-linearity, and power- and area-hungry Analog-to-Digital Converters (ADCs). However, digital CIM architectures are dominated by power-hungry adder trees, which can nullify the benefits of SRAM-based digital CIM. To address this issue, this paper proposes an adder-free SRAM-based digital CIM, AFSRAM-CIM, for BNN acceleration. The proposed CIM architecture utilizes a multi-functional 10-T SRAM cell-based crossbar array and a new energy-efficient approach to perform the popcount operation. Simulation results using the MNIST dataset show that the proposed architecture maintains the state-of-the-art inference accuracy of 99.21% with only 11.86 fJ energy per operation. Moreover, AFSRAM-CIM achieves over 3× energy and ≈17× area savings when compared to conventional digital CIM approaches. ...
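For readers unfamiliar with the popcount operation mentioned in the abstract above, the sketch below shows the standard XNOR-popcount formulation of a binary dot product in software. It only illustrates what the operation computes; it says nothing about how AFSRAM-CIM realizes it without adder trees. The function name `binary_dot` and the bit-packing convention (bit 1 encodes +1, bit 0 encodes -1) are assumptions made for this example.

```python
def binary_dot(activations, weights, n_bits):
    """Dot product of two {-1, +1} vectors packed as n_bits-wide integers.

    Assumed encoding: bit value 1 stands for +1, bit value 0 for -1.
    """
    mask = (1 << n_bits) - 1
    # XNOR marks the positions where activation and weight agree
    xnor = ~(activations ^ weights) & mask
    # popcount: number of agreeing positions
    matches = bin(xnor).count("1")
    # each agreement contributes +1, each disagreement -1
    return 2 * matches - n_bits


# activations (+1, -1, +1, +1) and weights (+1, +1, -1, +1)
result = binary_dot(0b1011, 0b1101, 4)
assert result == 0
```

In a conventional digital CIM macro the popcount step is typically the part implemented by an adder tree, which is the cost the abstract says the proposed design targets.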
Analog computation-in-memory (CIM) architecture alleviates massive data movement between the memory and the processor, thus promising great prospects to accelerate certain computational tasks in an energy-efficient manner. However, data converters involved in these architectures typically achieve the required computing accuracy at the expense of high area and energy footprint which can potentially determine CIM candidacy for low-power and compact edge-AI devices. In this work, we present a memory-periphery co-design to perform accurate A/D conversions of analog matrix-vector-multiplication (MVM) outputs. Here, we introduce a scheme where select-lines and bit-lines in the memory are virtually fixed to improve conversion accuracy and aid a ring-oscillator-based A/D conversion, equipped with component sharing and inter-matching of the reference blocks. In addition, we deploy a self-timed technique to further ensure high robustness addressing global design and cycle-to-cycle variations. Based on measurement results of a 4Kb CIM chip prototype implemented in TSMC 40 nm technology, a relative accuracy of up to 99.71% is achieved with an energy efficiency of 115.1 TOPS/W and computational density of 12.1 TOPS/mm² for the MNIST dataset. This corresponds to an improvement of up to 11.3× and 7.5×, respectively, compared to the state of the art. ...
Resistive random access memory (RRAM) based computation-in-memory (CIM) architectures can meet the unprecedented energy efficiency requirements to execute AI algorithms directly on edge devices. However, the read-disturb problem associated with these architectures can lead to accumulated computational errors. To achieve the necessary level of computational accuracy, these devices must be reprogrammed after a specific number of read cycles, which is a static approach and needs a large counter. This paper proposes a circuit-level RRAM read-disturb detection technique that monitors real-time conductance drifts of RRAM devices and initiates reprogramming only when it is actually needed. Moreover, an analytic method is presented to determine the minimum conductance detection requirements, and our proposed read-disturb detection technique is tuned accordingly to detect read-disturb dynamically. SPICE simulation results using TSMC 40 nm technology show the correct functionality of our proposed detection technique. ...
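The contrast between the static, counter-based policy and the drift-based detection described in the abstract above can be sketched behaviourally as follows. This is a minimal illustration under stated assumptions, not the paper's circuit: the function names, the conductance units, and the threshold `min_detectable_drift` are all hypothetical.

```python
def needs_reprogram_static(read_count, max_reads):
    """Static policy: reprogram after a fixed number of reads,
    whether or not the device has actually drifted."""
    return read_count >= max_reads


def needs_reprogram_dynamic(g_now, g_programmed, min_detectable_drift):
    """Dynamic policy: reprogram only when the monitored conductance
    has drifted from its programmed value by at least the threshold."""
    return abs(g_now - g_programmed) >= min_detectable_drift


# Hypothetical device programmed to 100 (arbitrary conductance units)
# that has drifted only slightly after many reads.
static_decision = needs_reprogram_static(read_count=10_000, max_reads=10_000)
dynamic_decision = needs_reprogram_dynamic(101.0, 100.0, min_detectable_drift=5.0)
assert static_decision is True    # the counter forces a reprogram
assert dynamic_decision is False  # the drift is still below the threshold
```

In this example the static policy spends a reprogramming operation that the drift-based check would skip, which is the inefficiency the abstract attributes to the counter-based approach.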
Memristor-based computation-in-memory (CIM) can achieve high energy efficiency by processing the data within the memory, which makes it well-suited for applications like neural networks. However, memristors suffer from the conductance variation problem, in which their programmed conductance values deviate from the desired values. Such variations lead to computational errors that result in degraded inference accuracy in CIM-based neural networks. In this paper, we present a mapping-aware biased training methodology to mitigate the impact of conductance variation on CIM-based neural networks. We first determine which conductance states of the memristor are inherently more immune to variation. The neural network is then trained under the constraint that important weights can only take numeric values that map directly to such favorable states. Simulation results show that our proposed mapping-aware biased training achieves up to 2.4× hardware accuracy compared to conventional training. ...
Emerging non-volatile resistive RAM (RRAM) device technology has shown great potential to cultivate not only high-density memory storage, but also energy-efficient computing units. However, the unique challenges related to RRAM fabrication process render the traditional memory testing solutions inefficient and inadequate for high product quality. This paper presents low-cost design-for-testability (DFT) solutions that augment the testing process and improve the fault coverage. A computation-in-memory (CIM) based DFT is realized to expedite the detection and diagnosis of faults by developing logic designs involving multi-row activation. A novel addressing scheme is introduced to facilitate the diagnosis of faults. Reconfigurable logic designs are developed to detect unique RRAM faults that offer features such as programmable reference generations, period, and voltage of operation. DFT implementations are validated on a post-layout extracted platform and testing sequences are introduced by incorporating the proposed DFTs. Results show that more than 2.3× speedup and better coverage are achieved with 6× area reduction when compared with state-of-the-art solutions. ...
Conference paper (2022) - Abhairaj Singh, Mahdi Zahedi, Taha Shahroodi, Mohit Gupta, Anteneh Gebregiorgis, Manu Komalan, Rajiv V. Joshi, Francky Catthoor, Rajendra Bishnoi, Said Hamdioui
Spin-transfer torque magnetic random access memory (STT-MRAM) based computation-in-memory (CIM) architectures have shown great prospects for energy-efficient computing. However, device variations and non-idealities narrow the sensing margin, which severely impacts the computing accuracy. In this work, we propose an adaptive referencing mechanism to improve the sensing margin of a CIM architecture for boolean binary logic (BBL) operations. We generate reference signals using multiple STT-MRAM devices and place them strategically into the array such that these signals can address the variations and trace the wire parasitics effectively. We have demonstrated this behavior using an STT-MRAM model, which is calibrated using a characterized 1 Mbit array. Results show that our proposed architecture for binary neural networks (BNN) achieves up to 17.8 TOPS/W on the MNIST dataset and a 130× performance improvement for text encryption compared to a software implementation on an Intel Haswell processor. ...