S.S. Diware
Resistive random-access memory (RRAM)-based computation-in-memory (CIM) architectures offer a promising solution to meet the stringent energy efficiency demands of executing artificial intelligence (AI) algorithms directly on edge devices. However, these architectures suffer from the read-disturb problem, which can lead to accumulated computational errors over time. To maintain the required level of computational accuracy, conventional approaches rely on a static reprogramming process after a predefined number of read cycles, necessitating large counters and resulting in inefficiencies. This paper presents experimental results using real RRAM devices to analyze the read-disturb effect and builds on these insights to propose a circuit-level detection methodology for real-time monitoring of conductance drifts. The proposed method initiates reprogramming only when the device drift exceeds a defined threshold, that is, only when reprogramming is actually needed. Additionally, an analytical method is developed to determine the minimum conductance state ratio needed to meet reliable detection criteria. Based on this foundation, the proposed detection technique is further optimized for dynamic identification of read-disturb effects. Experiment-augmented SPICE simulation results, using a calibrated model implemented in TSMC 40 nm CMOS technology, validate the functionality and effectiveness of the proposed detection approach. These results demonstrate its potential to improve both the reliability and efficiency of RRAM-based CIM architectures, providing up to a 4× improvement in energy efficiency compared to traditional periodic reprogramming methods.
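The contrast between counter-based and drift-triggered reprogramming can be illustrated with a small behavioural sketch. This is not the circuit-level detector described in the abstract: the uniform per-read drift model, the threshold of 0.05, the interval of 100 reads, and both function names are assumptions chosen purely for illustration.

```python
import random


def drift_triggered_reprograms(per_read_drift, threshold):
    """Return the read indices at which accumulated drift crosses the threshold."""
    events = []
    accumulated = 0.0
    for i, drift in enumerate(per_read_drift):
        accumulated += drift
        if accumulated > threshold:
            events.append(i)
            accumulated = 0.0  # reprogramming restores the target conductance
    return events


def periodic_reprograms(num_reads, interval):
    """Return the read indices at which a fixed-interval read counter fires."""
    return list(range(interval - 1, num_reads, interval))


# Hypothetical drift model: every read shifts the conductance by a small random amount.
rng = random.Random(0)
per_read_drift = [rng.uniform(0.0, 2e-4) for _ in range(10_000)]

dynamic = drift_triggered_reprograms(per_read_drift, threshold=0.05)
static = periodic_reprograms(len(per_read_drift), interval=100)
print(len(dynamic), len(static))
```

Under these assumed numbers the drift-triggered policy reprograms less often than the fixed counter, because it fires only once the accumulated drift actually reaches the threshold.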
Approximately one-third of individuals with chronic epilepsy, a condition resulting from uncontrolled brain activity, do not respond to medication. Animal models are widely used to investigate the mechanisms underlying epilepsy, so that better drug treatments can be developed for this disease. In such studies, epileptiform activity, assessed by EEG recordings, can be used as a marker for the development of the disease. However, the analysis of EEG recordings is typically done manually, which is time-consuming, error-prone, subject to observer bias, and inconsistent. In this paper, we develop a novel automated methodology for detecting and classifying epileptiform activity, which is tested using the intrahippocampal kainic acid (IHKA) mouse model, a representation of human temporal lobe epilepsy. To this end, EEG/LFP recordings are acquired from biological experiments using the IHKA mouse model. We use a spike detection method that combines an improved version of the nonlinear energy operator (NEO) with the automatic NEO thresholding (ANT) algorithm. The proposed method is implemented in Python as an automated and time-efficient algorithm; its adaptability to different spike and epileptiform event criteria makes it suitable for use in preclinical and potentially future clinical studies. Using our proposed methodology, we achieve a 93.1% accuracy in detecting epileptiform events and a 95.8% accuracy in classification. Moreover, the time for analysis of EEG recordings was reduced by 98.8% compared to manual analysis. Additionally, to demonstrate the potential of the algorithm for brain–machine interface (BMI) applications, we develop a hardware architecture and implement it using both an application-specific integrated circuit (ASIC) and a field programmable gate array (FPGA). The FPGA shows the feasibility of near real-time implementation, and for our ASIC implementation, we achieve a post-layout area of 9114 µm² with a dynamic power consumption of 16.09 µW using TSMC 40 nm technology.
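As a rough illustration of NEO-based spike detection, the sketch below uses the classical nonlinear energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], with a threshold set to a fixed multiple of the mean NEO output. It is not the improved NEO or the ANT algorithm from the abstract: the scale factor of 8, the synthetic test signal, and the function names are assumptions.

```python
import math


def neo(x):
    """Classical nonlinear energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    psi = [0.0] * len(x)
    for n in range(1, len(x) - 1):
        psi[n] = x[n] * x[n] - x[n - 1] * x[n + 1]
    return psi


def detect_spikes(x, scale=8.0):
    """Flag samples whose NEO output exceeds scale * mean(NEO).

    The fixed scale factor is an assumption; it stands in for an
    automatically derived threshold.
    """
    psi = neo(x)
    threshold = scale * sum(psi) / len(psi)
    return [n for n, value in enumerate(psi) if value > threshold]


# Synthetic trace: low-amplitude sine background with one sharp spike at sample 100.
signal = [0.1 * math.sin(2 * math.pi * n / 50) for n in range(200)]
signal[100] += 2.0
print(detect_spikes(signal))  # prints [100]
```

The operator responds to samples that are both large and fast-changing, which is why the slow sine background stays far below the threshold while the isolated spike stands out.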
Computation-In-Memory based Edge-AI for Healthcare
A Cross-Layer Approach
Computation-in-memory (CIM) overcomes these challenges by in-situ data processing using emerging memory technologies called memristors. Thus, CIM can facilitate energy-efficient and compact edge-AI hardware design. The healthcare domain stands out as a prime target for CIM-based edge-AI hardware for two main reasons. Firstly, it holds significant real-world importance due to its direct impact on human well-being. Secondly, the increasing adoption of AI in healthcare can significantly benefit from efficient hardware for data processing. CIM-based edge hardware can greatly enhance the effectiveness of AI-based healthcare through rapid, reliable, and secure processing of medical data at its source. Hence, the design of CIM-based edge-AI hardware for healthcare applications presents a promising research direction.
The process of designing CIM-based edge-AI hardware for healthcare can be expressed as a stack of six abstraction layers: application, algorithm, optimization, mapping, micro-architecture and circuits, and device. These abstraction layers can be further grouped into two distinct design phases. The first phase is application-dependent, covering the first three abstraction layers (application, algorithm and optimization). It involves creating a customized neural network model for the given healthcare application. The challenge in this phase is to achieve strong algorithmic performance while incorporating features that exploit the full potential of CIM hardware. Conversely, the second phase is application-independent and comprises the remaining abstraction layers (mapping, micro-architecture and circuits, and device). It focuses solely on translating the model computations into CIM hardware operations. However, the non-ideal characteristics of memristor devices introduce computational errors in hardware operations. This undermines the advantages of CIM, as energy-efficient computations are of no use if they are incorrect. Hence, mitigating memristor non-idealities becomes the primary challenge in this phase. Moreover, it is important to integrate the customized model and non-ideality mitigation strategies into a comprehensive hardware solution and realize it through prototyping. This gives rise to the following three research topics: 1) healthcare AI models for CIM-based edge hardware, 2) dealing with memristor non-idealities, and 3) CIM edge-AI prototyping for healthcare.
We adopt a cross-layer approach in this thesis to address these research topics, covering all six layers of the CIM abstraction stack. We begin by creating neural network models for two healthcare applications: cardiac arrhythmia classification and diabetic retinopathy screening. Our contributions in this application-dependent design phase span across the first three abstraction layers (application, algorithm and optimization). At the application layer, we introduce new features in the model tailored to the specific healthcare application. This enhances its real-world impact by addressing the unique medical needs more effectively. Moving to the algorithm layer, we customize the computational flow within the model to exploit the characteristics of the healthcare data. This improves design performance in key aspects like accuracy and energy efficiency. Moreover, we strategically refine the model computations to further maximize post-deployment benefits on CIM hardware. At the optimization layer, we employ techniques like resampling, quantization and pruning to optimize hardware resource requirements, without compromising the model's algorithmic performance.
After creating the neural network models, we proceed to the application-independent design phase. Focusing on RRAM-based memristor devices, we first identify three key non-idealities that significantly impact inference accuracy on CIM hardware. We then devise mitigation strategies against these non-idealities, encompassing the remaining abstraction layers (mapping, micro-architecture and circuits, and device). At the mapping layer, we propose a hardware-aware training methodology to combat the conductance variation non-ideality. Moving to the micro-architecture level, we present two mitigation strategies. The first addresses the non-zero Gmin error non-ideality through a novel approach to CIM micro-architecture design. The second introduces an adaptive micro-architecture that adjusts its sensing conditions to counteract the effects of the read-disturb non-ideality. At the device level, these strategies indirectly contribute by circumventing the necessity for extensive device engineering, ensuring accurate inference even in the presence of non-idealities. Building upon this foundation of model development and non-ideality mitigation, we integrate the optimal ECG classification model with the proposed mitigation strategies to create a CIM edge-AI prototype. Thus, our contributions pave the way towards a future with enhanced effectiveness and efficiency of AI-powered healthcare.
Diabetic retinopathy (DR) is a leading cause of permanent vision loss worldwide. It refers to irreversible retinal damage caused by elevated glucose levels and blood pressure. Regular screening for DR can facilitate its early detection and timely treatment. Neural network-based DR classifiers can be leveraged to achieve such screening in a convenient and automated manner. However, these classifiers suffer from a reliability issue: they exhibit strong performance during development but degraded performance after deployment. Moreover, they do not provide supplementary information about the prediction outcome, which severely limits their widespread adoption. Furthermore, energy-efficient deployment of these classifiers on edge devices, which is crucial to enhance their global accessibility, remains unaddressed. In this paper, we present reliable and energy-efficient hardware for DR detection, suitable for deployment on edge devices. We first develop a DR classification model using custom training data that incorporates diverse image quality and image sources along with improved class balance. This enables our model to effectively handle both on-field variations in retinal images and minority DR classes, enhancing its post-deployment reliability. We then propose a pseudo-binary classification scheme to further improve the model performance and provide supplementary information about the model prediction. Additionally, we present an energy-efficient hardware design for our model using memristor-based computation-in-memory, to facilitate its deployment on edge devices. Our proposed approach achieves reliable DR classification with three orders of magnitude reduction in energy consumption over state-of-the-art hardware platforms.
Memristor-based computation-in-memory (CIM) can achieve high energy efficiency by processing the data within the memory, which makes it well-suited for applications like neural networks. However, memristors suffer from the conductance variation problem, where their programmed conductance values deviate from the desired values. Such variations lead to computational errors that result in degraded inference accuracy in CIM-based neural networks. In this paper, we present a mapping-aware biased training methodology to mitigate the impact of conductance variation on CIM-based neural networks. We first determine which conductance states of the memristor are inherently more immune to variation. The neural network is then trained under the constraint that important weights can only take numeric values that map directly to such favorable states. Simulation results show that our proposed mapping-aware biased training achieves up to 2.4× hardware accuracy compared to conventional training.
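One way to picture the constraint on important weights is as a projection step that a training loop could apply after each update. The sketch below is only an illustration under stated assumptions: ranking importance by weight magnitude, the particular level sets, the `important_fraction` parameter, and the function names are all hypothetical and are not taken from the paper.

```python
def nearest(value, levels):
    """Return the entry of levels closest to value."""
    return min(levels, key=lambda level: abs(level - value))


def project_weights(weights, all_levels, favorable_levels, important_fraction=0.5):
    """Snap important weights to favorable levels and the rest to any level.

    Importance is assumed here to mean largest magnitude; this is an
    illustrative criterion only.
    """
    count = int(len(weights) * important_fraction)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    important = set(order[:count])
    return [
        nearest(w, favorable_levels if i in important else all_levels)
        for i, w in enumerate(weights)
    ]


# Hypothetical level sets: five programmable states, three of them assumed favorable.
all_levels = [-1.0, -0.5, 0.0, 0.5, 1.0]
favorable_levels = [-1.0, 0.0, 1.0]
print(project_weights([0.9, -0.4, 0.3, -0.7], all_levels, favorable_levels))
# prints [1.0, -0.5, 0.5, -1.0]
```

In this toy run the two largest-magnitude weights land only on the assumed favorable states, while the remaining weights are free to use every available level.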
The computation-in-memory (CIM) paradigm leverages emerging memory technologies such as resistive random access memories (RRAMs) to process the data within the memory itself. This alleviates the memory-processor bottleneck, resulting in much higher hardware efficiency compared to conventional von-Neumann architecture-based hardware. Hence, CIM becomes an attractive alternative for applications like neural networks, which require a huge number of data transfer operations in conventional hardware. CIM-based neural networks typically employ a bit-slicing scheme, which represents a single neural weight using multiple RRAM devices (called slices) to meet the high bit-precision demand. However, such neural networks suffer from significant accuracy degradation due to non-zero Gmin error, where a zero weight in the neural network is represented by an RRAM device with a non-zero conductance. This paper proposes an unbalanced bit-slicing scheme to mitigate the impact of non-zero Gmin error. It achieves this by allocating appropriate sensing margins for different slices based on their binary positions. It also tunes the sensing margins to meet the demands of either high accuracy or energy efficiency. The sensing margin allocation is supported by 2's complement arithmetic, which further reduces the influence of non-zero Gmin error. Simulation results show that our proposed scheme achieves up to 7.3× accuracy and up to 7.8× correct operations per unit energy consumption compared to the state-of-the-art.
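The effect of non-zero Gmin on a conventional (balanced) bit-sliced dot product can be shown with a few lines of arithmetic. The sketch below does not implement the proposed unbalanced scheme; it only illustrates the error being mitigated. Conductances are expressed in normalized units of one slice LSB, and the function names, the 2-bit slice width, and the `g_min` value of 0.1 are assumptions.

```python
def slice_weight(weight, bits_per_slice, num_slices):
    """Split an unsigned integer weight into slices, least significant first."""
    mask = (1 << bits_per_slice) - 1
    return [(weight >> (i * bits_per_slice)) & mask for i in range(num_slices)]


def sliced_dot(inputs, weights, bits_per_slice, num_slices, g_min):
    """Dot product rebuilt from per-slice column sums.

    g_min is the conductance, in slice-LSB units, of a device that stores
    zero; the ideal case is g_min = 0.
    """
    sliced = [slice_weight(w, bits_per_slice, num_slices) for w in weights]
    total = 0.0
    for s in range(num_slices):
        # Every device on an active row adds g_min on top of its stored value.
        column = sum(x * (sl[s] + g_min) for x, sl in zip(inputs, sliced))
        total += column * (1 << (s * bits_per_slice))  # shift-and-add across slices
    return total


inputs = [1, 1, 0, 1]
weights = [5, 0, 3, 12]
print(sliced_dot(inputs, weights, 2, 2, g_min=0.0))  # ideal devices
print(sliced_dot(inputs, weights, 2, 2, g_min=0.1))  # non-zero Gmin inflates the sum
```

With ideal devices the result is the exact dot product of 17. With the assumed `g_min`, every active row adds an offset in each slice, and the offset in the more significant slice is amplified by its binary weight during shift-and-add.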
Timely detection of cardiac arrhythmia, characterized by abnormal heartbeats, can help in the early diagnosis and treatment of cardiovascular diseases. Wearable healthcare devices typically use neural networks to provide the most convenient way of continuously monitoring heart activity for arrhythmia detection. However, it is challenging to achieve high accuracy and energy efficiency in these smart wearable healthcare devices. In this work, we provide architecture-level solutions to deploy neural networks for cardiac arrhythmia classification. After analyzing various neural network topologies, we have created a hierarchical structure in which only the required network components are activated, improving energy efficiency while maintaining high accuracy. In our proposed architecture, we introduce a severity-based classification approach to directly help the users of the wearable healthcare device as well as medical professionals. Additionally, we have employed computation-in-memory based hardware to improve energy efficiency and area consumption by leveraging in-situ data processing and the scalability of emerging memory technologies such as resistive random access memory (RRAM). Simulation experiments conducted using the MIT-BIH arrhythmia dataset show that the proposed architecture provides high accuracy while consuming an average energy of 0.11 µJ per heartbeat classification and 0.11 mm² area, thereby achieving a 25× improvement in average energy consumption and a 12× improvement in area compared to the state-of-the-art.
Analog computation-in-memory (CIM) architecture alleviates massive data movement between the memory and the processor, thus promising great prospects to accelerate certain computational tasks in an energy-efficient manner. However, data converters involved in these architectures typically achieve the required computing accuracy at the expense of a high area and energy footprint, which can potentially determine the candidacy of CIM for low-power and compact edge-AI devices. In this work, we present a memory-periphery co-design to perform accurate A/D conversions of analog matrix-vector-multiplication (MVM) outputs. Here, we introduce a scheme where select-lines and bit-lines in the memory are virtually fixed to improve conversion accuracy and aid a ring-oscillator-based A/D conversion, equipped with component sharing and inter-matching of the reference blocks. In addition, we deploy a self-timed technique to further ensure high robustness, addressing global design and cycle-to-cycle variations. Based on measurement results of a 4 Kb CIM chip prototype implemented in TSMC 40 nm technology, a relative accuracy of up to 99.71% is achieved with an energy efficiency of 115.1 TOPS/W and a computational density of 12.1 TOPS/mm² for the MNIST dataset. This corresponds to an improvement of up to 11.3× and 7.5×, respectively, compared to the state-of-the-art.
Deep Learning (DL) has recently led to remarkable advancements; however, it faces severe computation-related challenges. Existing von-Neumann-based solutions suffer from issues such as memory bandwidth limitations and energy inefficiency. Computation-In-Memory (CIM) has the potential to address this problem by integrating processing elements directly into the memory architecture, reducing data movement and enhancing the overall efficiency of the system. In this work, we propose CIM architectures using three distinct emerging technologies. Firstly, a CIM architecture utilizing Ferroelectric Field-Effect Transistors (FeFET) is shown, and the resulting errors from the analog compute scheme are injected into the emerging algorithm of Hyperdimensional Computing. Subsequently, we explore Vertical Nanowire Field-Effect Transistor (VNWFET)-based CIM within a 3D computing architecture, demonstrating improved energy efficiency and reconfigurability for CIM. Additionally, we improve the accuracy of the Resistive Random Access Memory (RRAM)-based CIM architecture using two mapping-based solutions. These three technologies exhibit non-volatile characteristics, and when integrated into the CIM architecture, they yield significant advantages, including enhanced energy efficiency, reliability, and accuracy in computing processes.
With the rise of the Internet of Things (IoT), a huge market for so-called smart edge devices is foreseen for millions of applications, like personalized healthcare and smart robotics. These devices have to bring smart computing directly to where the data is generated, while coping with a limited energy budget. Conventional von-Neumann architectures fail to meet these requirements due to, e.g., the memory-processor data transfer bottleneck. Memristor-based computation-in-memory (CIM) has the potential to realize smart local computing for highly parallel data-dominated AI applications by exploiting the inherent properties of the architecture and the physical characteristics of the memristors. This paper provides a broad overview of CIM architecture, highlighting its potential and unique properties in enabling smart local computing. Moreover, it discusses design considerations of such architectures, including both the crossbar array and the peripheral circuits; special attention is given to the analog-to-digital converter (ADC), as it is the most critical unit of analog-based CIM operations, e.g., vector-matrix multiplication (VMM). Finally, the paper outlines the potential future directions for CIM-based edge smart computing.