S. Hamdioui
272 records found
Spike-based neuromorphic computing
An overview from bio-inspiration to hardware architectures and learning mechanisms
The endeavor to emulate the extraordinary efficiency and adaptability inherent in the human brain via spike-based neuromorphic computing presents significant potential across a diverse array of applications. The attainment of this objective necessitates the translation of biological principles into artificial systems, a task that continues to pose a complex challenge requiring a profound comprehension of the mechanisms by which neural systems produce robust computational outcomes. This tutorial paper provides a comprehensive overview of the foundational concepts and emerging design trends in spike-based neuromorphic computing, covering advances from materials and circuits to hardware architectures and learning mechanisms. It begins with an examination of key aspects of brain biology and their influence on neuromorphic design, followed by a brief discussion of biologically plausible neuron and synapse models. The paper then defines the core principles and defining attributes of neuromorphic computing, highlighting the trade-offs and design choices underlying current implementations. Building on these foundations, it explores the critical properties of neuromorphic systems, surveys a variety of learning algorithms, and reviews hardware-level realizations of bioinspired neurons and synapses. Subsequent sections discuss state-of-the-art spiking neural network architectures, mapping and compilation strategies, and representative application domains. By providing this end-to-end perspective, the article aims to guide the development of future neuromorphic systems that more closely emulate brain efficiency, scalability, and resilience.
PdNeuRAM
Forming-free, multi-bit Pd/HfO2 ReRAM for energy-efficient neuromorphic computing
Theoretically speaking, majority logic, originally proposed in the '70s, enables more compact and efficient arithmetic implementations than its conventional Boolean counterpart. Nonetheless, CMOS-based majority logic realizations remain challenging, as standard transistor-based approaches are unable to directly exhibit majority behavior. However, recent exploration of beyond-CMOS technologies has created a resurgence of interest in majority logic. In this work, we propose and analyze a novel approach towards the 3-input Majority gate (MAJ3) implementation by means of piezoelectric materials. By leveraging their intrinsic electromechanical properties, we convert the digital input signals into mechanical deformations, which are accumulated in a transfer layer. Subsequently, we transform the combined deformation back to the electric domain with a piezoelectronic element properly designed to perform majority functionality. We first present the underlying principles behind our proposal with a short introduction to majority logic, piezoelectronics, and the utilized simulation framework. Afterwards, we introduce the proposed piezoelectric 3-input Majority gate (piezo-MAJ3) and strategies for optimizing its behavior and performance. We also detail the impact of material parameters and structural design on device performance by utilizing both analytical discussion and physics-based simulations. Finally, we briefly highlight how our proposal can be directly integrated into CMOS circuits and compare the piezo-MAJ3 potential cost and performance with those of state-of-the-art implementations. Our results indicate that, compared with its CMOS counterpart, the piezo-MAJ3 gate requires half the area and is 7x faster, while reducing energy consumption by 44%.
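To make the arithmetic argument concrete, the following is a minimal Python sketch of a behavioral 3-input majority function and a full adder built from three majority evaluations plus inversions. It is a purely logical illustration, not a model of the piezoelectric device; the function names `maj3` and `full_adder` are chosen here for illustration and do not come from the paper.

```python
def maj3(a: int, b: int, c: int) -> int:
    """Return 1 when at least two of the three binary inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder expressed with majority gates and inverters.

    The carry is a single majority evaluation; the sum uses the
    well-known identity sum = MAJ(NOT carry, MAJ(a, b, NOT cin), cin).
    """
    carry = maj3(a, b, cin)
    total = maj3(1 - carry, maj3(a, b, 1 - cin), cin)
    return total, carry


if __name__ == "__main__":
    # Exhaustively compare against ordinary integer addition.
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                s, c = full_adder(a, b, cin)
                print(a, b, cin, "->", "sum", s, "carry", c)
```

The carry output needing only one majority gate is the usual reason majority logic is considered attractive for adders.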
Resistive random-access memory (RRAM)-based computation-in-memory (CIM) architectures offer a promising solution to meet the stringent energy efficiency demands of executing artificial intelligence (AI) algorithms directly on edge devices. However, these architectures suffer from the read-disturb problem, which can lead to accumulated computational errors over time. To maintain the required level of computational accuracy, conventional approaches rely on a static reprogramming process after a predefined number of read cycles, necessitating large counters and resulting in inefficiencies. This paper presents experimental results using real RRAM devices to analyze the read-disturb effect and builds on these insights to propose a circuit-level detection methodology for real-time monitoring of conductance drifts. The proposed method initiates reprogramming only when the device drift exceeds a defined threshold and reprogramming is actually needed. Additionally, an analytical method is developed to determine the minimum conductance state ratio needed to meet reliable detection criteria. Based on this foundation, the proposed detection technique is further optimized for dynamic identification of read-disturb effects. Experiment-augmented SPICE simulation results, using a calibrated model implemented in TSMC 40 nm CMOS technology, validate the functionality and effectiveness of the proposed detection approach. These results demonstrate its potential to improve both the reliability and efficiency of RRAM-based CIM architectures that provide up to a 4x improvement in energy-efficiency compared to traditional periodic reprogramming methods.
Vector–matrix multiplication (VMM), implemented through multiply–accumulate (MAC) operations, represents the dominant computational primitive in many artificial intelligence (AI) workloads. When executed on conventional von Neumann architectures, VMM operations suffer from significant energy consumption and latency due to the separation between memory and processing units. To overcome these limitations, crossbar arrays built from Resistive Random Access Memory (RRAM) cells have been proposed for accelerating VMM computations. In this work, we investigate the key optimization trade-offs associated with implementing RRAM-based neural networks for classification applications. A simple two-layer neural network is first defined and trained in software to generate the weight matrices and bias parameters. Next, three hardware implementation scenarios are evaluated depending on whether negative floating-point numbers are used: Positive Weights Only (PWO), Positive and Negative Weights Only (PNWO), and Positive and Negative Weights with Biases (PNWB). The different implementations are analyzed at the hardware level by examining classification accuracy, energy efficiency, latency, and area overhead. The study further incorporates important RRAM limitations, including restricted conductance range and device variability. Hardware results show that the PWO scenario offers the lowest energy consumption (189 fJ/MAC) and area overhead but results in the lowest accuracy. PNWO and PNWB significantly improve accuracy (+177% and +180%) but increase energy consumption (+63% and +87%) and area (×2 and ×2.1). Under variability effects, PWO achieves the best accuracy (94.65%), followed by PNWO (93.11%) and PNWB (92.11%).
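As an illustration of why signed weights cost extra hardware, the sketch below maps each signed weight onto a pair of non-negative conductances and recovers the signed dot product from the difference of two column currents. This differential-pair mapping is one common technique and is assumed here; the abstract does not state which mapping the paper uses. The conductance bounds `G_MIN` and `G_MAX` and the function names are hypothetical.

```python
G_MIN = 1e-6  # assumed lowest programmable conductance in siemens (hypothetical)
G_MAX = 1e-4  # assumed highest programmable conductance in siemens (hypothetical)


def to_conductance_pair(w: float, w_max: float) -> tuple[float, float]:
    """Map a signed weight to (g_pos, g_neg) with g_pos - g_neg proportional to w.

    Weights larger in magnitude than w_max are clipped, mimicking a
    restricted conductance range.
    """
    span = G_MAX - G_MIN
    magnitude = min(abs(w), w_max) / w_max * span
    if w >= 0:
        return G_MIN + magnitude, G_MIN
    return G_MIN, G_MIN + magnitude


def crossbar_vmm(weights: list[list[float]], voltages: list[float], w_max: float) -> list[float]:
    """Compute one output per weight row from differential column currents."""
    span = G_MAX - G_MIN
    outputs = []
    for row in weights:
        pairs = [to_conductance_pair(w, w_max) for w in row]
        i_pos = sum(g_pos * v for (g_pos, _), v in zip(pairs, voltages))
        i_neg = sum(g_neg * v for (_, g_neg), v in zip(pairs, voltages))
        # Subtract the two currents and rescale back to weight units.
        outputs.append((i_pos - i_neg) * w_max / span)
    return outputs


if __name__ == "__main__":
    w = [[0.5, -1.0], [-0.25, 0.75]]
    x = [1.0, 2.0]
    print(crossbar_vmm(w, x, w_max=1.0))  # close to [-1.5, 1.25]
```

Under this assumed mapping every weight occupies two cells, which is one way a signed-weight scenario can end up with a larger array than a positive-only one.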
Computation-in-Memory (CIM) architectures address the rising demand for energy-efficient artificial intelligence (AI) solutions, by minimizing costly data movements between memory and processor. Within such architectures, SRAM-based digital CIM is especially attractive as it preserves the advantages of CIM while avoiding analog complexity. Recent studies have revealed potential weaknesses in these architectures, particularly to power side-channel attacks (SCA) capable of extracting sensitive model parameters (e.g., neural network (NN) weights), which represent the intellectual property of CIM-based neural network systems. In this study, we propose and evaluate two countermeasures to secure SRAM-based CIM architectures against power attacks: (1) Balanced Obfuscated-path countermeasure, and (2) Glitch Aware countermeasure. To validate their effectiveness, we conducted a comprehensive power analysis that successfully demonstrated attacks against an unprotected implementation. Our experimental results demonstrate that both countermeasures significantly improve resistance to power attacks. Although the Balanced Obfuscated-path offers better area overhead and run-time performance, the Glitch Aware approach achieves higher protection against advanced attacks, making each suitable for different design constraints.
Memristor-based neural network accelerators for space applications
Enhancing performance with temporal averaging and SIRENs
A Data-Driven ANN-Based Model for FeCAP and FeFET
Orienting to SPICE and Circuit Design
Physics-based compact models for emerging non-volatile memories (NVMs) are often limited by the complex interactions of microscopic domains and defects that are difficult to capture analytically, resulting in reduced accuracy and simulation efficiency. To address this challenge, a machine learning (ML)-based approach is proposed using artificial neural networks (ANNs) trained entirely on device measurement data, enabling a direct translation of fabrication characteristics into SPICE-compatible circuit models. The resulting models achieve high accuracy (MSE: 0.724, adjusted R²: 0.998), significantly outperforming physics-based baselines with an 18× lower MSE for polarization and a two-order-of-magnitude precision improvement in FeFET current simulation, while accurately capturing the wake-up process. Furthermore, the model demonstrates robust out-of-distribution (OOD) extrapolation to unseen ferroelectric thicknesses and a 33.7% improvement in simulation speed. These results validate the ML-based approach as a highly efficient, SPICE-compatible solution for next-generation memory.
Addressing non-idealities in Resistive Random Access Memories (RRAMs) is crucial for their successful commercialization. For example, the inherent resistance drift that occurs during consecutive read operations can induce Read Disturb Faults (RDF), leading to functional errors. This paper analyzes and characterizes the resistance drift and the RDF based on data measurements and presents a physics-based RRAM compact model that incorporates these non-idealities. Additionally, an in-field mitigation scheme is proposed, leveraging bidirectional read operations to balance the resistance. The scheme is implemented and validated through circuit simulations, both for RRAM used as memory and for RRAM-based computation-in-memory microarchitectures for deep neural networks. The results demonstrate that RRAM without any mitigation scheme can start failing after 8,000 consecutive reads, while our mitigation scheme ensures that the memory remains functional even after 10⁶ consecutive reads. Furthermore, the results indicate that, using the MNIST dataset as a case study, the accuracy can drop significantly from 86% to as low as 12.5% without any mitigation scheme. In contrast, the proposed mitigation scheme improves this accuracy up to 84.2%.
Full-System (FS) simulation is essential for performance evaluation of complete systems that execute complex applications on a complete software stack consisting of an operating system and user applications. Nevertheless, FS simulators require careful fine-tuning against real hardware to obtain reliable performance statistics, which can become tedious, error-prone, and time-consuming with typical trial-and-error approaches. To address these shortcomings, we propose a novel, streamlined, component-level calibration methodology for validating FS simulation models. Our methodology greatly accelerates the validation process without sacrificing accuracy. It is Instruction Set Architecture (ISA)-agnostic, and can tackle hardware specifications at different levels of detail. We demonstrate its effectiveness by validating FS models against both open-hardware and IP-protected (closed hardware) RISC-V silicon, achieving a mean error of 19%-23% for the SPEC CPU2017 suite in the two cases. We introduce the first open-source RISC-V-based FS-validated simulation models with a complete and replicable methodology.
Resistive RAM (RRAM) design optimization and error monitoring are crucial not only for memory storage applications but also for enabling future brain-inspired systems beyond the capabilities of today’s hardware. The figure of merit confirming the presence of resistive switching in an RRAM device is its resistance window, expressed by the HRS/LRS ratio (High Resistance State over Low Resistance State). This ratio guarantees the proper operation of the RRAM: the larger the ratio, the more reliable and robust the RRAM cell becomes in storing and retrieving data. From this perspective, this paper proposes an analysis of RRAM intermittent errors with respect to the RRAM resistance ratio. The impact of intermittent errors on the HRS/LRS ratio is analyzed at the RRAM cell electrical level using a dedicated test chip. Silicon measurements show that all detected RRAM intermittent errors directly result from resistance drifts due to ineffective programming operations. In view of these findings, intermittent error mitigation schemes are proposed to address these errors at the circuit level.
Approximately one-third of individuals with chronic epilepsy, a condition resulting from uncontrolled brain activity, do not respond to medication. Animal models are widely used to investigate the mechanism underlying epilepsy, so that better drug treatments can be developed for this disease. In such studies, epileptiform activity, assessed by EEG recordings, can be used as a marker for the development of the disease. However, the analysis of EEG recordings is typically done manually, which is time-consuming, subject to observer bias, error-prone, and lacks consistency and efficiency. In this paper, we develop a novel automated methodology for detecting and classifying epileptiform activity, which is tested using the intrahippocampal kainic acid (IHKA) mouse model, a representation of human temporal lobe epilepsy. To this end, EEG/LFP recordings are obtained from biological experiments using the IHKA mouse model. We use a spike detection method that combines an improved version of the nonlinear energy operator (NEO) with the automatic NEO thresholding (ANT) algorithm. The proposed method is implemented in Python as an automated and time-efficient algorithm; its adaptability to different spike and epileptiform event criteria makes it suitable for use in preclinical and potentially future clinical studies. Using our proposed methodology, we achieve a 93.1% accuracy in detecting epileptiform events and a 95.8% accuracy in classification. Moreover, the time for analysis of EEG recordings was reduced by 98.8% compared to manual analysis. Additionally, to demonstrate the potential of the algorithm for brain–machine interface (BMI) applications, we develop a hardware architecture and implement it using both an application-specific integrated circuit (ASIC) and a field programmable gate array (FPGA). The FPGA shows the feasibility of near real-time implementation, and for our ASIC implementation, we achieve a post-layout area of 9114 µm² with a dynamic power consumption of 16.09 µW using TSMC 40 nm technology.
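For readers unfamiliar with the nonlinear energy operator, the following minimal Python sketch shows the classic NEO, ψ[n] = x[n]² − x[n−1]·x[n+1], followed by a simple threshold set to a scaled mean of ψ. This is a stand-in for illustration only: the abstract refers to an improved NEO and the ANT algorithm, whose details are not given here, and the `scale` value and function names are assumptions.

```python
import math


def neo(x: list[float]) -> list[float]:
    """Classic nonlinear energy operator for samples 1 .. len(x) - 2."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def detect_spikes(x: list[float], scale: float = 8.0) -> list[int]:
    """Return sample indices whose NEO output exceeds scale * mean(NEO).

    The scaled-mean threshold is a common simple choice and is assumed
    here; it is not the ANT algorithm named in the abstract.
    """
    psi = neo(x)
    threshold = scale * sum(psi) / len(psi)
    # psi[i] corresponds to sample i + 1 of the input signal.
    return [i + 1 for i, value in enumerate(psi) if value > threshold]


if __name__ == "__main__":
    # Synthetic trace: low-amplitude slow oscillation with one sharp spike.
    signal = [0.1 * math.sin(0.3 * n) for n in range(40)]
    signal[20] += 5.0
    print(detect_spikes(signal))
```

Because NEO weights both amplitude and local frequency content, a sharp transient stands out strongly against a slow background, which is what makes a single threshold workable in this toy case.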
Efficient and Realistic Brain Simulation
A Review and Design Guide for Memristor-Based Approaches