S. Hamdioui

272 records found

An overview from bio-inspiration to hardware architectures and learning mechanisms

Journal article (2026) - Anteneh Gebregiorgis, Amirreza Yousefzadeh, Sherif Eissa, Muhammad Ali Siddiqi, Charlotte Frenkel, Friedemann Zenke, Sander Bohte, Abdulqader Nael Mahmoud, Said Hamdioui, More authors...
The endeavor to emulate the extraordinary efficiency and adaptability inherent in the human brain via spike-based neuromorphic computing presents significant potential across a diverse array of applications. The attainment of this objective necessitates the translation of biological principles into artificial systems, a task that continues to pose a complex challenge requiring a profound comprehension of the mechanisms by which neural systems produce robust computational outcomes. This tutorial paper provides a comprehensive overview of the foundational concepts and emerging design trends in spike-based neuromorphic computing, covering advances from materials and circuits to hardware architectures and learning mechanisms. It begins with an examination of key aspects of brain biology and their influence on neuromorphic design, followed by a brief discussion of biologically plausible neuron and synapse models. The paper then defines the core principles and defining attributes of neuromorphic computing, highlighting the trade-offs and design choices underlying current implementations. Building on these foundations, it explores the critical properties of neuromorphic systems, surveys a variety of learning algorithms, and reviews hardware-level realizations of bioinspired neurons and synapses. Subsequent sections discuss state-of-the-art spiking neural network architectures, mapping and compilation strategies, and representative application domains. By providing this end-to-end perspective, the article aims to guide the development of future neuromorphic systems that more closely emulate brain efficiency, scalability, and resilience. ...

Forming-free, multi-bit Pd/HfO2 ReRAM for energy-efficient neuromorphic computing

Memristor technology offers a promising route toward energy-efficient computing but faces challenges including resistance drift, variability, and the need for electroforming. Filamentary resistive random-access memory, one of the most studied memristive platforms, typically requires a high-voltage electroforming step to initiate conductive filaments, leading to increased power overhead and reduced endurance. Here we report HfO2-based forming-free memristive devices (PdNeuRAM) that operate at low voltages, support multi-bit functionality, and exhibit reduced variability. Through combined electrical and materials characterization, we identify a Pd-O-Hf interfacial configuration that lowers oxygen-vacancy formation and migration barriers, creating a dense network of shallow defect states. Together with a Ti top electrode acting as an oxygen reservoir and an ultrathin (5 nm) HfO2 layer, this interfacial engineering enables charge redistribution at room temperature and eliminates the need for electroforming. The fabricated devices provide tunable resistance states and reduce programming and read energy by 43% and 38%, respectively, in spiking neural network inference tasks. These results provide mechanistic insight into forming-free resistive switching and demonstrate the potential of Pd/HfO2 devices for energy-efficient neuromorphic computing. ...
Instruction Set Architecture (ISA) extensions, particularly scalar cryptography extensions (Zk), combine the performance advantages of hardware with the adaptability of software, enabling the direct and efficient execution of cryptographic functions within the processor pipeline. This integration eliminates the need to communicate with external cores, substantially reducing latency, power consumption, and hardware overhead, making it especially suitable for embedded systems with constrained resources. However, current scalar cryptography extension implementations remain vulnerable to physical threats, notably power side-channel attacks (PSCAs). These attacks allow adversaries to extract confidential information, such as secret keys, by analyzing the power consumption patterns of the hardware during operation. This paper presents an optimized and secure implementation of the RISC-V scalar Advanced Encryption Standard (AES) extension (Zkne/Zknd) using Domain-Oriented Masking (DOM) to mitigate first-order PSCAs. Our approach features optimized assembly implementations for partial rounds and key scheduling alongside pipeline-aware microarchitecture optimizations. We evaluated the security and performance of the proposed design using the Xilinx Artix7 FPGA platform. The results indicate that our design is side-channel-resistant while adding a very low area overhead of 0.39% to the full 32-bit CV32E40S RISC-V processor. Moreover, the performance overhead is zero when the extension-related instructions are properly scheduled. ...
Journal article (2026) - A. V. Zegbroeck, E. V. Meirvenne, P. Anagnostou, F. Ciubotaru, C. Adelmann, S. Hamdioui, S. Cotofana
In theory, Majority logic, originally proposed in the '70s, enables more compact and efficient arithmetic implementations than its conventional Boolean counterpart. Nonetheless, CMOS-based Majority logic realizations remain challenging, as standard transistor-based approaches are unable to directly exhibit majority behavior. However, recent exploration of beyond-CMOS technologies has revived interest in majority logic. In this work, we propose and analyze a novel approach towards the 3-input Majority gate (MAJ3) implementation by means of piezoelectric materials. By leveraging their intrinsic electromechanical properties, we convert the digital input signals into mechanical deformations, which are accumulated in a transfer layer. Subsequently, we transform the combined deformation back to the electric domain with a piezoelectronics element properly designed to perform majority functionality. We first present the underlying principles behind our proposal with a short introduction on majority logic, piezoelectronics, and the utilized simulation framework. Afterwards we introduce the proposed piezoelectric 3-input Majority gate (piezo-MAJ3) and strategies for optimizing its behavior and performance. We also detail the impact of material parameters and structural design on device performance using both analytical discussion and physics-based simulations. Finally, we briefly highlight how our proposal can be directly integrated into CMOS circuits and compare the potential cost and performance of the piezo-MAJ3 with those of state-of-the-art implementations. Our results indicate that, compared with its CMOS counterpart, the piezo-MAJ3 gate requires half the area, is 7x faster, and consumes 44% less energy. ...
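To make the claim about compact arithmetic concrete, the following Python sketch shows the logical behavior of a 3-input majority gate and a one-bit full adder built from three such gates plus inverters. It models only the Boolean function, not the piezoelectric device; the names `maj3` and `full_adder` are illustrative and do not come from the paper.

```python
from itertools import product


def maj3(a: int, b: int, c: int) -> int:
    """3-input majority: returns 1 when at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder using three majority gates and two inversions."""
    carry = maj3(a, b, cin)
    # Standard majority-logic identity for the sum bit:
    # sum = MAJ(NOT carry, MAJ(a, b, NOT cin), cin)
    total = maj3(1 - carry, maj3(a, b, 1 - cin), cin)
    return total, carry


if __name__ == "__main__":
    for a, b, cin in product((0, 1), repeat=3):
        print(a, b, cin, "->", full_adder(a, b, cin))
```

The carry output is a single majority gate, which is one reason adders are a common benchmark for majority-based technologies.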
Journal article (2026) - Mohammad Amin Yaldagard, Ankit Bende, Sumit Diware, Vikas Rana, Said Hamdioui, Rajendra Bishnoi
Resistive random-access memory (RRAM)-based computation-in-memory (CIM) architectures offer a promising solution to meet the stringent energy efficiency demands of executing artificial intelligence (AI) algorithms directly on edge devices. However, these architectures suffer from the read-disturb problem, which can lead to accumulated computational errors over time. To maintain the required level of computational accuracy, conventional approaches rely on a static reprogramming process after a predefined number of read cycles, necessitating large counters and resulting in inefficiencies. This paper presents experimental results using real RRAM devices to analyze the read-disturb effect and builds on these insights to propose a circuit-level detection methodology for real-time monitoring of conductance drifts. The proposed method initiates reprogramming only when the device drift exceeds a defined threshold and reprogramming is actually needed. Additionally, an analytical method is developed to determine the minimum conductance state ratio needed to meet reliable detection criteria. Based on this foundation, the proposed detection technique is further optimized for dynamic identification of read-disturb effects. Experiment-augmented SPICE simulation results, using a calibrated model implemented in TSMC 40 nm CMOS technology, validate the functionality and effectiveness of the proposed detection approach. These results demonstrate its potential to improve both the reliability and efficiency of RRAM-based CIM architectures that provide up to a 4x improvement in energy-efficiency compared to traditional periodic reprogramming methods. ...
Journal article (2026) - Hassen Aziza, Hanzhi Xun, Moritz Fieback, Mottaqiallah Taouil, Said Hamdioui
Vector–matrix multiplication (VMM), implemented through multiply–accumulate (MAC) operations, represents the dominant computational primitive in many artificial intelligence (AI) workloads. When executed on conventional von Neumann architectures, VMM operations suffer from high energy consumption and latency due to the separation between memory and processing units. To overcome these limitations, crossbar arrays built from Resistive Random Access Memory (RRAM) cells have been proposed for accelerating VMM computations. In this work, we investigate the key optimization trade-offs associated with implementing RRAM-based neural networks for classification applications. A simple two-layer neural network is first defined and trained in software to generate the weight matrices and bias parameters. Next, three hardware implementation scenarios are evaluated depending on whether negative floating-point numbers are used: Positive Weights Only (PWO), Positive and Negative Weights Only (PNWO), and Positive and Negative Weights with Biases (PNWB). The different implementations are analyzed at the hardware level by examining classification accuracy, energy efficiency, latency, and area overhead. The study further incorporates important RRAM limitations, including restricted conductance range and device variability. Hardware results show that the PWO scenario offers the lowest energy consumption (189 fJ/MAC) and area overhead but results in the lowest accuracy. PNWO and PNWB significantly improve accuracy (+177% and +180%) but increase energy consumption (+63% and +87%) and area (×2 and ×2.1). Under variability effects, PWO achieves better accuracy (94.65%), followed by PNWO (93.11%) and PNWB (92.11%). ...
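The abstract does not say how negative weights are stored. One widely used option, sketched below as an assumption, is to represent each signed weight by a pair of conductances and read out the difference of the two column currents. That doubles the number of cells, which would be consistent with the roughly ×2 area reported for PNWO. The conductance limits `G_MIN` and `G_MAX`, the weight bound `W_MAX`, and the function names are hypothetical values chosen for illustration.

```python
# Hypothetical device limits (siemens) and weight range, for illustration only.
G_MIN, G_MAX, W_MAX = 1e-6, 1e-4, 1.0


def encode_signed(w: float) -> tuple[float, float]:
    """Map a signed weight to a (g_pos, g_neg) conductance pair."""
    mag = min(abs(w), W_MAX) / W_MAX * (G_MAX - G_MIN)
    return (G_MIN + mag, G_MIN) if w >= 0 else (G_MIN, G_MIN + mag)


def crossbar_vmm(voltages: list[float], weights: list[list[float]]) -> list[float]:
    """Ideal differential crossbar: I_j = sum_i V_i * (g_pos_ij - g_neg_ij)."""
    currents = [0.0] * len(weights[0])
    for v, row in zip(voltages, weights):
        for j, w in enumerate(row):
            g_pos, g_neg = encode_signed(w)
            currents[j] += v * (g_pos - g_neg)
    # Rescale currents back to weight units so the result matches a software VMM.
    scale = (G_MAX - G_MIN) / W_MAX
    return [i / scale for i in currents]


if __name__ == "__main__":
    print(crossbar_vmm([1.0, 0.5], [[0.5, -0.5], [-1.0, 0.25]]))
```

The sketch ignores variability, wire resistance, and quantization of conductance states, all of which the study treats as real limitations.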
Computation-in-Memory (CIM) architectures address the rising demand for energy-efficient artificial intelligence (AI) solutions, by minimizing costly data movements between memory and processor. Within such architectures, SRAM-based digital CIM is especially attractive as it preserves the advantages of CIM while avoiding analog complexity. Recent studies have revealed potential weaknesses in these architectures, particularly to power side-channel attacks (SCA) capable of extracting sensitive model parameters (e.g., neural network (NN) weights), which represent the intellectual property of CIM-based neural network systems. In this study, we propose and evaluate two countermeasures to secure SRAM-based CIM architectures against power attacks: (1) Balanced Obfuscated-path countermeasure, and (2) Glitch Aware countermeasure. To validate their effectiveness, we conducted a comprehensive power analysis that successfully demonstrated attacks against an unprotected implementation. Our experimental results demonstrate that both countermeasures significantly improve resistance to power attacks. Although the Balanced Obfuscated-path offers better area overhead and run-time performance, the Glitch Aware approach achieves higher protection against advanced attacks, making each suitable for different design constraints. ...
Binary Neural Networks (BNNs) have obtained a strong foothold in the field of machine learning at the edge due to their minimal hardware requirements. However, their energy and performance efficiency remain hindered by frequent data transfer between memory and processors. Computation-in-memory (CIM) architectures address this problem by embedding processing units within the memory. Unfortunately, current implementations of CIM are susceptible to IP piracy attacks through side channels. This paper presents a novel secure periphery scheme for NN accelerators with sequential accumulation that conceals IP information by obscuring the power consumption of the counter responsible for the leakage. This is achieved by combining two innovative techniques: operand schedule randomization and an always-count Gray code counter. The results demonstrate that the proposed design effectively resists power side channel attacks (SCAs). Moreover, Signal-to-Noise Ratio (SNR) and Test Vector Leakage Assessment (TVLA) show safe leakage levels. Compared to the state-of-the-art, our countermeasure reduces area and power overheads by up to 12.7× and 13.3×, achieving only 37% area and 51.2% power overhead with the added protection logic. Notably, this enhanced security comes with zero latency overhead, maintaining the performance of the baseline design. ...
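The relevance of a Gray code counter to power leakage can be illustrated with a short Python sketch: successive Gray code values always differ in exactly one bit, whereas a plain binary counter flips a data-dependent number of bits per increment. This shows only that property; it is not the authors' counter design, and it does not model the "always-count" behavior or the operand schedule randomization. The function names are illustrative.

```python
def to_gray(n: int) -> int:
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    for i in range(8):
        binary_flips = hamming_distance(i, i + 1)
        gray_flips = hamming_distance(to_gray(i), to_gray(i + 1))
        print(f"{i}->{i + 1}: binary flips={binary_flips}, gray flips={gray_flips}")
```

Because every increment toggles a single bit, the switching activity per count step no longer depends on the counter value.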
Conference paper (2026) - Y. Biyani, A. Singh, R. Bishnoi, S. Hamdioui
Analog Compute-in-Memory (CIM), leveraging non-volatile memristive devices to perform in-place computations in the analog domain, holds great potential to efficiently accelerate vector-matrix multiplications (VMM) and realize AI (Artificial Intelligence) at the edge. However, the data converters in such architectures often trade off accuracy for high energy and area overheads, practically limiting the benefits of CIM. In this work, we present SABCIM, an array-periphery co-design approach for CIM that enables accurate computation as well as digitization of analog VMM outputs with high energy efficiency and competitive area overhead. By leveraging complementary input activations and data storage, each crossbar column generates differential analog output corresponding to the vector-vector multiplication (VVM) result, while inherently addressing underlying non-idealities. This is digitized using a compact, dual-ramp voltage-to-time converter (VTC)-based analog-to-digital converter (ADC). Benchmark results indicate that our work achieves up to 19.6× higher energy efficiency compared to state-of-the-art (SOTA), while maintaining comparable accuracies. ...

Enhancing performance with temporal averaging and SIRENs

Journal article (2026) - Zacharia A. Rudge, Dominik Dold, Moritz Fieback, Dario Izzo, Said Hamdioui
Memristors are an emerging technology that enables artificial intelligence (AI) accelerators with high energy efficiency and radiation robustness — properties that are vital for the deployment of AI on-board spacecraft. However, space applications require reliable and precise computations, while memristive devices suffer from non-idealities, such as device variability, conductance drifts, and device faults. Thus, porting neural networks (NNs) to memristive devices often faces the challenge of severe performance degradation. In this work, we show in simulations that memristor-based NNs achieve competitive performance levels on on-board tasks, such as navigation & control and geodesy of asteroids. Through bit-slicing, temporal averaging of NN layers, and periodic activation functions, we improve initial results from around 0.07 to 0.01 and 0.3 to 0.007 for both tasks using RRAM devices, coming close to state-of-the-art levels (0.003−0.005 and 0.003, respectively). Our results demonstrate the potential of memristors for on-board space applications, and we are convinced that future technology and NN improvements will further close the performance gap to fully unlock the benefits of memristors. ...

Orienting to SPICE and Circuit Design

Journal article (2026) - Changhao Wang, Sicong Yuan, Nicolo Bellarmino, Danyang Chen, Hanzhi Xun, Lin Wang, Mottaqiallah Taouil, Moritz Fieback, Said Hamdioui, More Authors
Physics-based compact models for emerging non-volatile memories (NVMs) are often limited by the complex interactions of microscopic domains and defects that are difficult to capture analytically, resulting in reduced accuracy and simulation efficiency. To address this challenge, a machine learning (ML)-based approach is proposed using artificial neural networks (ANNs) trained entirely on device measurement data, enabling a direct translation of fabrication characteristics into SPICE-compatible circuit models. The resulting models achieve high accuracy (MSE: 0.724, adjusted R²: 0.998), significantly outperforming physics-based baselines with an 18× lower MSE for polarization and a two-order-of-magnitude precision improvement in FeFET current simulation, while accurately capturing the wake-up process. Furthermore, the model demonstrates robust out-of-distribution (OOD) extrapolation to unseen ferroelectric thicknesses and a 33.7% improvement in simulation speed. These results validate the ML-based approach as a highly efficient, SPICE-compatible solution for next-generation memory. ...
This paper presents the first cryogenic characterization of Hot Carrier Degradation (HCD) in 5-V thick-oxide transistors fabricated in a 160-nm CMOS technology. HCD significantly worsens in nMOS devices at 4.2 K, leading to a more severe degradation, especially of threshold voltage and current in the linear regime. Contrary to expectations, pMOS devices exhibit a temporary performance improvement after stress, showing for the first time at 4.2 K an HCD-induced turn-around effect in threshold voltage and current. The threshold-voltage shift follows a power law with stress time, showing a much higher exponent at 4.2 K than at 300 K for nMOS, but not for pMOS devices. The threshold-voltage shift also follows a power law with stress voltage, strongly accelerated for nMOS at 4.2 K, but unchanged for pMOS. ...
Mapping Binary Neural Networks (BNNs) on computation-in-memory (CIM) architectures enables a highly efficient approach for energy-constrained edge computing. In-memory processing significantly reduces critical performance bottlenecks in conventional architectures. Despite their efficiency, current optimized CIM implementations remain vulnerable to IP theft via side-channel analysis. This work investigates the side-channel leakage of a digital BNN-CIM accelerator that employs popcount-based accumulation. A range of circuit-level modifications in counter implementations are proposed and evaluated, exploring their impact on security metrics and design overhead. Results demonstrate that the Hamming weight (HW) and Hamming distance (HD) equalizing techniques combined with power equalization through duplication perform better than traditional dual-rail countermeasures. The findings provide practical guidance for designing secure and efficient peripheral components for popcount-based BNN accelerators. ...
Conference paper (2025) - Hanzhi Xun, Moritz Fieback, Sicong Yuan, Changhao Wang, Erbing Hua, Hassen Aziza, Rajendra Bishnoi, Mottaqiallah Taouil, Said Hamdioui, More Authors...
Addressing non-idealities in Resistive Random Access Memories (RRAMs) is crucial for their successful commercialization. For example, the inherent resistance drift that occurs during consecutive read operations can induce Read Disturb Faults (RDF), leading to functional errors. This paper analyzes and characterizes the resistance drift and the RDF based on data measurements and presents a physics-based RRAM compact model that incorporates these non-idealities. Additionally, an in-field mitigation scheme is proposed, leveraging bidirectional read operations to balance the resistance. The scheme is implemented and validated through circuit simulations, both for RRAM used as memory and for RRAM-based computation-in-memory microarchitectures for deep neural networks. The results demonstrate that RRAM without any mitigation scheme can start failing after 8,000 consecutive reads, while our mitigation scheme ensures that the memory remains functional even after 10^6 consecutive reads. Furthermore, the results indicate that, using the MNIST dataset as a case study, the accuracy can drop significantly from 86% to as low as 12.5% without any mitigation scheme. In contrast, the proposed mitigation scheme improves this accuracy up to 84.2%. ...
Conference paper (2025) - Rajendra Bishnoi, Mohammad Amin Yaldagard, Said Hamdioui, Kanishkan Vadivel, Manolis Sifalakis, Nicolas Daniel Rodriguez, Pedro Julian, Lothar Ratschbacher, Maen Mallah, More Authors...
The goal of the NEUROKIT2E project is to create an open-source Deep Learning framework for edge and embedded AI built around an established European value chain. This framework, called AIDGE, supports a wide range of application areas that operate independently and serve a global user community. It provides easy and fast full-stack solutions from Neural Network design and optimization to AI application development all the way down to hardware implementations while enabling code generation for application-specific targets. This platform provides flexibility for academic users in the AI domain to explore and innovate while allowing them the possibility to prototype systems, ensuring their work aligns well with industrial needs. This paper presents the results and achievements of the first part of this three-year project, along with its roadmap and expected outcomes. ...
Conference paper (2025) - Changhao Wang, S. Yuan, More Authors..., N Kolahimahmoud, H. Xun, Nicolo Bellarmino, Danyang Chen, Chujun Yin, M. Taouil, M. Fieback, S. Hamdioui
Ferroelectric Field-Effect Transistors (FeFETs) are promising candidates for non-volatile memory (NVM) technologies, especially in embedded systems and edge computing. However, due to their physical characteristics, FeFETs exhibit unique defects—such as Threshold Voltage Shifting (TVS) caused by trap charges in the oxide layer—that are not captured by conventional defect models. This study adopts the Device-Aware Test (DAT) methodology to model these defects by incorporating their impact into the electrical parameters, calibrated using measurement data. Defect injection, circuit-level simulations, and fault analysis are performed to derive realistic fault models. Finally, the March algorithm and Design-for-Test (DfT) techniques are proposed to effectively detect these defects. ...
Journal article (2025) - Karan Pathak, Joshua Klein, Giovanni Ansaloni, Said Hamdioui, Georgi Gaydadjiev, Marina Zapater, David Atienza
Full-System (FS) simulation is essential for performance evaluation of complete systems that execute complex applications on a complete software stack consisting of an operating system and user applications. Nevertheless, they require careful fine-tuning against real hardware to obtain reliable performance statistics, which can become tedious, error-prone, and time-consuming with typical trial-and-error approaches. We propose a novel, streamlined, component-level calibration methodology to address these shortcomings to validate FS simulation models. Our methodology greatly accelerates the validation process without sacrificing accuracy. It is Instruction Set Architecture (ISA)-agnostic, and can tackle hardware specifications at different levels of detail. We demonstrate its effectiveness by validating FS models against both open-hardware and IP-protected (closed hardware) RISC-V silicon, achieving a mean error of 19%-23% for the SPEC CPU2017 suite in the two cases. We introduce the first open-source RISC-V-based FS-validated simulation models with a complete and replicable methodology. ...
Journal article (2025) - H. Aziza, H. Xun, M. Fieback, M. Taouil, S. Hamdioui
Resistive RAM (RRAM) design optimization and error monitoring are crucial not only for memory storage applications but also for enabling future brain-inspired systems beyond the capabilities of today’s hardware. The figure of merit confirming the presence of resistive switching in RRAM devices is the resistance window, expressed by the HRS/LRS ratio (High Resistance State over Low Resistance State). This ratio guarantees the proper operation of the RRAM: the larger the ratio, the more reliable and robust the RRAM cell becomes in storing and retrieving data. From this perspective, this paper proposes an analysis of RRAM intermittent errors with respect to the RRAM resistance ratio. The impact of intermittent errors on the HRS/LRS ratio is analyzed at the RRAM cell electrical level using a dedicated test chip. Silicon measurements show that all detected RRAM intermittent errors directly result from resistance drifts due to ineffective programming operations. In view of these findings, intermittent error mitigation schemes are proposed to address these errors at the circuit level. ...
Journal article (2025) - Jeroen J.A. Vermeulen, Georgii Krivoshein, Sumit Diware, Muhammad Ali Siddiqi, Arn M.J.M. van den Maagdenberg, Else A. Tolner, Said Hamdioui, Rajendra Bishnoi
Approximately one-third of individuals with chronic epilepsy, a condition resulting from uncontrolled brain activity, do not respond to medication. Animal models are widely used to investigate the mechanism underlying epilepsy, so better drug treatments can be developed for this disease. In such studies, epileptiform activity, assessed by EEG recordings, can be used as a marker for the development of the disease. However, the analysis of EEG recordings is typically done manually, which is time-consuming, subject to observer bias, error-prone, and lacks consistency and efficiency. In this paper, we develop a novel automated methodology for detecting and classifying epileptiform activity, which is tested using the intrahippocampal kainic acid (IHKA) mouse model, a representation of human temporal lobe epilepsy. For that, EEG/LFP recordings are obtained from biological experiments using the IHKA mouse model for data acquisition. We use a spike detection method that combines an improved version of the nonlinear energy operator (NEO) with the automatic NEO thresholding (ANT) algorithm. The proposed method is implemented in Python as an automated and time-efficient algorithm, given its adaptability to different spike and epileptiform event criteria, making it suitable for use in preclinical and potentially future clinical studies. Using our proposed methodology, we achieve a 93.1% accuracy in detecting epileptiform events and a 95.8% accuracy in classification. Moreover, the time for analysis of EEG recordings was reduced by 98.8% compared to manual analysis. Additionally, to demonstrate the potential of the algorithm for brain–machine interfaces (BMI) applications, we develop a hardware architecture and implement it using both an application-specific integrated circuit (ASIC) and a field programmable gate array (FPGA). 
The FPGA shows the feasibility of near real-time implementation, and for our ASIC implementation, we achieve a post-layout area of 9114 µm² with a dynamic power consumption of 16.09 µW using TSMC 40 nm technology. ...
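For readers unfamiliar with the nonlinear energy operator, the Python sketch below shows the textbook discrete form, psi[n] = x[n]^2 - x[n-1] * x[n+1], with a simple threshold set to a multiple of the mean NEO output. It is not the improved NEO or the ANT algorithm described in the abstract, whose details are not given here; the `scale` parameter and the function names are assumptions for illustration.

```python
def neo(x: list[float]) -> list[float]:
    """Textbook NEO: psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def detect_spikes(x: list[float], scale: float = 8.0) -> list[int]:
    """Return sample indices whose NEO output exceeds scale * mean(NEO).

    The fixed `scale` is an illustrative choice, not a value from the paper.
    """
    psi = neo(x)
    threshold = scale * sum(psi) / len(psi)
    # psi[k] corresponds to sample index k + 1 in the original signal.
    return [k + 1 for k, value in enumerate(psi) if value > threshold]


if __name__ == "__main__":
    signal = [0.0] * 20
    signal[10] = 5.0  # a single synthetic spike
    print(detect_spikes(signal))
```

NEO emphasizes samples that are both large in amplitude and fast-changing, which is why it is a common front end for spike detection in EEG and LFP recordings.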

A Review and Design Guide for Memristor-Based Approaches

Computational-neuroscience research is increasingly in need of larger, biophysically realistic brain models. These analog-in-nature models build upon the Hodgkin-Huxley (HH) formalism and are run on digital, high-performance computing systems making simulation very computationally expensive. In circuit form, these models are theoretically suitable for efficient analog implementation. However, the ion-channel components –predominantly, sodium and potassium– are nonlinear, time-varying resistors, lacking an efficient implementation. Chua et al. proved that these ion-channel models are in fact memristors –devices with a conductance as a function of applied-voltage history– claiming that “memristors are the right stuff for building brains”. However, the kind of actual memristor implementation that is the right one for building brains is not defined. In this article, the device class and characteristics of such memristors are defined and existing memristive implementations of HH-like designs are then reviewed. Surprisingly, although often misclassified as such, no physical implementation currently exists that replicates the original HH equations faithfully or efficiently. Having put forward the desired memristor properties, a design guide for screening suitable memristor designs is then proposed. Screening the existing literature reveals that suitable devices likely already exist for potassium ion-channel emulation, while none exists for sodium; this calls for further investigation of higher-order, voltage-controlled and volatile memristors. ...