A.E. El Arrassi
info
Please Note
<p>This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.</p>
5 records found
1
Conference paper
(2026)
-
Fouwad Jamil Mir, Asmae El Arrassi, Abdullah Aljuffri, Said Hamdioui, Mottaqiallah Taouil
Binary Neural Networks (BNNs) have obtained a strong foothold in the field of machine learning at the edge due to their minimal hardware requirements. However, their energy and performance efficiency remain hindered by frequent data transfer between memory and processors. Computation-in-memory (CIM) architectures address this problem by embedding processing units within the memory. Unfortunately, current implementations of CIM are susceptible to IP piracy attacks through side channels. This paper presents a novel secure periphery scheme for NN accelerators with sequential accumulation that conceals IP information by obscuring the power consumption of the counter responsible for the leakage. This is achieved by combining two innovative techniques: operand schedule randomization and an always-count Gray code counter. The results demonstrate that the proposed design effectively resists power side channel attacks (SCAs). Moreover, Signal-to-Noise Ratio (SNR) and Test Vector Leakage Assessment (TVLA) show safe leakage levels. Compared to the state-of-the-art, our countermeasure reduces area and power overheads by up to 12.7× and 13.3×, achieving only 37% area and 51.2% power overhead with the added protection logic. Notably, this enhanced security comes with zero latency overhead, maintaining the performance of the baseline design.
...
Binary Neural Networks (BNNs) have obtained a strong foothold in the field of machine learning at the edge due to their minimal hardware requirements. However, their energy and performance efficiency remain hindered by frequent data transfer between memory and processors. Computation-in-memory (CIM) architectures address this problem by embedding processing units within the memory. Unfortunately, current implementations of CIM are susceptible to IP piracy attacks through side channels. This paper presents a novel secure periphery scheme for NN accelerators with sequential accumulation that conceals IP information by obscuring the power consumption of the counter responsible for the leakage. This is achieved by combining two innovative techniques: operand schedule randomization and an always-count Gray code counter. The results demonstrate that the proposed design effectively resists power side channel attacks (SCAs). Moreover, Signal-to-Noise Ratio (SNR) and Test Vector Leakage Assessment (TVLA) show safe leakage levels. Compared to the state-of-the-art, our countermeasure reduces area and power overheads by up to 12.7× and 13.3×, achieving only 37% area and 51.2% power overhead with the added protection logic. Notably, this enhanced security comes with zero latency overhead, maintaining the performance of the baseline design.
APX-DREAM-CIM
An Approximate Digital SRAM-Based CIM Accelerator for Edge AI
With the increasing demand for energy-efficient solutions in smart edge applications, there is a pressing need for computing architectures that can effectively manage di-verse and computation-intensive workloads. To address this, we propose APX-DREAM-CIM, an approximate digital SRAM-based Computation-In-Memory (CIM) accelerator specifically designed to maximize energy and area efficiency with minimal impact on accuracy by investigating cross-layer optimizations. The architecture uses, on the one hand, quantized models and, on the other hand, integrates a range of approximate adder designs, each offering distinct trade-offs between hard-ware efficiency and computational precision. To further enhance resilience to approximation-induced errors, we introduce a novel Approximate-Aware Training (AAT) methodology, which models the approximate behavior of the hardware during training, enabling the network to adapt accordingly. We evaluate APX-DREAM-CIM using the MNIST and CIFAR-10 datasets with LeNet-5 and ResNet-20 topologies. Experimental results show that APX-DREAM-CIM achieves up to 17% energy and 16% area reduction compared to the exact baseline architecture. Moreover, AAT enables the system to retain, and in some cases, even exceed, the accuracy of standard quantized models.
...
With the increasing demand for energy-efficient solutions in smart edge applications, there is a pressing need for computing architectures that can effectively manage di-verse and computation-intensive workloads. To address this, we propose APX-DREAM-CIM, an approximate digital SRAM-based Computation-In-Memory (CIM) accelerator specifically designed to maximize energy and area efficiency with minimal impact on accuracy by investigating cross-layer optimizations. The architecture uses, on the one hand, quantized models and, on the other hand, integrates a range of approximate adder designs, each offering distinct trade-offs between hard-ware efficiency and computational precision. To further enhance resilience to approximation-induced errors, we introduce a novel Approximate-Aware Training (AAT) methodology, which models the approximate behavior of the hardware during training, enabling the network to adapt accordingly. We evaluate APX-DREAM-CIM using the MNIST and CIFAR-10 datasets with LeNet-5 and ResNet-20 topologies. Experimental results show that APX-DREAM-CIM achieves up to 17% energy and 16% area reduction compared to the exact baseline architecture. Moreover, AAT enables the system to retain, and in some cases, even exceed, the accuracy of standard quantized models.
Journal article
(2025)
-
A.E. El Arrassi, L.C.A. Huijbregts, Manil Dev Gomony, Anteneh Gebregiorgis, Francky Catthoor, M. Taouil, Rajiv V. Joshi, S. Hamdioui
With the rise of energy-constrained smart edge applications, there is a pressing need for energy-efficient computing engines that process generated data locally, at least for small and medium-sized applications. To address this issue, this paper proposes DREAM-CIM, a digital SRAM-based computation-in-memory (CIM) accelerator. It targets an energy- and area-efficient implementation of the multiply-and-accumulate (MAC) operation, which is the core operation of neural networks. The accelerator is based on a multi-sub-array macro to increase parallelism, integrates multiplication operations within the memory cells such that they are executed while reading the cells, makes use of pipelining to further optimize the throughput of the MAC operations, and gets rid of the expensive adder-tree structures commonly used in State-of-The-Art (SOTA) digital CIM solutions by replacing them with a custom accumulation circuit to reduce power and area. The SPICE simulation results of the DREAM-CIM accelerator show an energy efficiency of 5097 TOPS/W (normalized to a 1-bit × 1-bit MAC operation) and an area efficiency of 3854 TOPS/mm$^2$ using 22 nm technology node.
The obtained circuit-level results were fed into a python-based system-level simulator to benchmark the system architecture using two applications, i.e., image classification (using MNIST and CIFAR-10 dataset on LeNet5 and Resnet-20 models) and object detection (using COCO dataset on the YoloV6 model). The system-level results show that DREAM-CIM can achieve an energy efficiency of 0.1mJ, 0.2mJ, and 11.02mJ per inference for the MNIST, YOLOv6, and CIFAR-10 datasets, respectively, while maintaining SOTA accuracy. ...
The obtained circuit-level results were fed into a python-based system-level simulator to benchmark the system architecture using two applications, i.e., image classification (using MNIST and CIFAR-10 dataset on LeNet5 and Resnet-20 models) and object detection (using COCO dataset on the YoloV6 model). The system-level results show that DREAM-CIM can achieve an energy efficiency of 0.1mJ, 0.2mJ, and 11.02mJ per inference for the MNIST, YOLOv6, and CIFAR-10 datasets, respectively, while maintaining SOTA accuracy. ...
With the rise of energy-constrained smart edge applications, there is a pressing need for energy-efficient computing engines that process generated data locally, at least for small and medium-sized applications. To address this issue, this paper proposes DREAM-CIM, a digital SRAM-based computation-in-memory (CIM) accelerator. It targets an energy- and area-efficient implementation of the multiply-and-accumulate (MAC) operation, which is the core operation of neural networks. The accelerator is based on a multi-sub-array macro to increase parallelism, integrates multiplication operations within the memory cells such that they are executed while reading the cells, makes use of pipelining to further optimize the throughput of the MAC operations, and gets rid of the expensive adder-tree structures commonly used in State-of-The-Art (SOTA) digital CIM solutions by replacing them with a custom accumulation circuit to reduce power and area. The SPICE simulation results of the DREAM-CIM accelerator show an energy efficiency of 5097 TOPS/W (normalized to a 1-bit × 1-bit MAC operation) and an area efficiency of 3854 TOPS/mm$^2$ using 22 nm technology node.
The obtained circuit-level results were fed into a python-based system-level simulator to benchmark the system architecture using two applications, i.e., image classification (using MNIST and CIFAR-10 dataset on LeNet5 and Resnet-20 models) and object detection (using COCO dataset on the YoloV6 model). The system-level results show that DREAM-CIM can achieve an energy efficiency of 0.1mJ, 0.2mJ, and 11.02mJ per inference for the MNIST, YOLOv6, and CIFAR-10 datasets, respectively, while maintaining SOTA accuracy.
The obtained circuit-level results were fed into a python-based system-level simulator to benchmark the system architecture using two applications, i.e., image classification (using MNIST and CIFAR-10 dataset on LeNet5 and Resnet-20 models) and object detection (using COCO dataset on the YoloV6 model). The system-level results show that DREAM-CIM can achieve an energy efficiency of 0.1mJ, 0.2mJ, and 11.02mJ per inference for the MNIST, YOLOv6, and CIFAR-10 datasets, respectively, while maintaining SOTA accuracy.
Conference paper
(2024)
-
Asmae El Arrassi, Mohammad Amin Yaldagard, Xingjian Tao, Taha Shahroodi, Fouwad Mir, Yashvardhan Biyani, Manil Dev Gomony, Anteneh Gebregiorgis, Rajiv Joshi, Said Hamdioui
Binary Neural Networks (BNNs) have demonstrated significant advantages in reducing computation and memory costs, all while maintaining acceptable accuracy on various image detection tasks. Thus, BNNs have the potential to support practical cognitive tasks on resource-constrained platforms, such as edge computing devices. To realize this, SRAM-based digital Computation-in-Memory (CIM) has gained growing attention as it overcomes the analog CIM architecture bottlenecks such as limited computing accuracy due to process variation, non-linearity, power and area-hungry Analog-to-Digital Converters (ADCs), etc. However, digital CIM architectures are highly dominated by power-hungry adder-trees, which can nullify the benefits of SRAM-based digital CIM. To address this issue, this paper proposes an adder free SRAM-based digital CIM, AFSRAM-CIM, for BNN acceleration. The proposed CIM architecture utilizes a multi-functional 10-T SRAM cell-based crossbar array and a new energy-efficient approach to perform the popcount operation. Simulation results using the MNIST dataset show that the proposed architecture maintains the state-of-the-art inference accuracy of 99.21% with only 11.86 fJ energy per operation. Moreover, AFSRAM-CIM achieves over 3× energy and ≈17× area savings when compared to the conventional digital CIM approaches.
...
Binary Neural Networks (BNNs) have demonstrated significant advantages in reducing computation and memory costs, all while maintaining acceptable accuracy on various image detection tasks. Thus, BNNs have the potential to support practical cognitive tasks on resource-constrained platforms, such as edge computing devices. To realize this, SRAM-based digital Computation-in-Memory (CIM) has gained growing attention as it overcomes the analog CIM architecture bottlenecks such as limited computing accuracy due to process variation, non-linearity, power and area-hungry Analog-to-Digital Converters (ADCs), etc. However, digital CIM architectures are highly dominated by power-hungry adder-trees, which can nullify the benefits of SRAM-based digital CIM. To address this issue, this paper proposes an adder free SRAM-based digital CIM, AFSRAM-CIM, for BNN acceleration. The proposed CIM architecture utilizes a multi-functional 10-T SRAM cell-based crossbar array and a new energy-efficient approach to perform the popcount operation. Simulation results using the MNIST dataset show that the proposed architecture maintains the state-of-the-art inference accuracy of 99.21% with only 11.86 fJ energy per operation. Moreover, AFSRAM-CIM achieves over 3× energy and ≈17× area savings when compared to the conventional digital CIM approaches.
Spiking Neural Networks (SNNs) can drastically improve the energy efficiency of neuromorphic computing through network sparsity and event-driven execution. Thus, SNNs have the potential to support practical cognitive tasks on resource constrained platforms, such as edge devices. To realize this, SNN requires energy-efficient hardware which can run applications with a limited energy budget. However, the conventional CMOS implementations cannot achieve this goal due to the various architectural and technological challenges. In this work, we address these issues by developing an energy-efficient and accurate SNN hardware based on Computation In-Memory (CIM) architecture using Resistive Random Access Memory (RRAM) devices. The developed SNN architecture is based on unsupervised Spike Time Dependent Plasticity (STDP) learning algorithm with online learning capability. Simulation results show that the proposed architecture is energy-efficient with a consumption of ≈20 fJ per spike, while maintaining state-of-the-art inference accuracy of 95% when evaluated using the MNIST dataset.
...
Spiking Neural Networks (SNNs) can drastically improve the energy efficiency of neuromorphic computing through network sparsity and event-driven execution. Thus, SNNs have the potential to support practical cognitive tasks on resource constrained platforms, such as edge devices. To realize this, SNN requires energy-efficient hardware which can run applications with a limited energy budget. However, the conventional CMOS implementations cannot achieve this goal due to the various architectural and technological challenges. In this work, we address these issues by developing an energy-efficient and accurate SNN hardware based on Computation In-Memory (CIM) architecture using Resistive Random Access Memory (RRAM) devices. The developed SNN architecture is based on unsupervised Spike Time Dependent Plasticity (STDP) learning algorithm with online learning capability. Simulation results show that the proposed architecture is energy-efficient with a consumption of ≈20 fJ per spike, while maintaining state-of-the-art inference accuracy of 95% when evaluated using the MNIST dataset.