L.C.A. Huijbregts
Please Note
2 records found
1
The obtained circuit-level results were fed into a python-based system-level simulator to benchmark the system architecture using two applications, i.e., image classification (using MNIST and CIFAR-10 dataset on LeNet5 and Resnet-20 models) and object detection (using COCO dataset on the YoloV6 model). The system-level results show that DREAM-CIM can achieve an energy efficiency of 0.1mJ, 0.2mJ, and 11.02mJ per inference for the MNIST, YOLOv6, and CIFAR-10 datasets, respectively, while maintaining SOTA accuracy. ...
The obtained circuit-level results were fed into a python-based system-level simulator to benchmark the system architecture using two applications, i.e., image classification (using MNIST and CIFAR-10 dataset on LeNet5 and Resnet-20 models) and object detection (using COCO dataset on the YoloV6 model). The system-level results show that DREAM-CIM can achieve an energy efficiency of 0.1mJ, 0.2mJ, and 11.02mJ per inference for the MNIST, YOLOv6, and CIFAR-10 datasets, respectively, while maintaining SOTA accuracy.
Current Artificial Intelligence (AI) computation systems face challenges, primarily from the memory-wall issue, limiting overall system-level performance, especially for Edge devices with constrained battery budgets, such as smartphones, wearables, and Internet-of-Things sensor systems. In this paper, we propose a new SRAM-based Compute-In-Memory (CIM) accelerator optimized for Spiking Neural Networks (SNNs) Inference. Our proposed architecture employs a multiport SRAM design with multiple decoupled Read ports to enhance the throughput and Transposable Read-Write ports to facilitate online learning. Furthermore, we develop an Arbiter circuit for efficient data-processing and port allocations during the computation. Results for a 128×128 array in 3nm FinFET technology demonstrate a 3.1× improvement in speed and a 2.2× enhancement in energy efficiency with our proposed multiport SRAM design compared to the traditional single-port design. At system-level, a throughput of 44 MInf/s at 607 pJ/Inf and 29mW is achieved.