Cerebron

A Reconfigurable Architecture for Spatio-Temporal Sparse Spiking Neural Networks

Journal Article (2022)
Author(s)

Qinyu Chen (University of Shanghai for Science and Technology)

Chang Gao (TU Delft - Electronics)

Yuxiang Fu (Nanjing University)

Research Group
Electronics
Copyright
© 2022 Qinyu Chen, C. Gao, Yuxiang Fu
DOI related publication
https://doi.org/10.1109/TVLSI.2022.3196839
Publication Year
2022
Language
English
Issue number
10
Volume number
30
Pages (from-to)
1425 - 1437
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Spiking neural networks (SNNs) are promising alternatives to artificial neural networks (ANNs) since they are more realistic brain-inspired computing models. SNNs exhibit sparse neuron firing over time, i.e., spatiotemporal sparsity, which can be leveraged to enable energy-efficient hardware inference. However, exploiting the spatiotemporal sparsity of SNNs in hardware leads to unpredictable and unbalanced workloads, degrading energy efficiency. Compared with SNNs built from simple fully connected layers, those with more elaborate structures (e.g., standard, depthwise, and pointwise convolutions) can handle more complicated tasks but are harder to map onto hardware. In this work, we propose Cerebron, a novel reconfigurable architecture that fully exploits the spatiotemporal sparsity in SNNs with maximized data reuse, together with optimization techniques that improve the efficiency and flexibility of the hardware. To achieve flexibility, the reconfigurable compute engine is compatible with a variety of spiking layers and supports inter-computing-unit (CU) and intra-CU reconfiguration. The compute engine exploits data reuse and guarantees parallel data access when processing different convolutions, achieving memory efficiency. A two-step data sparsity exploitation method is introduced to leverage the sparsity of discrete spikes and reduce the computation time. In addition, an online channelwise workload scheduling strategy is designed to further reduce latency. Cerebron is verified on image segmentation and classification tasks using a variety of state-of-the-art spiking network structures. Experimental results show that Cerebron achieves at least a 17.5× reduction in prediction energy and a 20× speedup compared with state-of-the-art field-programmable gate array (FPGA)-based accelerators.
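To make the general idea of spike-sparsity exploitation concrete, the following Python/NumPy sketch shows an event-driven spiking convolution that first skips input channels carrying no spikes at a given time step and then accumulates weights only at the spatial positions that actually spiked. This is an illustrative sketch only, not the Cerebron hardware dataflow described in the paper; the function sparse_spiking_conv, its arguments, and the simple threshold-and-reset neuron model are hypothetical choices made for illustration.

import numpy as np

def sparse_spiking_conv(spikes, weights, threshold=1.0, v_mem=None):
    # Illustrative event-driven convolution over binary spike maps.
    # Step 1: skip input channels that carry no spikes at this time step.
    # Step 2: within an active channel, accumulate weights only at the
    #         positions that spiked (multiply-accumulate becomes accumulate).
    C_in, H, W = spikes.shape
    C_out, _, K, _ = weights.shape
    out_h, out_w = H - K + 1, W - K + 1          # "valid" convolution output size
    if v_mem is None:
        v_mem = np.zeros((C_out, out_h, out_w))  # membrane potentials

    for c_in in range(C_in):
        if not spikes[c_in].any():               # step 1: whole channel is silent
            continue
        ys, xs = np.nonzero(spikes[c_in])        # step 2: only spiking positions
        for y, x in zip(ys, xs):
            # Each input spike scatters its weight kernel into the membrane
            # potentials of the output neurons whose receptive field covers it.
            y0, x0 = max(0, y - K + 1), max(0, x - K + 1)
            y1, x1 = min(out_h, y + 1), min(out_w, x + 1)
            for oy in range(y0, y1):
                for ox in range(x0, x1):
                    v_mem[:, oy, ox] += weights[:, c_in, y - oy, x - ox]

    out_spikes = (v_mem >= threshold).astype(np.uint8)
    v_mem[out_spikes == 1] = 0.0                 # reset neurons that fired
    return out_spikes, v_mem

Because the inner loops run only over spiking positions, the work per time step scales with the number of spikes rather than the full feature-map size, which is the kind of data-dependent, unbalanced workload that motivates the channelwise scheduling discussed in the abstract.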

Files

Cerebron_A_Reconfigurable_Arch... (pdf, 4.58 MB)
Embargo expired on 01-07-2023