Cerebron

A Reconfigurable Architecture for Spatio-Temporal Sparse Spiking Neural Networks

Journal Article (2022)

Author(s)

Qinyu Chen (University of Shanghai for Science and Technology)

C. Gao (TU Delft - Electronics)

Yuxiang Fu (Nanjing University)

Research Group

Electronics

DOI related publication

https://doi.org/10.1109/TVLSI.2022.3196839

Field-programmable gate array (FPGA); Spiking neural network (SNN); Workload balancing

To reference this document use:

https://resolver.tudelft.nl/uuid:24b834f4-4b3f-4aca-80c5-45ed89c5a22f

Publication Year

2022

Language

English

Issue number

10

Volume number

30

Pages (from-to)

1425 - 1437

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Spiking neural networks (SNNs) are promising alternatives to artificial neural networks (ANNs) since they are more realistic brain-inspired computing models. SNNs have sparse neuron firing over time, i.e., spatiotemporal sparsity; thus, they are helpful in enabling energy-efficient hardware inference. However, exploiting the spatiotemporal sparsity of SNNs in hardware leads to unpredictable and unbalanced workloads, degrading the energy efficiency. Compared to SNNs with simple fully connected structures, those with more extensive structures (e.g., standard convolutions, depthwise convolutions, and pointwise convolutions) can deal with more complicated tasks but lead to difficulties in hardware mapping. In this work, we propose a novel reconfigurable architecture, Cerebron, which can fully exploit the spatiotemporal sparsity in SNNs with maximized data reuse, and we propose optimization techniques to improve the efficiency and flexibility of the hardware. To achieve flexibility, the reconfigurable compute engine is compatible with a variety of spiking layers and supports inter-computing-unit (CU) and intra-CU reconfiguration. To achieve memory efficiency, the compute engine can exploit data reuse and guarantee parallel data access when processing different convolutions. A two-step data sparsity exploitation method is introduced to leverage the sparsity of discrete spikes and reduce the computation time. In addition, an online channelwise workload scheduling strategy is designed to reduce the latency further. Cerebron is verified on image segmentation and classification tasks using a variety of state-of-the-art spiking network structures. Experimental results show that Cerebron achieves at least a 17.5× reduction in prediction energy and a 20× speedup compared with state-of-the-art field-programmable gate array (FPGA)-based accelerators.
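The workload imbalance mentioned in the abstract can be illustrated with a small sketch. It is not Cerebron's scheduler, whose details the abstract does not give. It only assumes that the work for a channel is proportional to its spike count, and it swaps in a generic greedy longest-first heuristic as a stand-in. The names `channel_workloads` and `balance_channels` are hypothetical.

```python
import heapq


def channel_workloads(spike_map):
    """Count spikes per channel of a binary spike map given as nested lists [C][H][W].

    Assumption: the work needed for a channel is proportional to its spike count.
    """
    return [sum(sum(row) for row in channel) for channel in spike_map]


def balance_channels(workloads, num_units):
    """Assign channels to compute units with a greedy longest-first heuristic.

    The heaviest remaining channel always goes to the unit with the smallest
    accumulated load. Generic stand-in, not the strategy described in the paper.
    """
    # Min-heap of (accumulated load, unit index).
    heap = [(0, unit) for unit in range(num_units)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_units)]

    # Visit channels from heaviest to lightest.
    order = sorted(range(len(workloads)), key=lambda c: -workloads[c])
    for channel in order:
        load, unit = heapq.heappop(heap)
        assignment[unit].append(channel)
        heapq.heappush(heap, (load + workloads[channel], unit))

    loads = [sum(workloads[c] for c in chans) for chans in assignment]
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical per-channel spike counts for one layer at one time step.
    counts = [8, 7, 1, 1, 1, 0]
    assignment, loads = balance_channels(counts, num_units=2)
    print(assignment, loads)
```

With these hypothetical counts, splitting the channels into two contiguous halves gives loads of 16 and 2, so one unit sits idle most of the time. Assigning by spike count evens this out, which is the kind of effect a channelwise scheduler aims for.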

Files

Cerebron_A_Reconfigurable_Arch... (pdf)

(pdf | 4.58 Mb)

- Embargo expired on 01-07-2023

License info not available