SASCNN: A Systolic Array Simulator for CNN


Abstract

Convolutional Neural Networks (CNNs) are used in many applications, ranging from real-time object detection to robot motion planning. CNNs are typically implemented on high-performance systems such as multi-core CPUs and GPUs, which consume considerable power and thus cannot be deployed in edge devices with limited battery capacity. An edge device has to provide real-time performance while remaining low power, which motivates the exploration of novel architectures tailored to CNN processing. Recent work towards this goal has focused on CNN accelerators built on systolic array spatial architectures. The row-column stationary data-flow approach maximizes the reuse of weights, input feature maps and output feature maps across the array. Different applications have different performance, area and energy requirements, which makes it imperative to prototype architectural ideas quickly and to perform design space exploration. The challenge lies in the non-trivial interactions between architectural design parameters, which play an important part in complex design decisions. Hence, a hardware simulator for CNN acceleration is designed in this work. It is based on a systolic array and uses a row-column stationary data-flow with a near-memory computing approach. The simulator supports different numerical precisions, such as 16-bit and 8-bit floating-point, along with numerous design parameters: the size of the systolic array, the latency of the MAC operation, the PE local memory size, the PE local memory latency and the external memory latency. The functionality of the proposed design is verified on AlexNet. The Destiny memory modelling tool, along with an energy and area estimation model, is used to perform a system study that investigates the trade-offs between different architectural design parameters.
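
To make the design parameters listed above concrete, the following is a minimal sketch of how such a configuration might be recorded and used for a first-order estimate. It is not the SASCNN interface: the names `SimConfig`, `conv_macs` and `ideal_cycles`, all field names and the default values are hypothetical. The cycle estimate assumes every PE is busy on every step and that MAC operations are not pipelined, so it is only an idealized lower bound that ignores the memory latencies and the data-flow itself.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class SimConfig:
    """Hypothetical record of the design parameters named in the abstract."""
    array_rows: int = 16          # systolic array height (number of PEs); assumed default
    array_cols: int = 16          # systolic array width (number of PEs); assumed default
    precision_bits: int = 16      # 16-bit or 8-bit floating-point
    mac_latency: int = 1          # cycles per MAC operation
    pe_mem_bytes: int = 512       # PE local memory size; assumed default
    pe_mem_latency: int = 1       # cycles per PE local memory access
    ext_mem_latency: int = 100    # cycles per external memory access; assumed default


def conv_macs(out_h: int, out_w: int, out_ch: int,
              k_h: int, k_w: int, in_ch: int) -> int:
    """Number of multiply-accumulate operations in one convolution layer."""
    return out_h * out_w * out_ch * k_h * k_w * in_ch


def ideal_cycles(macs: int, cfg: SimConfig) -> int:
    """Idealized lower bound: all PEs busy, no memory stalls, MACs not pipelined."""
    pes = cfg.array_rows * cfg.array_cols
    return ceil(macs / pes) * cfg.mac_latency


if __name__ == "__main__":
    cfg = SimConfig()
    # First convolution layer of AlexNet: 55x55x96 outputs, 11x11x3 kernels.
    macs = conv_macs(55, 55, 96, 11, 11, 3)
    print(macs, ideal_cycles(macs, cfg))
```

A real simulator would replace `ideal_cycles` with a cycle-by-cycle model in which the local and external memory latencies and the chosen data-flow determine how often the PEs actually stall.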