Near-Precise Parameter Approximation for Multiple Multiplications on a Single DSP Block

Journal Article (2022)
Author(s)

E. Kalali (TU Delft - Signal Processing Systems)

R. van Leuken (TU Delft - Signal Processing Systems)

Research Group
Signal Processing Systems
Copyright
© 2022 E. Kalali, T.G.R.M. van Leuken
DOI related publication
https://doi.org/10.1109/TC.2021.3119187
Publication Year
2022
Language
English
Issue number
9
Volume number
71
Pages (from-to)
2036-2047
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

DSP blocks are an efficient way to implement multiply-accumulate (MAC) operations on FPGAs. However, because DSP blocks contain wide multipliers and adders, MAC operations on low bit-length parameters underutilize them. To address this, an efficient approximation technique is introduced. The technique manipulates and approximates the low bit-length parameters based on Single DSP - Multiple Multiplication (SDMM) execution. The accuracy of the technique was evaluated for different CNN weight bit precisions using the AlexNet and VGG-16 networks and the ImageNet ILSVRC-2012 dataset. In almost all cases the optimization can be applied without loss of accuracy; in a few cases it causes a slight accuracy loss. With these optimizations, multiple parameter multiplications are performed in a single DSP block at the cost of a small hardware overhead. The optimizations also change the format in which the parameters are stored in off-chip memory, providing up to 33% compression without any hardware cost. A prototype systolic array architecture employing the optimizations was implemented on a Xilinx Zynq FPGA. It reduced the number of DSP blocks by 66.6%, 75%, and 83.3% for 8-, 6-, and 4-bit input variables, respectively.
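
The underutilization described in the abstract can be illustrated with generic operand packing, in which several narrow weights share one wide multiplication. The sketch below is not the SDMM approximation itself, whose details are not given on this page. It assumes unsigned operands and ignores the fixed operand widths of a real DSP block. The function names `packed_multiply` and `multiplications_per_dsp` are hypothetical. The second function reads the reported DSP reductions under the assumption that the saving comes purely from performing n multiplications per block, so that the reduction equals 1 - 1/n.

```python
def packed_multiply(x, weights, bits):
    """Multiply one unsigned input by several unsigned weights
    using a single wide multiplication (generic operand packing)."""
    # Each product x * w needs at most 2 * bits bits, so giving every
    # weight a lane of that width guarantees the products never overlap.
    stride = 2 * bits
    assert 0 <= x < (1 << bits)

    packed = 0
    for i, w in enumerate(weights):
        assert 0 <= w < (1 << bits)
        packed |= w << (i * stride)

    wide = x * packed  # the one wide multiplication

    # Slice the individual products back out of the wide result.
    mask = (1 << stride) - 1
    return [(wide >> (i * stride)) & mask for i in range(len(weights))]


def multiplications_per_dsp(reduction_percent):
    """Multiplications per DSP block implied by a DSP reduction,
    assuming reduction = 1 - 1/n (an assumption, not stated in the abstract)."""
    return round(1 / (1 - reduction_percent / 100))


if __name__ == "__main__":
    print(packed_multiply(200, [13, 255], bits=8))
    for bit_width, reduction in ((8, 66.6), (6, 75.0), (4, 83.3)):
        print(bit_width, "bit:", multiplications_per_dsp(reduction), "per DSP")
```

Under that assumption, the reported reductions would correspond to 3, 4, and 6 multiplications per DSP block for 8-, 6-, and 4-bit inputs. Plain packing as shown needs a multiplier wide enough to hold every lane, which is the limit that a parameter approximation of the kind described in the abstract would have to work around.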

Files

Near_Precise_Parameter_Approxi... (pdf)
(pdf | 1.11 Mb)
- Embargo expired on 01-07-2023
License info not available