Title
HAS-RL: A Hierarchical Approximate Scheme Optimized With Reinforcement Learning for NoC-Based NN Accelerators
Author
Li, Siyue (Nanjing University)
Zhou, Shize (Nanjing University)
Xue, Yongqi (Nanjing University)
Fan, Wenjie (Nanjing University)
Cheng, Tong (Nanjing University)
Ji, Jinlun (Nanjing University)
Dai, Chenyang (Nanjing University)
Song, Wenqing (Nanjing University)
Gao, C. (TU Delft Electronics)
Date
2024
Abstract
Network-on-Chip (NoC) is a scalable on-chip communication architecture for neural network (NN) accelerators, but communication delay grows as the number of nodes increases. Applications such as machine learning have a certain resilience to noisy or erroneous transmitted data. Approximate communication is therefore a promising way to improve performance by reducing traffic load, subject to a maximum acceptable accuracy loss for the neural network. A key issue for approximate NoC systems is balancing result quality against communication delay. Traditional approximate NoCs consider only node-to-node, approximation-based dynamic traffic regulation. However, traffic patterns change dynamically across nodes, times, and applications, which leads to a huge search space and makes it hard to explore an optimal global approximation solution. In this paper, we propose a quality model for different neural networks that captures the relationship between the quality loss and the data approximate rate. We then propose a hierarchical approximate scheme optimized with reinforcement learning (HAS-RL) and reduce its complexity by shrinking the state space and action space, which also reduces the resource overhead. Finally, we embed a global approximate controller in the NoC system; it deploys a policy network trained with an offline reinforcement learning algorithm to adjust the data approximate rate of each node at run time. Compared with the state-of-the-art method, the proposed scheme reduces the average network delay by $13.5\%$ at similar accuracy. HAS-RL incurs an additional area overhead of only $1.24\%$ and additional power consumption of $0.77\%$ compared with the traditional router design.
Subject
Offline reinforcement learning
neural network
approximate communication
network-on-chip
To reference this document use:
http://resolver.tudelft.nl/uuid:9b96d465-8916-4499-8147-6b0f9faee334
DOI
https://doi.org/10.1109/TCSI.2024.3359912
Embargo date
2024-08-19
ISSN
1558-0806
Source
IEEE Transactions on Circuits and Systems Part 1: Regular Papers, 71 (4), 1863-1875
Bibliographical note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Part of collection
Institutional Repository
Document type
journal article
Rights
© 2024 Siyue Li, Shize Zhou, Yongqi Xue, Wenjie Fan, Tong Cheng, Jinlun Ji, Chenyang Dai, Wenqing Song, C. Gao, More Authors