HAS-RL: A Hierarchical Approximate Scheme Optimized With Reinforcement Learning for NoC-Based NN Accelerators

Li, Siyue; Zhou, Shize; Xue, Yongqi; Fan, Wenjie; Cheng, Tong; Ji, Jinlun; Dai, Chenyang; Song, Wenqing; Gao, C.

doi:10.1109/TCSI.2024.3359912

Title

HAS-RL: A Hierarchical Approximate Scheme Optimized With Reinforcement Learning for NoC-Based NN Accelerators

Author

Li, Siyue (Nanjing University)
Zhou, Shize (Nanjing University)
Xue, Yongqi (Nanjing University)
Fan, Wenjie (Nanjing University)
Cheng, Tong (Nanjing University)
Ji, Jinlun (Nanjing University)
Dai, Chenyang (Nanjing University)
Song, Wenqing (Nanjing University)
Gao, C. (TU Delft Electronics)

Date

2024

Abstract

Network-on-Chip (NoC) is a scalable on-chip communication architecture for neural network (NN) accelerators, but communication delay grows as the number of nodes increases. Applications such as machine learning are resilient, to a degree, to noisy or erroneous transmitted data. Approximate communication is therefore a promising way to improve performance: it reduces traffic load subject to a maximum acceptable accuracy loss for the neural network. The key issue for approximate NoC systems is balancing result quality against communication delay. Traditional approximate NoCs consider only node-to-node, approximation-based dynamic traffic regulation. However, traffic patterns change dynamically across nodes, over time, and between applications, which produces a huge search space and makes an optimal global approximation solution hard to find. In this paper, we propose a quality model for different neural networks that captures the relationship between quality loss and the data approximation rate. We then propose a hierarchical approximate scheme optimized with reinforcement learning (HAS-RL) and reduce its complexity by shrinking the state and action spaces, which also lowers the resource overhead. Finally, we embed a global approximate controller in the NoC system; it deploys a policy network trained with an offline reinforcement learning algorithm to adjust the data approximation rate of each node at run time. Compared with the state-of-the-art method, the proposed scheme reduces the average network delay by $13.5\%$ at similar accuracy. HAS-RL adds an area overhead of only $1.24\%$ and a power overhead of only $0.77\%$ relative to the traditional router design.
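
The trade-off described in the abstract (approximate more where traffic is heavy, but never beyond the accuracy-loss budget) can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the linear `quality_loss` model, the discrete `RATE_LEVELS`, the per-node congestion input, and the hand-written rule in `choose_rates`, which merely stands in for the trained policy network.

```python
# Illustrative sketch only: names, rate levels, and formulas are hypothetical,
# not the paper's quality model or policy network.

RATE_LEVELS = (0.0, 0.25, 0.5, 0.75)  # assumed discrete approximation rates


def quality_loss(rate, sensitivity):
    """Assumed quality model: loss grows linearly with the approximation rate."""
    return sensitivity * rate


def max_allowed_rate(sensitivity, loss_budget):
    """Largest rate level whose modelled loss stays within the budget."""
    allowed = [r for r in RATE_LEVELS if quality_loss(r, sensitivity) <= loss_budget]
    return max(allowed)  # rate 0.0 always qualifies when loss_budget >= 0


def choose_rates(congestion, sensitivity, loss_budget):
    """Hand-written stand-in for a learned policy.

    congestion holds one value in [0, 1] per node. More congested nodes
    get a higher rate, capped by what the quality model allows.
    """
    cap = max_allowed_rate(sensitivity, loss_budget)
    rates = []
    for level in congestion:
        # Highest rate level that does not exceed the node's congestion.
        wanted = max(r for r in RATE_LEVELS if r <= level)
        rates.append(min(wanted, cap))
    return rates


if __name__ == "__main__":
    # Three nodes with light, medium, and heavy congestion.
    print(choose_rates([0.1, 0.6, 0.9], sensitivity=0.04, loss_budget=0.025))
```

In this toy setting the heavily congested node would like the highest rate, but the quality cap limits it to the same rate as the medium node. A learned policy would replace the fixed rule in `choose_rates` while still respecting such a cap.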

Subject

Offline reinforcement learning
neural network
approximate communication
network-on-chip

To reference this document use:

http://resolver.tudelft.nl/uuid:9b96d465-8916-4499-8147-6b0f9faee334

DOI

https://doi.org/10.1109/TCSI.2024.3359912

Embargo date

2024-08-19

ISSN

1558-0806

Source

IEEE Transactions on Circuits and Systems I: Regular Papers, 71 (4), 1863-1875

Bibliographical note

Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Part of collection

Institutional Repository

Document type

journal article

Files

file embargo until 2024-08-19