Hardware Acceleration of Linear Recurrent Units
Abstract
State-space models (SSMs) combine attention-like parallelization with RNN-like inference efficiency, using internal states with linear update and output functions, similar to RNNs but without non-linearities in the update function. Linear Recurrent Units (LRUs), a type of SSM, are well-suited to keyword spotting due to their ability to handle long-range dependencies. However, hardware acceleration of LRUs remains unexplored and presents challenges due to components with high hardware cost, such as GELU, LayerNorm, and complex multiplication. This work modifies the LRU model architecture to enable a more efficient hardware implementation and designs an accelerator tailored to the modified architecture. We propose the GRELU, a new activation function well-suited to inference on hardware. The modified model architecture achieves an accuracy of 95.5% on the Google Speech Commands dataset. The accelerator's vector unit supports complex operations and reductions, with overheads of only 19.1% and 27.2%, respectively, over basic operations. Our results demonstrate the proposed hardware accelerator's efficiency and effectiveness for keyword spotting applications.
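To illustrate the linear update and output functions mentioned above, the sketch below shows one step of a generic LRU-style recurrence: a diagonal complex state update followed by a linear projection back to a real output, with no non-linearity inside the update. Parameter names (lam, B, C, D), shapes, and initialization are illustrative assumptions for a minimal example, not the exact architecture evaluated in this work.

```python
import numpy as np

def lru_step(h, x, lam, B, C, D):
    """One LRU-style recurrent step: linear state update, linear output
    (unlike a classical RNN, no non-linearity in the update)."""
    h_new = lam * h + B @ x            # element-wise complex (diagonal) recurrence
    y = (C @ h_new).real + D @ x       # project complex state back to a real output
    return h_new, y

# Toy usage with assumed dimensions: complex state of size 4, real input/output of size 3.
rng = np.random.default_rng(0)
state_dim, io_dim = 4, 3
lam = 0.9 * np.exp(1j * rng.uniform(0, np.pi, state_dim))   # stable poles, |lam| < 1
B = rng.standard_normal((state_dim, io_dim)) + 0j
C = rng.standard_normal((io_dim, state_dim)) + 0j
D = rng.standard_normal((io_dim, io_dim))

h = np.zeros(state_dim, dtype=complex)
for t in range(5):
    x = rng.standard_normal(io_dim)
    h, y = lru_step(h, x, lam, B, C, D)
```

The element-wise complex recurrence is what motivates hardware support for complex multiplication in the accelerator's vector unit.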