Hardware Acceleration of Linear Recurrent Units
Abstract
State-space models (SSMs) combine attention-like parallelization with RNN-like inference efficiency, using internal states with linear update and output functions, similar to RNNs but without non-linearities in the update function. Linear Recurrent Units (LRUs), a type of SSM, are well-suited to keyword spotting due to their ability to handle long-range dependencies. However, hardware acceleration of LRUs remains unexplored and presents challenges due to components with high hardware cost, such as GELU, LayerNorm, and complex multiplication. This work modifies the LRU model architecture to enable a more efficient hardware implementation and designs an accelerator tailored to the modified architecture. We propose the GRELU, a new activation function well-suited to inference on hardware. The modified model architecture achieves an accuracy of 95.5% on the Google Speech Commands dataset. The accelerator's vector unit supports complex operations and reductions, with overheads of only 19.1% and 27.2%, respectively, over basic operations. Our results demonstrate the proposed hardware accelerator's efficiency and effectiveness for keyword spotting applications.
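To illustrate the linear update and output functions mentioned above, the sketch below shows one step of a generic LRU-style recurrence: a diagonal complex state update followed by a linear projection back to a real output, with no non-linearity inside the update. Parameter names (lam, B, C, D), shapes, and initialization are illustrative assumptions for a minimal example, not the exact architecture evaluated in this work.

```python
import numpy as np

def lru_step(h, x, lam, B, C, D):
    """One LRU-style recurrent step: linear state update, linear output
    (unlike a classical RNN, no non-linearity in the update)."""
    h_new = lam * h + B @ x            # element-wise complex (diagonal) recurrence
    y = (C @ h_new).real + D @ x       # project complex state back to a real output
    return h_new, y

# Toy usage with assumed dimensions: complex state of size 4, real input/output of size 3.
rng = np.random.default_rng(0)
state_dim, io_dim = 4, 3
lam = 0.9 * np.exp(1j * rng.uniform(0, np.pi, state_dim))   # stable poles, |lam| < 1
B = rng.standard_normal((state_dim, io_dim)) + 0j
C = rng.standard_normal((io_dim, state_dim)) + 0j
D = rng.standard_normal((io_dim, io_dim))

h = np.zeros(state_dim, dtype=complex)
for t in range(5):
    x = rng.standard_normal(io_dim)
    h, y = lru_step(h, x, lam, B, C, D)
```

The element-wise complex recurrence is what motivates hardware support for complex multiplication in the accelerator's vector unit.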