An 8.62-μW 75-dB DR_SoC Fully Integrated SoC for Spoken Language Understanding

Journal Article (2025)

Author(s)

Sheng Zhou (Universitat Zurich, ETH Zürich)

Zixiao Li (Universitat Zurich, ETH Zürich)

Longbiao Cheng (Universitat Zurich, ETH Zürich)

Jerome Hadorn (Universitat Zurich, ETH Zürich)

Chang Gao (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Qinyu Chen (Universiteit Leiden)

Tobi Delbruck (Universitat Zurich, ETH Zürich)

Kwantae Kim (Aalto University)

Shih Chii Liu (ETH Zürich, Universitat Zurich)

Research Group

Electronics

Ultra-low power; Recurrent neural network (RNN); Feature extractor (FEx); Voice interface; Automatic gain control (AGC); Edge artificial intelligence (AI); Hardware–software co-design; Spoken language understanding (SLU); Tiny machine learning (TinyML)

DOI related publication

https://doi.org/10.1109/JSSC.2025.3602936 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:eae7356c-26a5-4526-bde4-821a418b8eae

Publication Year

2025

Language

English

Bibliographical Note

Green Open Access added to TU Delft Institutional Repository as part of the Taverne amendment. More information about this copyright law amendment can be found at https://www.openaccess.nl. Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Journal title

IEEE Journal of Solid-State Circuits

Issue number

11

Volume number

60

Pages (from-to)

4002-4017

Downloads counter

157

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

We present a sub-10-µW fully integrated SoC for on-device spoken language understanding (SLU). Its analog feature extractor (FEx) applies global and per-channel automatic gain control (AGC) to extend the system’s dynamic range (DR), a critical requirement for real-world scenarios, including far-field operations. The on-chip streaming-mode recurrent neural network (RNN) accelerator exploits temporal sparsity and pooling, reducing its power by 2.3×. By combining hardware-aware training with a behavioral model of the FEx that captures circuit nonidealities, the network is trained to maintain SLU accuracy despite chip-to-chip variation. Fabricated in a 65-nm CMOS process, the SoC occupies 2.23 mm² and consumes 8.62 µW for end-to-end SLU. The 16-channel FEx achieves 93-dB DR while dissipating 1.85 µW at 100-Hz feature frame rate. The SoC is evaluated on the 32-class Fluent Speech Commands dataset (FSCD), achieving 92.9% accuracy for 2.8-mV_rms inputs while maintaining >85% accuracy over a 75-dB input range.
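To put the headline figures in perspective, the short sketch below converts the quoted decibel ranges into linear ratios and the FEx power into energy per feature frame. It is only illustrative arithmetic on the numbers in the abstract: the function names are hypothetical, and the conversion assumes the dB figures refer to signal amplitude (20·log10), which the abstract does not state explicitly.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert decibels to a linear amplitude ratio.

    Assumption: the dB figure describes amplitude (voltage), so the
    20*log10 convention applies. This is not stated in the abstract.
    """
    return 10.0 ** (db / 20.0)


def energy_per_frame_nj(power_uw: float, frame_rate_hz: float) -> float:
    """Average energy spent per feature frame, in nanojoules."""
    power_w = power_uw * 1e-6
    return power_w / frame_rate_hz * 1e9


# Figures taken from the abstract.
soc_ratio = db_to_amplitude_ratio(75.0)        # SoC input range
fex_ratio = db_to_amplitude_ratio(93.0)        # FEx dynamic range
fex_energy = energy_per_frame_nj(1.85, 100.0)  # FEx at 100-Hz frame rate

print(f"75 dB corresponds to an amplitude ratio of about {soc_ratio:.0f}:1")
print(f"93 dB corresponds to an amplitude ratio of about {fex_ratio:.0f}:1")
print(f"FEx energy per feature frame: {fex_energy:.1f} nJ")
```

Under that amplitude assumption, the 75-dB range spans inputs differing by a factor of several thousand, and 1.85 µW at 100 frames per second works out to 18.5 nJ per feature frame.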

Files

An_8.62-W_75-dB_DRSoC_Fully_In... (pdf)

(pdf | 5.69 Mb)

- Embargo expired on 16-03-2026

License info not available
