A 23-μW Keyword Spotting IC With Ring-Oscillator-Based Time-Domain Feature Extraction

Journal Article (2022)

Author(s)

Kwantae Kim (Universitat Zurich, Korea Advanced Institute of Science and Technology)

Chang Gao (TU Delft - Electronics)

Rui Graca (Universitat Zurich)

Ilya Kiselev (Universitat Zurich)

Hoi Jun Yoo (Korea Advanced Institute of Science and Technology)

Tobi Delbruck (Universitat Zurich)

Shih Chii Liu (Universitat Zurich)

Research Group

Electronics

DOI related publication

https://doi.org/10.1109/JSSC.2022.3195610

Analog; Time domain; Rectifier; Classifier; Ring oscillator; Recurrent neural network (RNN); Bandpass filter (BPF); Feature extractor (FEx); Google Speech Command dataset (GSCD); Keyword spotting (KWS)

To reference this document use:

https://resolver.tudelft.nl/uuid:a7f8ad3a-cf55-4350-a71b-6b605ca908bb

Publication Year

2022

Language

English

Issue number

11

Volume number

57

Pages (from-to)

3298-3311

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This article presents the first keyword spotting (KWS) IC that uses a ring-oscillator-based time-domain processing technique for its analog feature extractor (FEx). Its extensive use of time-encoding schemes allows the analog audio signal to be processed entirely in the time domain, except for the voltage-to-time conversion stage of the analog front end. Because its fundamental building blocks are based on digital logic gates, it offers better technology scalability than conventional voltage-domain designs. Fabricated in a 65-nm CMOS process, the prototype KWS IC occupies 2.03 mm² and dissipates 23 μW, including the analog FEx and the digital neural network classifier. The 16-channel time-domain FEx achieves a 54.89-dB dynamic range for a 16-ms frame shift while consuming 9.3 μW. Measurement results verify that the proposed IC performs a 12-class KWS task on the Google Speech Command dataset (GSCD) with >86% accuracy and 12.4-ms latency.
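
As a rough aid to reading the figures quoted in the abstract, the following Python sketch derives a few secondary quantities from them. It is plain arithmetic on the published numbers, not a result reported by the authors. The function names are hypothetical, and the conversion of the dynamic range to a linear ratio assumes the 20·log10 (amplitude) convention, which the abstract does not specify.

```python
# Back-of-the-envelope arithmetic on figures quoted in the abstract.
# All derived values are illustrative, not measurements from the paper.

TOTAL_POWER_W = 23e-6      # whole IC: analog FEx + digital classifier
FEX_POWER_W = 9.3e-6       # 16-channel time-domain FEx
NUM_CHANNELS = 16
FRAME_SHIFT_S = 16e-3      # frame shift at which the FEx figures are quoted
DYNAMIC_RANGE_DB = 54.89


def energy_per_frame_nj(power_w: float, frame_s: float) -> float:
    """Energy spent during one frame shift, in nanojoules."""
    return power_w * frame_s * 1e9


def db_to_amplitude_ratio(db: float) -> float:
    """Linear ratio for a dB value, ASSUMING the 20*log10 convention."""
    return 10 ** (db / 20)


fex_energy_nj = energy_per_frame_nj(FEX_POWER_W, FRAME_SHIFT_S)
total_energy_nj = energy_per_frame_nj(TOTAL_POWER_W, FRAME_SHIFT_S)
fex_power_share = FEX_POWER_W / TOTAL_POWER_W
fex_power_per_channel_uw = FEX_POWER_W / NUM_CHANNELS * 1e6
dynamic_range_ratio = db_to_amplitude_ratio(DYNAMIC_RANGE_DB)

if __name__ == "__main__":
    print(f"FEx energy per frame shift:  {fex_energy_nj:.1f} nJ")
    print(f"Chip energy per frame shift: {total_energy_nj:.1f} nJ")
    print(f"FEx share of total power:    {fex_power_share:.1%}")
    print(f"FEx power per channel:       {fex_power_per_channel_uw:.3f} uW")
    print(f"Dynamic range (linear):      {dynamic_range_ratio:.0f}:1")
```

Under these assumptions, the FEx accounts for roughly 40% of the chip's power budget, with the remainder attributable to the rest of the IC, including the digital classifier.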

Files

A_23_W_Keyword_Spotting_IC_Wit... (pdf)

(pdf | 6.15 Mb)

- Embargo expired on 01-07-2023

License info not available