A 23-μW Keyword Spotting IC With Ring-Oscillator-Based Time-Domain Feature Extraction

Journal Article (2022)
Author(s)

Kwantae Kim (Universitat Zurich, Korea Advanced Institute of Science and Technology)

C. Gao (TU Delft - Electronics)

Rui Graca (Universitat Zurich)

Ilya Kiselev (Universitat Zurich)

Hoi Jun Yoo (Korea Advanced Institute of Science and Technology)

Tobi Delbruck (Universitat Zurich)

Shih Chii Liu (Universitat Zurich)

Research Group
Electronics
Copyright
© 2022 Kwantae Kim, C. Gao, Rui Graca, Ilya Kiselev, Hoi Jun Yoo, Tobi Delbruck, Shih Chii Liu
DOI related publication
https://doi.org/10.1109/JSSC.2022.3195610
More Info
Publication Year
2022
Language
English
Issue number
11
Volume number
57
Pages (from-to)
3298-3311
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This article presents the first keyword spotting (KWS) IC that uses a ring-oscillator-based time-domain processing technique for its analog feature extractor (FEx). Its extensive use of time-encoding schemes allows the analog audio signal to be processed entirely in the time domain, except for the voltage-to-time conversion stage of the analog front end. Because its fundamental building blocks are based on digital logic gates, it offers better technology scalability than conventional voltage-domain designs. Fabricated in a 65-nm CMOS process, the prototype KWS IC occupies 2.03 mm² and dissipates 23 μW, including the analog FEx and the digital neural network classifier. The 16-channel time-domain FEx achieves a 54.89-dB dynamic range for a 16-ms frame shift while consuming 9.3 μW. Measurement results verify that the proposed IC performs a 12-class KWS task on the Google Speech Command dataset (GSCD) with >86% accuracy and 12.4-ms latency.
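
As a rough illustration of what the headline figures imply, the sketch below derives a few back-of-envelope quantities from the numbers quoted in the abstract. It is not taken from the article: treating the 16-ms frame shift as the energy accounting interval, dividing the FEx power evenly across the 16 channels, and reading the dynamic range as an amplitude ratio (20·log10) are all assumptions made here for illustration, and the function names are hypothetical.

```python
# Figures quoted in the abstract.
TOTAL_POWER_W = 23e-6       # whole IC (analog FEx + digital classifier)
FEX_POWER_W = 9.3e-6        # 16-channel time-domain feature extractor
FRAME_SHIFT_S = 16e-3       # frame shift used for the dynamic-range figure
NUM_CHANNELS = 16
DYNAMIC_RANGE_DB = 54.89


def energy_per_frame(power_w: float, frame_s: float) -> float:
    """Energy in joules spent during one frame shift (assumed interval)."""
    return power_w * frame_s


def db_to_amplitude_ratio(db: float) -> float:
    """Convert dB to a linear ratio, assuming an amplitude (20*log10) definition."""
    return 10 ** (db / 20)


# Share of the total power budget taken by the feature extractor.
fex_share = FEX_POWER_W / TOTAL_POWER_W

# Average power per FEx channel, assuming an even split (an assumption).
fex_power_per_channel_w = FEX_POWER_W / NUM_CHANNELS

if __name__ == "__main__":
    total_nj = energy_per_frame(TOTAL_POWER_W, FRAME_SHIFT_S) * 1e9
    fex_nj = energy_per_frame(FEX_POWER_W, FRAME_SHIFT_S) * 1e9
    print(f"IC energy per 16-ms frame:  {total_nj:.1f} nJ")
    print(f"FEx energy per 16-ms frame: {fex_nj:.1f} nJ")
    print(f"FEx share of total power:   {fex_share:.1%}")
    print(f"FEx power per channel:      {fex_power_per_channel_w * 1e6:.3f} uW")
    print(f"Dynamic range as a ratio:   {db_to_amplitude_ratio(DYNAMIC_RANGE_DB):.0f}:1")
```

Under these assumptions the feature extractor accounts for roughly 40% of the chip's power budget, with the remainder going to the rest of the IC.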

Files

A_23_W_Keyword_Spotting_IC_Wit... (pdf)
(pdf | 6.15 Mb)
- Embargo expired on 01-07-2023
License info not available