A 23-μW Keyword Spotting IC With Ring-Oscillator-Based Time-Domain Feature Extraction
Kwantae Kim (Universitat Zurich, Korea Advanced Institute of Science and Technology)
Chang Gao (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Rui Graca (Universitat Zurich)
Ilya Kiselev (Universitat Zurich)
Hoi Jun Yoo (Korea Advanced Institute of Science and Technology)
Tobi Delbruck (Universitat Zurich)
Shih Chii Liu (Universitat Zurich)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
This article presents the first keyword spotting (KWS) IC that uses a ring-oscillator-based time-domain processing technique for its analog feature extractor (FEx). Its extensive use of time-encoding schemes allows the analog audio signal to be processed entirely in the time domain, except for the voltage-to-time conversion stage of the analog front end. Because its fundamental building blocks are based on digital logic gates, it offers better technology scalability than conventional voltage-domain designs. Fabricated in a 65-nm CMOS process, the prototype KWS IC occupies 2.03 mm$^{2}$ and consumes 23 $\mu \text{W}$, including the analog FEx and the digital neural network classifier. The 16-channel time-domain FEx achieves a 54.89-dB dynamic range for a 16-ms frame shift size while consuming 9.3 $\mu \text{W}$. Measurement results verify that the proposed IC performs a 12-class KWS task on the Google Speech Command dataset (GSCD) with >86% accuracy and 12.4-ms latency.
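To put the headline figures in context, the following is a rough back-of-the-envelope sketch that uses only the numbers quoted in the abstract. It rests on two assumptions that the abstract does not state: that the FEx power is split evenly across the 16 channels, and that the dynamic range is an amplitude ratio (20·log10 convention). The function name `summarize` and the constant names are illustrative, not taken from the paper.

```python
import math

# Figures quoted in the abstract.
TOTAL_POWER_UW = 23.0     # whole IC: analog FEx + digital NN classifier
FEX_POWER_UW = 9.3        # 16-channel time-domain FEx
NUM_CHANNELS = 16
DYNAMIC_RANGE_DB = 54.89  # FEx dynamic range at 16-ms frame shift


def summarize():
    """Derive a few secondary figures from the abstract's numbers."""
    # Assumption: FEx power is shared equally by all channels.
    per_channel_uw = FEX_POWER_UW / NUM_CHANNELS
    # Power left for everything outside the FEx (approximate, since
    # the 23-uW total is a rounded figure).
    non_fex_uw = TOTAL_POWER_UW - FEX_POWER_UW
    fex_share = FEX_POWER_UW / TOTAL_POWER_UW
    # Assumption: the dB value is an amplitude ratio (20*log10).
    amplitude_ratio = 10 ** (DYNAMIC_RANGE_DB / 20)
    return {
        "per_channel_uw": per_channel_uw,
        "non_fex_uw": non_fex_uw,
        "fex_share": fex_share,
        "amplitude_ratio": amplitude_ratio,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the FEx accounts for roughly 40% of the total power, at a little under 0.6 $\mu \text{W}$ per channel.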