Characterising the Role of Pre-Processing Parameters in Audio-based Embedded Machine Learning

Conference Paper (2021)
Author(s)

Wiebke Toussaint (TU Delft - Information and Communication Technology)

Akhil Mathur (Nokia Bell Labs)

Aaron Yi Ding (TU Delft - Information and Communication Technology)

F. Kawsar (Nokia Bell Labs)

Research Group
Information and Communication Technology
Copyright
© 2021 Wiebke Hutiri, Akhil Mathur, Aaron Yi Ding, F. Kawsar
DOI (related publication)
https://doi.org/10.1145/3485730.3493448
Publication Year
2021
Language
English
Pages (from-to)
439-445
ISBN (electronic)
9781450390972
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

When deploying machine learning (ML) models on embedded and IoT devices, performance encompasses more than an accuracy metric: inference latency, energy consumption, and model fairness must also be accounted for to ensure reliable performance under heterogeneous and resource-constrained operating conditions. To this end, prior research has studied model-centric approaches, such as tuning the hyperparameters of the model during training and later applying model compression techniques to tailor the model to the resource needs of an embedded device. In this paper, we take a data-centric view of embedded ML and study the role that pre-processing parameters in the data pipeline can play in balancing the various performance metrics of an embedded ML system. Through an in-depth case study with audio-based keyword spotting (KWS) models, we show that pre-processing parameter tuning is a remarkably effective tool that model developers can adopt to trade off between a model's accuracy, fairness, and system efficiency, as well as to make an embedded ML model resilient to unseen deployment conditions.
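
This record does not specify the paper's exact feature-extraction pipeline, but a typical audio KWS front end exposes pre-processing parameters such as the target sample rate, analysis frame length, frame stride, and number of mel filterbank channels. The sketch below, assuming a librosa-based MFCC front end with hypothetical parameter values, illustrates the kind of parameters whose tuning the abstract refers to.

# Illustrative sketch (not the paper's pipeline): MFCC extraction for keyword
# spotting with explicit pre-processing parameters. Parameter names and
# default values are assumptions chosen for illustration only.
import librosa
import numpy as np

def extract_features(wav_path,
                     sample_rate=16000,   # resampling rate (assumed)
                     frame_length_ms=40,  # analysis window length (assumed)
                     frame_stride_ms=20,  # hop between windows (assumed)
                     n_mels=40,           # mel filterbank channels (assumed)
                     n_mfcc=13):          # MFCC coefficients kept (assumed)
    """Return an MFCC feature matrix whose shape, and hence the downstream
    model's input size and compute cost, depends on the parameters above."""
    signal, sr = librosa.load(wav_path, sr=sample_rate)
    n_fft = int(sr * frame_length_ms / 1000)       # samples per analysis window
    hop_length = int(sr * frame_stride_ms / 1000)  # samples between window starts
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc,
                                n_fft=n_fft, hop_length=hop_length,
                                n_mels=n_mels).astype(np.float32)

Longer frames and strides shrink the feature matrix (lowering on-device inference latency and energy), while more mel bands and coefficients preserve spectral detail (often improving accuracy); sweeping such values is one way to explore the accuracy-fairness-efficiency trade-off space the abstract describes.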