F. Kawsar | TU Delft Repository

Characterising the Role of Pre-Processing Parameters in Audio-based Embedded Machine Learning

Conference paper (2021) - Wiebke Toussaint, Akhil Mathur, Aaron Yi Ding, Fahim Kawsar

When deploying machine learning (ML) models on embedded and IoT devices, performance encompasses more than an accuracy metric: inference latency, energy consumption, and model fairness are necessary to ensure reliable performance under heterogeneous and resource-constrained operating conditions. To this end, prior research has studied model-centric approaches, such as tuning the hyperparameters of the model during training and later applying model compression techniques to tailor the model to the resource needs of an embedded device. In this paper, we take a data-centric view of embedded ML and study the role that pre-processing parameters in the data pipeline can play in balancing the various performance metrics of an embedded ML system. Through an in-depth case study with audio-based keyword spotting (KWS) models, we show that pre-processing parameter tuning is a remarkable tool that model developers can adopt to trade off between a model's accuracy, fairness, and system efficiency, as well as to make an embedded ML model resilient to unseen deployment conditions. ...

Pervasive Video and Audio

Journal article (2021) - Fahim Kawsar, Romit Roy Choudhury, Ganesh Ananthanarayanan

EPerceptive

Energy reactive embedded intelligence for batteryless sensors

Conference paper (2020) - Alessandro Montanari, Manuja Sharma, Dainius Jenkus, Mohammed Alloulah, Lorena Qendro, Fahim Kawsar

For a long time, we have studied tiny energy harvesters to liberate sensors from batteries. With remarkable progress in embedded deep learning, we are now re-imagining these sensors as intelligent compute nodes. Naturally, we are approaching a crossroads where sensor intelligence is meeting energy autonomy, enabling maintenance-free swarm intelligence and unleashing a plethora of applications ranging from precision agriculture to ubiquitous asset tracking to infrastructure monitoring. One of the critical challenges, however, is to adapt intelligence fidelity in response to available energy to maximise the overall system availability. To this end, we present the design and implementation of ePerceptive: a novel framework for best-effort embedded intelligence, i.e., inference fidelity varies in proportion to the instantaneous energy supplied. ePerceptive operates on two core principles. First, it enables training a single deep neural network (DNN) to operate on multiple input resolutions without compromising accuracy or incurring memory overhead. Second, it modifies a DNN architecture by injecting multiple exits to guarantee valid, albeit lower-fidelity inferences in the event of energy interruption. The combination of these techniques offers a smooth adaptation between inference latency and recognition accuracy while matching the computational load to the available power budget. We report the manifestation of ePerceptive in designing batteryless cameras and microphones built with TI MSP430 MCU and off-the-shelf RF and solar energy harvesters. Our evaluation of these batteryless sensors with multiple vision and acoustic workloads suggests that the dynamic adaptation of ePerceptive can increase the inference throughput by up to 80% compared to a static baseline while ensuring a maximum accuracy drop of less than 6%. ...
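The multi-exit idea described in the abstract above can be illustrated with a toy sketch. This is not the ePerceptive implementation: the function name `best_effort_inference`, the per-stage energy costs, and the stage/exit callables are all hypothetical, chosen only to show how a network with several exits could return the deepest prediction that the available energy allows.

```python
# A toy sketch of best-effort inference with multiple exits.
# All names and the energy model are assumptions for illustration;
# they are not taken from the ePerceptive paper.

def best_effort_inference(x, stages, energy_budget):
    """Run stages in order until the energy budget is exhausted.

    stages: list of (cost, layer_fn, exit_fn) tuples, where `cost` is the
    (hypothetical) energy needed to run that stage and its exit head.
    Returns (prediction, exits_reached); prediction is None if not even
    the first stage could be afforded.
    """
    prediction = None
    exits_reached = 0
    for cost, layer_fn, exit_fn in stages:
        if cost > energy_budget:
            break  # not enough energy: keep the last valid prediction
        energy_budget -= cost
        x = layer_fn(x)          # compute this stage's features
        prediction = exit_fn(x)  # every exit yields a valid inference
        exits_reached += 1
    return prediction, exits_reached


# Three hypothetical stages with increasing cost and fidelity.
stages = [
    (1.0, lambda v: v + 1, lambda v: ("coarse", v)),
    (2.0, lambda v: v * 2, lambda v: ("medium", v)),
    (4.0, lambda v: v - 3, lambda v: ("fine", v)),
]

low = best_effort_inference(1, stages, 3.5)
full = best_effort_inference(1, stages, 7.0)
```

With a budget of 3.5 the sketch stops at the second exit, and with a budget of 7.0 it reaches the final one, so fidelity tracks the energy supplied.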

A systematic study of unsupervised domain adaptation for robust human-activity recognition

Journal article (2020) - Youngjae Chang, Akhil Mathur, Anton Isopoussu, Junehwa Song, Fahim Kawsar

Wearable sensors are increasingly becoming the primary interface for monitoring human activities. However, in order to scale human activity recognition (HAR) using wearable sensors to millions of users and devices, it is imperative that HAR computational models are robust against real-world heterogeneity in inertial sensor data. In this paper, we study the problem of wearing diversity which pertains to the placement of the wearable sensor on the human body, and demonstrate that even state-of-the-art deep learning models are not robust against these factors. The core contribution of the paper lies in presenting a first-of-its-kind in-depth study of unsupervised domain adaptation (UDA) algorithms in the context of wearing diversity - we develop and evaluate three adaptation techniques on four HAR datasets to evaluate their relative performance towards addressing the issue of wearing diversity. More importantly, we also do a careful analysis to learn the downsides of each UDA algorithm and uncover several implicit data-related assumptions without which these algorithms suffer a major degradation in accuracy. Taken together, our experimental findings caution against using UDA as a silver bullet for adapting HAR models to new domains, and serve as practical guidelines for HAR practitioners as well as pave the way for future research on domain adaptation in HAR. ...

A closer look at quality-aware runtime assessment of sensing models in multi-device environments

Conference paper (2019) - Chulhong Min, Alessandro Montanari, Akhil Mathur, Fahim Kawsar

The increasing availability of multiple sensory devices on or near a human body has opened brand new opportunities to leverage redundant sensory signals for powerful sensing applications. For instance, personal-scale sensory inferences with motion and audio signals can be done individually on a smartphone, a smartwatch, and even an earbud - each offering unique sensor quality, model accuracy, and runtime behaviour. At execution time, however, it is incredibly challenging to assess these characteristics to select the best device for accurate and resource-efficient inferences. To this end, we look at a quality-aware collaborative sensing system that actively interplays across multiple devices and respective sensing models. It dynamically selects the best device as a function of model accuracy at any given context. We propose two complementary techniques for the runtime quality assessment. Borrowing principles from active learning, our first technique runs on three heuristic-based quality assessment functions that employ confidence, margin sampling, and entropy of models' output. Our second technique is built with a siamese neural network and acts on the premise that runtime sensing quality can be learned from historical data. Our evaluation across multiple motion and audio datasets shows that our techniques provide a 12% increase in overall accuracy through dynamic device selection at the average expense of 13 mW power on each device as compared to traditional single-device approaches. ...
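The three heuristic signals named in the abstract above (confidence, margin sampling, and entropy of a model's output) have standard definitions in active learning, sketched below. How the paper combines them is not stated here, so the `select_device` helper, the device names, and the example probabilities are hypothetical and only illustrate picking the device whose model output scores best.

```python
import math

# Standard active-learning uncertainty measures over a softmax output.
# The selection rule and the example data are assumptions for illustration.

def confidence(probs):
    """Probability of the most likely class (higher = more certain)."""
    return max(probs)

def margin(probs):
    """Gap between the two most likely classes (higher = more certain)."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def entropy(probs):
    """Shannon entropy in nats (lower = more certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_device(outputs, score=confidence):
    """Return the device whose model output has the highest score."""
    return max(outputs, key=lambda name: score(outputs[name]))

# Hypothetical class probabilities from three devices for the same event.
outputs = {
    "phone": [0.5, 0.3, 0.2],
    "watch": [0.9, 0.05, 0.05],
    "earbud": [0.4, 0.4, 0.2],
}

by_confidence = select_device(outputs, confidence)
by_margin = select_device(outputs, margin)
# Entropy is negated so that "highest score" still means "most certain".
by_entropy = select_device(outputs, lambda p: -entropy(p))
```

In this example all three heuristics agree on the same device; on real outputs they can disagree, which is one reason to evaluate them separately.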

Situation-Aware Emotion Regulation of Conversational Agents with Kinetic Earables

Conference paper (2019) - Shin Katayama, Akhil Mathur, Marc Van Den Broeck, Tadashi Okoshi, Jin Nakazawa, Fahim Kawsar

Conversational agents are increasingly becoming digital partners of our everyday computing experiences offering a variety of purposeful information and utility services. Although rich in competency, these agents are entirely oblivious to their users' situational and emotional context today and incapable of adjusting their interaction style and tone contextually. To this end, we present a mixed-method study that informs the design of a situation- and emotion-aware conversational agent for kinetic earables. We surveyed 280 users, and qualitatively interviewed 12 users to understand their expectations of a conversational agent in adapting the interaction style. Grounded on our findings, we develop a first-of-its-kind emotion regulator for a conversational agent on a kinetic earable that dynamically adjusts its conversation style, tone, and volume in response to users' emotional, environmental, social and activity context gathered through speech prosody, motion signals and ambient sound. We describe these context models, the end-to-end system including a purpose-built kinetic earable and their real-world assessment. The experimental results demonstrate that our regulation mechanism invariably elicits better and affective user experience in comparison to baseline conditions in different real-world settings. ...

AudiDoS

Real-time denial-of-service adversarial attacks on deep audio models

Conference paper (2019) - Taesik Gong, Alberto Gil C.P. Ramos, Sourav Bhattacharya, Akhil Mathur, Fahim Kawsar

Deep learning has enabled personal and IoT devices to rethink microphones as a multi-purpose sensor for understanding conversation and the surrounding environment. This resulted in a proliferation of Voice Controllable Systems (VCS) around us. The increasing popularity of such systems is also prone to attracting miscreants, who often want to take advantage of the VCS without the knowledge of the user. Consequently, understanding the robustness of VCS, especially under adversarial attacks, has become an important research topic. Although there exists some previous work on audio adversarial attacks, their scopes are limited to embedding the attacks onto pre-recorded music clips, which when played through speakers cause VCS to misbehave. As an attack-audio needs to be played, the occurrence of this type of attack can be suspected by a human listener. In this paper, we focus on the audio-based Denial-of-Service (DoS) attack, which is unexplored in the literature. Contrary to previous work, we show that adversarial audio attacks in real-time and over-the-air are possible, while a user interacts with VCS. We show that the attacks are effective regardless of the user's command and interaction timings. In this paper, we present a first-of-its-kind imperceptible and always-on universal audio perturbation technique that enables such a DoS attack to be successful. We thoroughly evaluate the performance of the attacking scheme across (i) two learning tasks, (ii) two model architectures and (iii) three datasets. We demonstrate that the attack can introduce as high as 78% error rate in audio recognition tasks. ...

The city as a personal assistant

Conference paper (2019) - Utku Günay Acer, Marc Van Den Broeck, Fahim Kawsar

Conversational agents are increasingly becoming digital partners in our everyday computational experiences. Although rich and fresh in content, they are oblivious to users’ locality beyond geospatial weather and traffic conditions. We introduce conversational agents that are hyper-local, embedded deeply into the urban infrastructure providing rich, purposeful, detailed, and in some cases playful information relevant to a neighborhood. These agents are spatially constrained, and one can only interact with them once she is in close vicinity at street-level granularity. In other words, the city provides personal, stateful, spontaneous service to its citizens through the agents installed in urban landmarks. Drawing lessons from two user studies, we identify the requirements for this system. We then discuss the architecture of these agents that leverage covert communication channels and machine learning algorithms that run on the edge and wearable devices to offer meaningful conversational experience in urban settings. ...

Degradable inference for energy autonomous vision applications

Conference paper (2019) - Alessandro Montanari, Mohammed Alloulah, Fahim Kawsar

Mobile vision systems, often battery-powered, are now incredibly powerful in capturing, analyzing, and understanding real-world events uncovering interminable opportunities for new applications in the areas of life-logging, cognitive augmentation, security, safety, wildlife surveillance, etc. There are two complementary challenges in the design of a mobile vision system today - improving the recognition accuracy at the expense of minimum energy consumption. In this work, we posit that best-effort sensing with degradable featurization and an elastic inference pipeline offers an interesting avenue to bring energy autonomy to mobile vision systems while ensuring acceptable recognition performance. Borrowing principles from Intermittent Computing and Numerical Computing, we propose such best-effort sensing using a Degradable-Inference pipeline supported by a parameterized Discrete Cosine Transformation (DCT) based featurization and an Anytime Deep Neural Network. These two principles aim at extending the lifetime of a mobile vision system while minimizing compute and communication cost without compromising recognition performance. We report the design and early characterization of our proposed solution. ...

Automatic Smile and Frown Recognition with Kinetic Earables

Conference paper (2019) - Seungchul Lee, Chulhong Min, Alessandro Montanari, Akhil Mathur, Youngjae Chang, Junehwa Song, Fahim Kawsar

In this paper, we introduce inertial signals obtained from an earable placed in the ear canal as a new compelling sensing modality for recognising two key facial expressions: smile and frown. Borrowing principles from Facial Action Coding Systems, we first demonstrate that an inertial measurement unit of an earable can capture facial muscle deformation activated by a set of temporal microexpressions. Building on these observations, we then present three different learning schemes - shallow models with statistical features, hidden Markov model, and deep neural networks - to automatically recognise smile and frown expressions from inertial signals. The experimental results show that in controlled non-conversational settings, we can identify smile and frown with high accuracy (F1 score: 0.85). ...

An early characterisation of wearing variability on motion signals for wearables

Conference paper (2019) - Chulhong Min, Akhil Mathur, Alessandro Montanari, Fahim Kawsar

We explore a new variability observed in motion signals acquired from modern wearables. Wearing variability refers to the variations of the device orientation and placement across wearing events. We collect the accelerometer data on a smartwatch and an earbud and analyse how motion signals change due to the wearing variability. Our analysis shows that the wearing variability can bring an unexpected change to motion signals, not only from different users but also from different wearing sessions of the same user. We also provide empirical ranges of changes in device orientations. ...

Trusted and GDPR-Compliant Research with the Internet of Things

Conference paper (2018) - Jacky Bourgeois, Gerd Kortuem, Fahim Kawsar

The Internet of Things has become a key enabling technology for data-intensive research across universities and private organisations alike. However, the recent introduction of the General Data Protection Regulation (GDPR) in Europe has raised concerns that the GDPR might hamper data-intensive research. In this paper, we address the question of how to enable ethical and compliant research with personal IoT data in an academic environment. We identify three novel trust principles for GDPR-compliant use of personal IoT data in science and research (private-by-default, analytics transparency and accountable analytics) and propose an architecture for a trusted IoT research infrastructure. ...

Poster

Audio-Kinetic Model for Automatic Dietary Monitoring with Earable Devices

Poster (2018) - Chulhong Min, Akhil Mathur, Fahim Kawsar

Multimodal Deep Learning for Activity and Context Recognition

Conference paper (2018) - V. Radu, C. Tong, S. Bhattacharya, Nicholas D. Lane, C. Mascolo, M.K. Marina, Fahim Kawsar

Monitoring Daily Activities of Multiple Sclerosis Patients with Connected Health Devices

Conference paper (2018) - Sourav Bhattacharya, Alberto Gil C.P. Ramos, Fahim Kawsar, Nicholas D. Lane, Lynn M. Gionta, Joanne Manidis, Greg Silvesti, Mathieu Vegreville

We report results from a pilot study that focuses mainly on understanding the everyday life quality of patients suffering from multiple sclerosis through the lens of connected Nokia Health devices. Our dataset comprises 198 individuals (184 females and 14 males) and the study lasted over six months. By analyzing carefully crafted user-studies and correlating with personal sensor data collected with Nokia devices, we found that the level of fatigue is one of the main sources of discomfort across the majority of the patients. We further perform an exploratory analysis, which provides an early indication that by actively monitoring and perturbing users' daily activity levels, such as increasing daily step-counts, sleep duration and decreasing body weight, patients can potentially reduce their daily fatigue level. ...

An Early Resource Characterisation of Wi-Fi Sensing on Residential Gateways

Conference paper (2018) - Chulhong Min, Mohammed Alloulah, Fahim Kawsar

Recent research has successfully shown brand new models with Wi-Fi signals explaining space dynamics, assessing social environments, and even tracking people's posture, gesture and emotion. However, these models are seldom used in real execution and operating environments, i.e., on residential gateways with networking tasks. In this paper, we present the first, albeit preliminary, measurement study of common Wi-Fi sensing models on a residential gateway. This investigation aims to understand the performance characteristics, resource requirements, and execution bottlenecks for Wi-Fi sensing when being used in parallel with communication tasks. Based on our findings, we propose two optimisation techniques - i) dynamic sampling and ii) dynamic planning of inference execution - for optimum Wi-Fi sensing performance without compromising the quality of communication service. The results and insights lay an empirical foundation for the development of optimisation methods and execution environments that enable sensing models to be more readily integrated into next-generation residential gateways. ...

Beyond Testbeds

Real-World IoT Deployments

Journal article (2018) - Florian Michahelles, Fahim Kawsar, Simon Mayer, Luca Mottola

For a long time, the Internet of Things initiative was driven by academics developing embedded hardware, sensing algorithms, network protocols, software frameworks, applications, business scenarios, and interaction paradigms. Only recently have industrial stakeholders realized the unparalleled potential of these technologies, instigating a paradigm shift that we refer to as the fourth industrial revolution. However, as this shift starts entering the mainstream and disrupting a multitude of business dynamics, it has also uncovered a myriad of challenges: some are technical, some are political, and some are ethical. In this special issue, together with our guest authors, we focus our attention on some of the daunting system development challenges in bringing bleeding-edge Internet of Things (IoT) technologies to the real world. Each of the five articles featured in this issue tackles unique challenges associated with the deployment of real-world IoT systems. ...

Demo

ESense - Earable platform for human sensing

Conference paper (2018) - Fahim Kawsar, Chulhong Min, Akhil Mathur, Marc Van den Broeck, Utku Gunay Acer, Claudio Forlivesi

Mindful interruptions

A lightweight system for managing interruptibility on wearables

Conference paper (2018) - Claudio Forlivesi, Utku Günay Acer, Marc Van Den Broeck, Fahim Kawsar

We present the design, development, and evaluation of a personalised, privacy-aware and multi-modal wearable-only system to model interruptibility. Our system runs as a background service of a wearable OS and operates on two key techniques: i) online learning to recognise interruptible situations at a personal scale and ii) runtime inference of opportune moments for an interruption. The former is realised by a set of fast and efficient algorithms to automatically discover and learn interruptible situations as a function of meaningful places, and physical and conversational activities with active user engagement. The latter is substantiated with a multi-phased context sensing mechanism to identify moments which are then utilised to deliver notifications and interactive contents at the right moment. Early experimental evaluation of our system shows a sharp 46% increase in the response rate of notifications in wearable settings at the expense of negligible 6.3% resource cost. ...

Cross-modal approach for conversational well-being monitoring with multi-sensory earables

Conference paper (2018) - Chulhong Min, Alessandro Montanari, Akhil Mathur, Seungchul Lee, Fahim Kawsar

We propose a cross-modal approach for conversational well-being monitoring with a multi-sensory earable. It consists of motion, audio, and BLE models on earables. Using the IMU sensor, the microphone, and BLE scanning, the models detect speaking activities, stress and emotion, and participants in the conversation, respectively. We discuss the feasibility in qualifying conversations with our purpose-built cross-modal model in an energy-efficient and privacy-preserving way. With the cross-modal model, we develop a mobile application that qualifies on-going conversations and provides personalised feedback on social well-being. ...