F. Kawsar
Wearable sensors are increasingly becoming the primary interface for monitoring human activities. However, in order to scale human activity recognition (HAR) using wearable sensors to millions of users and devices, it is imperative that HAR computational models are robust against real-world heterogeneity in inertial sensor data. In this paper, we study the problem of wearing diversity, which pertains to the placement of the wearable sensor on the human body, and demonstrate that even state-of-the-art deep learning models are not robust against these factors. The core contribution of the paper lies in presenting a first-of-its-kind in-depth study of unsupervised domain adaptation (UDA) algorithms in the context of wearing diversity - we develop and evaluate three adaptation techniques on four HAR datasets to assess their relative performance in addressing the issue of wearing diversity. More importantly, we also carefully analyse the downsides of each UDA algorithm and uncover several implicit data-related assumptions without which these algorithms suffer a major degradation in accuracy. Taken together, our experimental findings caution against using UDA as a silver bullet for adapting HAR models to new domains, serve as practical guidelines for HAR practitioners, and pave the way for future research on domain adaptation in HAR.
ePerceptive
Energy reactive embedded intelligence for batteryless sensors
We have long studied tiny energy harvesters to liberate sensors from batteries. With remarkable progress in embedded deep learning, we are now re-imagining these sensors as intelligent compute nodes. Naturally, we are approaching a crossroads where sensor intelligence meets energy autonomy, enabling maintenance-free swarm intelligence and unleashing a plethora of applications ranging from precision agriculture to ubiquitous asset tracking to infrastructure monitoring. One of the critical challenges, however, is to adapt intelligence fidelity in response to available energy to maximise the overall system availability. To this end, we present the design and implementation of ePerceptive: a novel framework for best-effort embedded intelligence, i.e., inference fidelity varies in proportion to the instantaneous energy supplied. ePerceptive operates on two core principles. First, it enables training a single deep neural network (DNN) to operate on multiple input resolutions without compromising accuracy or incurring memory overhead. Second, it modifies a DNN architecture by injecting multiple exits to guarantee valid, albeit lower-fidelity, inferences in the event of energy interruption. The combination of these techniques offers a smooth adaptation between inference latency and recognition accuracy while matching the computational load to the available power budget. We report the manifestation of ePerceptive in designing batteryless cameras and microphones built with TI MSP430 MCU and off-the-shelf RF and solar energy harvesters. Our evaluation of these batteryless sensors with multiple vision and acoustic workloads suggests that the dynamic adaptation of ePerceptive can increase the inference throughput by up to 80% compared to a static baseline while ensuring a maximum accuracy drop of less than 6%.
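The multi-exit idea in the abstract above can be illustrated with a toy sketch. It is not the ePerceptive implementation: the function `run_with_exits`, the per-stage energy costs, and the stand-in layer and exit functions are all hypothetical, and the multi-resolution training described in the abstract is not shown. The sketch only demonstrates the principle that a network with several exits returns the result of the deepest exit it could afford before energy ran out.

```python
def run_with_exits(stages, x, energy_budget):
    """Run stages in order while energy remains.

    stages: list of (cost, layer_fn, exit_fn) tuples, where cost is a
    hypothetical energy cost covering both the layer and its exit head.
    Returns the prediction of the deepest exit reached, or None if even
    the first stage did not fit into the budget.
    """
    prediction = None
    remaining = energy_budget
    for cost, layer_fn, exit_fn in stages:
        if cost > remaining:
            break  # energy interruption: keep the last valid prediction
        x = layer_fn(x)
        prediction = exit_fn(x)
        remaining -= cost
    return prediction


# Toy stand-ins for network stages (assumptions for illustration only).
stages = [
    (1.0, lambda x: x + 1, lambda x: ("coarse", x)),
    (2.0, lambda x: x * 2, lambda x: ("medium", x)),
    (4.0, lambda x: x - 3, lambda x: ("fine", x)),
]

low = run_with_exits(stages, 1, 3.5)   # stops at the second exit
full = run_with_exits(stages, 1, 7.0)  # reaches the final exit
```

With a budget of 3.5 units the third stage does not fit, so the call falls back to the second exit; with 7.0 units all three stages run.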
The increasing availability of multiple sensory devices on or near a human body has opened brand new opportunities to leverage redundant sensory signals for powerful sensing applications. For instance, personal-scale sensory inferences with motion and audio signals can be done individually on a smartphone, a smartwatch, and even an earbud - each offering unique sensor quality, model accuracy, and runtime behaviour. At execution time, however, it is incredibly challenging to assess these characteristics to select the best device for accurate and resource-efficient inferences. To this end, we look at a quality-aware collaborative sensing system that actively interplays across multiple devices and respective sensing models. It dynamically selects the best device as a function of model accuracy in any given context. We propose two complementary techniques for the runtime quality assessment. Borrowing principles from active learning, our first technique runs on three heuristic-based quality assessment functions that employ confidence, margin sampling, and entropy of models' output. Our second technique is built with a siamese neural network and acts on the premise that runtime sensing quality can be learned from historical data. Our evaluation across multiple motion and audio datasets shows that our techniques provide a 12% increase in overall accuracy through dynamic device selection at the average expense of 13 mW power on each device as compared to traditional single-device approaches.
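The three heuristic quality functions named in the abstract above (confidence, margin sampling, and entropy) are standard active-learning scores and can be sketched as follows. This is a minimal illustration, not the authors' system: the function names, the `select_device` helper, and the example class probabilities are assumptions, and each score is arranged so that a higher value means a more trustworthy output.

```python
import math


def confidence(probs):
    """Probability of the most likely class."""
    return max(probs)


def margin(probs):
    """Gap between the two most likely classes (needs >= 2 classes)."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def neg_entropy(probs):
    """Negative Shannon entropy: closer to 0 means a more peaked output."""
    return sum(p * math.log(p) for p in probs if p > 0)


def select_device(outputs, score=confidence):
    """Pick the device whose model output scores highest."""
    return max(outputs, key=lambda name: score(outputs[name]))


# Hypothetical softmax outputs for the same time window on three devices.
outputs = {
    "smartphone": [0.50, 0.30, 0.20],
    "smartwatch": [0.70, 0.20, 0.10],
    "earbud": [0.40, 0.35, 0.25],
}

best = select_device(outputs, score=margin)
```

In this toy input all three scores agree on the same device; on real data they can disagree, which is one reason to evaluate them separately.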
Conversational agents are increasingly becoming digital partners in our everyday computational experiences. Although rich and fresh in content, they are oblivious to users’ locality beyond geospatial weather and traffic conditions. We introduce conversational agents that are hyper-local, embedded deeply into the urban infrastructure, providing rich, purposeful, detailed, and in some cases playful information relevant to a neighborhood. These agents are spatially constrained, and one can only interact with them once she is in close vicinity at street-level granularity. In other words, the city provides personal, stateful, spontaneous service to its citizens through the agents installed in urban landmarks. Drawing lessons from two user studies, we identify the requirements for this system. We then discuss the architecture of these agents that leverage covert communication channels and machine learning algorithms that run on the edge and wearable devices to offer meaningful conversational experience in urban settings.
We explore a new variability observed in motion signals acquired from modern wearables. Wearing variability refers to the variations of the device orientation and placement across wearing events. We collect the accelerometer data on a smartwatch and an earbud and analyse how motion signals change due to the wearing variability. Our analysis shows that the wearing variability can bring an unexpected change to motion signals, not only from different users but also from different wearing sessions of the same user. We also provide empirical ranges of changes in device orientations.
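One simple way to quantify an orientation change between two wearing sessions is the angle between the mean accelerometer vectors of each session. The sketch below is an illustration under stated assumptions, not the analysis used in the paper: it assumes the wearer is roughly still so that the mean acceleration approximates gravity, and the helper names and sample values are hypothetical.

```python
import math


def mean_vector(samples):
    """Component-wise mean of a list of (x, y, z) accelerometer samples."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


def angle_deg(u, v):
    """Angle in degrees between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos))


# Hypothetical samples (m/s^2) from two wearing sessions of one device.
session_a = [(0.0, 0.0, 9.8), (0.0, 0.2, 9.8), (0.0, -0.2, 9.8)]
session_b = [(9.8, 0.0, 0.0), (9.8, 0.1, 0.0), (9.8, -0.1, 0.0)]

shift = angle_deg(mean_vector(session_a), mean_vector(session_b))
```

Here the device's sensed gravity direction moves from the z axis to the x axis, so the estimated orientation shift between the two sessions is about a quarter turn.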
Conversational agents are increasingly becoming digital partners of our everyday computing experiences, offering a variety of purposeful information and utility services. Although rich in competency, these agents are entirely oblivious to their users' situational and emotional context today and incapable of adjusting their interaction style and tone contextually. To this end, we present a mixed-method study that informs the design of a situation- and emotion-aware conversational agent for kinetic earables. We surveyed 280 users and qualitatively interviewed 12 users to understand their expectations of a conversational agent in adapting the interaction style. Grounded in our findings, we develop a first-of-its-kind emotion regulator for a conversational agent on a kinetic earable that dynamically adjusts its conversation style, tone, and volume in response to users' emotional, environmental, social and activity context gathered through speech prosody, motion signals and ambient sound. We describe these context models, the end-to-end system including a purpose-built kinetic earable, and their real-world assessment. The experimental results demonstrate that our regulation mechanism invariably elicits better and affective user experience in comparison to baseline conditions in different real-world settings.
AudiDoS
Real-time denial-of-service adversarial attacks on deep audio models
Deep learning has enabled personal and IoT devices to rethink microphones as a multi-purpose sensor for understanding conversation and the surrounding environment. This resulted in a proliferation of Voice Controllable Systems (VCS) around us. The increasing popularity of such systems also tends to attract miscreants, who often want to take advantage of the VCS without the knowledge of the user. Consequently, understanding the robustness of VCS, especially under adversarial attacks, has become an important research topic. Although there exists some previous work on audio adversarial attacks, their scopes are limited to embedding the attacks onto pre-recorded music clips, which when played through speakers cause VCS to misbehave. As an attack audio needs to be played, the occurrence of this type of attack can be suspected by a human listener. In this paper, we focus on audio-based Denial-of-Service (DoS) attacks, which are unexplored in the literature. Contrary to previous work, we show that adversarial audio attacks in real-time and over-the-air are possible while a user interacts with VCS. We show that the attacks are effective regardless of the user's command and interaction timings. In this paper, we present a first-of-its-kind imperceptible and always-on universal audio perturbation technique that enables such DoS attacks to be successful. We thoroughly evaluate the performance of the attacking scheme across (i) two learning tasks, (ii) two model architectures and (iii) three datasets. We demonstrate that the attack can introduce as high as 78% error rate in audio recognition tasks.
In this paper, we introduce inertial signals obtained from an earable placed in the ear canal as a new compelling sensing modality for recognising two key facial expressions: smile and frown. Borrowing principles from Facial Action Coding Systems, we first demonstrate that an inertial measurement unit of an earable can capture facial muscle deformation activated by a set of temporal microexpressions. Building on these observations, we then present three different learning schemes - shallow models with statistical features, hidden Markov models, and deep neural networks - to automatically recognise smile and frown expressions from inertial signals. The experimental results show that in controlled non-conversational settings, we can identify smile and frown with high accuracy (F1 score: 0.85).
Poster
On-Wearable AI to model human interruptibility
The Internet of Things has become a key enabling technology for data-intensive research across universities and private organisations alike. However, the recent introduction of the General Data Protection Regulation (GDPR) in Europe has raised concerns that the GDPR might hamper data-intensive research. In this paper, we address the question of how to enable ethical and compliant research with personal IoT data in an academic environment. We identify three novel trust principles for GDPR-compliant use of personal IoT data in science and research (private-by-default, analytics transparency and accountable analytics) and propose an architecture for a trusted IoT research infrastructure.
Demo abstract
eSense - Open Earable Platform for Human Sensing
We present eSense - an open and multi-sensory in-ear wearable platform for personal-scale behaviour analytics. eSense is a true wireless stereo (TWS) earbud and supports dual-mode Bluetooth and Bluetooth Low Energy. It is also augmented with a 6-axis inertial measurement unit and a microphone. We demonstrate the eSense platform, the data exploration tool with the open APIs for the real-time visualisation of multi-modal sensory data, and its manifestation in a 360° workplace well-being application.
Beyond Testbeds
Real-World IoT Deployments
In this paper, we explore audio and kinetic sensing on earable devices with the commercial off-the-shelf form factor. For the study, we prototyped earbud devices with a 6-axis inertial measurement unit and a microphone. We systematically investigate the differential characteristics of the audio and inertial signals to assess their feasibility in human activity recognition. Our results demonstrate that earable devices have a superior signal-to-noise ratio under the influence of motion artefacts and are less susceptible to acoustic environment noise. We then present a set of activity primitives and corresponding signal processing pipelines to showcase the capabilities of earbud devices in converting accelerometer, gyroscope, and audio signals into the targeted human activities with a mean accuracy reaching up to 88% in varying environmental conditions.
Poster
Audio-Kinetic Model for Automatic Dietary Monitoring with Earable Devices
Demo
eSense - Open Earable Platform for Human Sensing
We present eSense - an open and multi-sensory in-ear wearable platform to detect and monitor human activities. eSense is a true wireless stereo (TWS) earbud with dual-mode Bluetooth and Bluetooth Low Energy, augmented with a 6-axis inertial measurement unit and a microphone. We showcase the eSense platform, its data APIs to capture real-time multi-modal sensory data in a data exploration tool, and its manifestation in a 360° workplace well-being application.
Mindful interruptions
A lightweight system for managing interruptibility on wearables