Leveraging Large Foundation Models for Zero-Shot IoT Sensing

Master Thesis (2024)

Author(s)

D. XUE (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

K.G. Langendoen – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Q. Song – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Z. Yue – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Large Language Models, Internet of Things, Multimodal Learning

To reference this document use

https://resolver.tudelft.nl/uuid:1e045806-7675-48c8-898d-3f967d98ea1d

Publication Year

2024

Language

English

Graduation Date

28-06-2024

Awarding Institution

Delft University of Technology

Programme

Electrical Engineering, Embedded Systems

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Deep learning models are now widely deployed on edge Internet of Things (IoT) devices. However, most of these models are trained in a supervised manner and can only recognize the classes seen during training. Zero-shot learning (ZSL) is a popular approach for identifying unseen classes by leveraging semantic information about both seen and unseen classes. Foundation models (FMs) trained on web-scale data have shown impressive ZSL capability in natural language processing and visual understanding. However, leveraging the generalized knowledge of FMs for zero-shot IoT sensing with signals such as mmWave, IMU, and Wi-Fi has not been fully investigated. In this work, we align IoT data embeddings with the semantic embeddings generated by an FM's text encoder for zero-shot IoT sensing. To exploit the physics principles governing the generation of IoT sensor signals and derive more effective prompts for semantic embedding extraction, we propose using cross-attention, a multi-source information fusion strategy, to combine a hard prompt generated by large language models (LLMs) with a soft prompt consisting of learnable vectors. To address the bias of IoT embeddings toward seen classes, which arises from the lack of unseen-class data during training, we propose using data augmentation to synthesize unseen-class IoT data for fine-tuning the IoT feature extractor and embedding projector. We evaluate our approach on multiple IoT sensing tasks. Experimental results show that, compared with multiple baselines on three datasets, our approach achieves an average improvement of 1.0% in open-set detection and 9.5% in generalized zero-shot learning.
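The alignment idea in the abstract can be illustrated with a minimal sketch: once an IoT embedding and the class text embeddings live in the same space, a sample may be assigned to the class whose text embedding is most similar to it. The snippet below is not the thesis implementation. The function names, the use of cosine similarity as the matching score, the class labels, and the three-dimensional toy vectors are all assumptions made for illustration; in practice the vectors would come from a trained IoT feature extractor and an FM text encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_predict(iot_embedding, class_text_embeddings):
    """Pick the class whose text embedding is closest to the IoT embedding.

    Hypothetical helper: the label set may include classes for which no
    IoT training data was available (unseen classes).
    """
    scores = {
        label: cosine(iot_embedding, text_emb)
        for label, text_emb in class_text_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores

# Toy stand-ins for text-encoder outputs (assumed values, not real data).
class_text_embeddings = {
    "walking": [0.9, 0.1, 0.0],   # seen class
    "sitting": [0.1, 0.9, 0.0],   # seen class
    "falling": [0.0, 0.2, 0.9],   # treated as an unseen class here
}

# Toy stand-in for a projected IoT embedding of one sensor sample.
iot_embedding = [0.1, 0.3, 0.8]

label, scores = zero_shot_predict(iot_embedding, class_text_embeddings)
print(label)
```

Because the unseen class is represented only by its text embedding, no IoT samples of that class are needed at prediction time; this is also why a projector trained only on seen classes can drift toward them, the bias the abstract proposes to counter with synthesized unseen-class data.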

Files

DinghaoXue_Master_Thesis.pdf

(pdf | 7.39 MB)

- Embargo expired on 28-12-2024

License info not available