Towards On-Device Semantic Search using LLMs

Master Thesis (2024)
Author(s)

X. Chen (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.A. Pouwelse – Mentor (TU Delft - Data-Intensive Systems)

Q.A. Stokkink – Graduation committee member (TU Delft - Data-Intensive Systems)

Q. Wang – Coach (TU Delft - Embedded Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
04-07-2024
Awarding Institution
Delft University of Technology
Programme
Computer Science | Artificial Intelligence
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Traditional search engines rely on centralized databases and powerful servers to process and retrieve information. Developing alternatives to key-value search engine databases in distributed computing environments is a significant challenge, particularly when computational resources are limited. This study explores the use of large language models (LLMs) to address this problem. We focus on environments with constrained computing power, such as mobile devices, to investigate the feasibility of using LLMs as a localized search solution. Through experiments with the state-of-the-art LLMs BERT and T5, we demonstrate their ability to memorize and retrieve unstructured data, specifically YouTube video IDs, based on partial information derived from video titles or tags. Our results show that the explored models can achieve 100% precision and recall when retrieving 48,266 video IDs. The findings suggest that LLMs have the potential to function effectively as a search engine database, offering semantic search capabilities while operating within the constraints of limited computational resources.
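
As an illustration of how precision and recall could be measured when each query should return exactly one video ID, here is a minimal Python sketch. It is not taken from the thesis: the function name `precision_recall`, the convention that `None` marks a query the model did not answer, and the toy queries and IDs are all assumptions made for this example.

```python
def precision_recall(predictions, ground_truth):
    """Precision and recall for retrieval with one correct ID per query.

    predictions:  dict mapping query -> predicted ID, or None if unanswered
                  (the None convention is an assumption of this sketch)
    ground_truth: dict mapping query -> correct ID
    """
    # Queries for which the model returned some ID
    answered = [q for q in ground_truth if predictions.get(q) is not None]
    # Answered queries where the returned ID is the correct one
    correct = sum(1 for q in answered if predictions[q] == ground_truth[q])
    precision = correct / len(answered) if answered else 0.0
    recall = correct / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical toy data, not results from the thesis
ground_truth = {
    "cat video": "id_a",
    "piano tutorial": "id_b",
    "cooking pasta": "id_c",
    "evening news": "id_d",
}
predictions = {
    "cat video": "id_a",        # correct
    "piano tutorial": "id_x",   # wrong ID returned
    "cooking pasta": "id_c",    # correct
    "evening news": None,       # no answer
}

precision, recall = precision_recall(predictions, ground_truth)
```

Under these definitions, a model reaches 100% on both measures only if it answers every query and every returned ID matches the stored one.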

Files

XueyuanChenThesis202406.pdf
(pdf | 0.414 Mb)
License info not available