Database is All You Need
Serving LLMs with Relational Queries
Wenbo Sun (TU Delft - Web Information Systems)
Ziyu Li (TU Delft - Team Arjan Mol)
Vaishnav Srinidhi (Student TU Delft)
Rihan Hai (TU Delft - Web Information Systems)
Abstract
Large language models (LLMs) have become central to many applications, but their deployment often requires high-performance hardware, specialized libraries, and complex engineering, limiting accessibility for smaller organizations. Meanwhile, relational database management systems (RDBMS) are widely used for their portability, efficiency, and native support for large-scale data operations. This paper presents TranSQL, a toolkit that enables transformer-based LLM inference within an RDBMS. By translating neural operations into SQL queries and representing model weights as relational tables, TranSQL leverages database features such as dynamic disk-to-memory data management and caching to reduce the hardware and engineering demands of serving LLMs. Using the LLaMA3 8B model, we demonstrate TranSQL's ability to implement attention layers, a KV-cache, and end-to-end text generation through SQL queries. TranSQL offers a cost-effective, portable, and scalable approach to making advanced AI technologies more accessible.
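To make the core idea concrete, the following is a minimal sketch (not the actual TranSQL implementation) of how a neural operation can be expressed relationally: a weight matrix stored as a coordinate-format table and a matrix-vector product, the building block of a linear layer, computed with a single join-and-aggregate query. The schema and table names here are illustrative assumptions, shown with SQLite for portability.

```python
import sqlite3

# Hypothetical schema for illustration: W(row, col, val) holds a weight
# matrix in coordinate format; x(idx, val) holds an input vector.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE W (row INTEGER, col INTEGER, val REAL)")
cur.execute("CREATE TABLE x (idx INTEGER, val REAL)")

# W = [[1, 2], [3, 4]], x = [10, 20]
cur.executemany("INSERT INTO W VALUES (?, ?, ?)",
                [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0)])
cur.executemany("INSERT INTO x VALUES (?, ?)", [(0, 10.0), (1, 20.0)])

# One SQL query computes y = W @ x: join on the shared dimension,
# multiply element-wise, and SUM per output row.
y = cur.execute("""
    SELECT W.row, SUM(W.val * x.val) AS val
    FROM W JOIN x ON W.col = x.idx
    GROUP BY W.row
    ORDER BY W.row
""").fetchall()
print(y)  # -> [(0, 50.0), (1, 110.0)]
```

Stacking such queries (with nonlinearities and softmax expressed via SQL functions) is the general pattern the abstract describes; the database engine then handles buffering weight tables between disk and memory.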