Large language models (LLMs) have become central to many applications, but their deployment often requires high-performance hardware, specialized libraries, and complex engineering, limiting accessibility for smaller organizations. Meanwhile, relational database systems (RDBMS) are widely used for their portability, efficiency, and native support for managing large-scale data operations. This paper presents TranSQL, a toolkit that enables transformer-based LLM inference within an RDBMS. By translating neural operations into SQL queries and representing model weights as relational tables, TranSQL leverages database features such as dynamic disk-to-memory data management and caching to reduce the hardware and engineering demands of serving LLMs. Using the LLaMA3 8B model, we demonstrate TranSQL's ability to implement attention layers, a KV-cache, and end-to-end text generation through SQL queries. TranSQL offers a cost-effective, portable, and scalable approach to making advanced AI technologies more accessible.
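The core idea, representing weights as relational tables and expressing neural operations as SQL, can be illustrated with a toy matrix-vector product. The schemas below are a minimal sketch for illustration, not TranSQL's actual table layout:

```python
import sqlite3

# Illustrative sketch: a matrix-vector product y = W @ x written as a
# SQL join-aggregate over relational tables (schemas are assumptions).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE weight (row INTEGER, col INTEGER, val REAL)")
cur.execute("CREATE TABLE input  (idx INTEGER, val REAL)")

# Toy 2x3 weight matrix W and a 3-vector x, stored in coordinate form.
W = [(0, 0, 1.0), (0, 1, 2.0), (0, 2, 3.0),
     (1, 0, 4.0), (1, 1, 5.0), (1, 2, 6.0)]
x = [(0, 1.0), (1, 1.0), (2, 1.0)]
cur.executemany("INSERT INTO weight VALUES (?, ?, ?)", W)
cur.executemany("INSERT INTO input  VALUES (?, ?)", x)

# Each output element is a sum of products over the shared inner index:
# exactly a join on col = idx followed by a grouped aggregation.
cur.execute("""
    SELECT w.row, SUM(w.val * i.val) AS y
    FROM weight w JOIN input i ON w.col = i.idx
    GROUP BY w.row ORDER BY w.row
""")
result = cur.fetchall()
print(result)  # [(0, 6.0), (1, 15.0)]
```

Because the weight tables live on disk and are paged in by the database engine on demand, the same pattern lets inference proceed even when the full model does not fit in memory.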