Title: Transformer Modules: Transferable & Parameter Efficient LLM Fine Tuning
Author: O'Dwyer Wha Binda, Jahson (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Interactive Intelligence)
Contributors: Murukannaiah, P.K. (mentor); Liscio, E. (mentor); van Gemert, J.C. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science
Date: 2024-06-05
Abstract: With the increasing popularity of Large Language Models (LLMs), fine-tuning them has become increasingly computationally expensive. Parameter Efficient Fine-Tuning (PEFT) methods such as LoRA and Adapters, introduced by Microsoft and Google respectively, aim to reduce the number of trainable parameters, with the current state of the art combining both methods as LoRA Adapters. This paper introduces Transformer Modules as a PEFT method. These modules insert Modular Transformer Blocks (MTBs) into a frozen pre-trained model, achieving competitive performance while significantly reducing computation costs. Compared to the current state of the art on GPT-2, BERT, and T5, Transformer Modules further reduced compute time by 39.7% and training memory by 72.7%, at a performance cost of 4.5 ± 2.51% on the GLUE benchmark. Additionally, the paper presents the Transformer Bridge, a continuous vector transformer designed to transfer Transformer Modules across different models. This could enable cross-model fine-tuning, allowing model-agnostic modules, such as an ethics or medical module, to be used across various LLMs without retraining or access to the original dataset. Although the current implementation of the Transformer Bridge did not fully succeed in mapping embedding spaces, analysis of the results suggests that further refinements using traditional model distillation techniques could lead to success in future iterations.
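The parameter-efficiency claim above rests on training only small inserted blocks while the base model stays frozen. The back-of-the-envelope sketch below (not taken from the thesis; all layer sizes are illustrative assumptions loosely resembling GPT-2 small) shows why inserting one narrow transformer block yields a small trainable fraction:

```python
# Illustrative parameter-count sketch (assumption: sizes resemble GPT-2 small;
# these are NOT the thesis's actual configurations). It compares full
# fine-tuning against a PEFT-style setup where the base model is frozen and
# only one small inserted block is trained.

def transformer_block_params(d_model: int, d_ff: int) -> int:
    """Rough parameter count of one transformer block:
    4 attention projections of d_model x d_model, plus a two-layer
    feed-forward of d_model x d_ff; biases and layer norms are ignored."""
    return 4 * d_model ** 2 + 2 * d_model * d_ff

# Frozen base model: 12 blocks at GPT-2-small-like sizes (assumption).
base = 12 * transformer_block_params(d_model=768, d_ff=3072)

# One small inserted trainable block with a narrower hidden size (assumption).
module = transformer_block_params(d_model=768, d_ff=768)

print(f"base (frozen):      {base:,}")        # → 84,934,656
print(f"module (trainable): {module:,}")      # → 3,538,944
print(f"trainable fraction: {module / (base + module):.1%}")  # → 4.0%
```

Even a full-width inserted block trains only a few percent of the total parameters, which is the mechanism behind the reduced training memory reported in the abstract; the exact savings depend on the real block dimensions and insertion points used in the thesis.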
Subject: Large Language Models (LLMs); Natural Language Processing (NLP); Transformer Neural Network; Parameter Efficient Fine Tuning (PEFT)
To reference this document use: http://resolver.tudelft.nl/uuid:516b2d8d-3d74-4dc5-bf69-cbd2b230aff3
Embargo date: 2025-06-30
Part of collection: Student theses
Document type: master thesis
Rights: © 2024 Jahson O'Dwyer Wha Binda