Transformer Modules

Transferable & Parameter-Efficient LLM Fine-Tuning


Abstract

With the growing popularity of Large Language Models (LLMs), fine-tuning them has become increasingly computationally expensive. Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA and Adapters, introduced by Microsoft and Google respectively, aim to reduce the number of trainable parameters, and the current state-of-the-art combines both methods as LoRA Adapters. This paper introduces Transformer Modules as a PEFT method. These modules consist of Modular Transformer Blocks (MTBs) inserted into a frozen pre-trained model, achieving competitive performance while significantly reducing computation costs. Compared to the current state-of-the-art on GPT-2, BERT, and T5, Transformer Modules further reduced compute time by 39.7% and training memory by 72.7%, at a performance cost of 4.5±2.51% on the GLUE benchmark. Additionally, the paper presents the Transformer Bridge, a continuous vector transformer designed to transfer Transformer Modules across different models. This could enable cross-model fine-tuning, allowing model-agnostic modules, such as an ethics or medical module, to be used across various LLMs without retraining or access to the original dataset. Although the current implementation of the Transformer Bridge did not fully succeed in mapping embedding spaces, analysis of the results suggests that further refinement using traditional model distillation techniques could lead to success in future iterations.
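
The abstract gives no implementation details, but the core mechanism, a small trainable transformer block inserted into an otherwise frozen pre-trained model, can be illustrated with a short PyTorch sketch. Everything below (the block design, the residual connection, the hook-based insertion, the layer index, and the hyperparameters) is an assumption made for illustration, not the paper's actual code.

# Minimal sketch of the Transformer-Module idea: a small trainable
# transformer block inserted between the frozen layers of a pre-trained
# model. Names, dimensions, and insertion point are illustrative only.
import torch
import torch.nn as nn
from transformers import GPT2Model

class ModularTransformerBlock(nn.Module):
    """Hypothetical MTB: one small transformer encoder layer with a residual."""
    def __init__(self, hidden_size: int, num_heads: int = 4):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(
            d_model=hidden_size, nhead=num_heads,
            dim_feedforward=2 * hidden_size, batch_first=True)

    def forward(self, hidden_states):
        # Residual connection keeps the frozen model's representation
        # intact when the module's contribution is small.
        return hidden_states + self.block(hidden_states)

model = GPT2Model.from_pretrained("gpt2")
for param in model.parameters():       # freeze every pre-trained weight
    param.requires_grad = False

mtb = ModularTransformerBlock(model.config.hidden_size)

# Insert the MTB after one frozen layer via a forward hook
# (assumed insertion mechanism; layer index 6 is arbitrary).
def insert_mtb(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    new_hidden = mtb(hidden)
    if isinstance(output, tuple):
        return (new_hidden,) + output[1:]
    return new_hidden

model.h[6].register_forward_hook(insert_mtb)

# Only the MTB's parameters are trainable.
optimizer = torch.optim.AdamW(mtb.parameters(), lr=1e-4)

Because only the MTB receives gradients, gradient buffers and optimizer state scale with the module rather than with the full model, which is where PEFT methods obtain their memory savings.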

Files

Transformer_Modules_Jahson_ODw... (.pdf)

File under embargo until 30-06-2025