FI

F. Ignijic

1 records found

Evaluating Adaptive Activation Functions in Language Models

Does choice of activation function matter in smaller Langaunge Models?

The rapid expansion of large language models (LLMs) driven by the transformer architecture has raised concerns about the lack of high-quality train ing data. This study investigates the role of acti vation functions in smaller-scale language models, specifically those with app ...