Towards Safe, Secure, and Usable LLMs4Code

Conference Paper (2024)
Author(s)
A. Al-Kaswan (TU Delft - Software Engineering)

Research Group
Software Engineering
DOI
https://doi.org/10.1145/3639478.3639803
Publication Year
2024
Language
English
Pages
258-260
ISBN (electronic)
9798400705021

Abstract

Large Language Models (LLMs) are gaining popularity in the field of Natural Language Processing (NLP) due to their remarkable accuracy on a variety of NLP tasks. LLMs designed for coding are trained on massive datasets, which enables them to learn the structure and syntax of programming languages. These datasets are scraped from the web, and LLMs memorise the information they contain. LLMs for code are also growing in size, which makes them more challenging to run locally and leaves users increasingly reliant on external infrastructure. We aim to explore the challenges faced by LLMs for code and propose techniques to measure and prevent memorisation. Additionally, we suggest methods to compress models so that they can run locally on consumer hardware.
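
To make the idea of measuring memorisation concrete, the sketch below probes a code LLM with the prefix of a candidate training sample and checks whether greedy decoding reproduces the remainder verbatim. This is a minimal illustrative sketch, not the method proposed in the paper; the model name, prefix length, and code snippet are assumptions made for the example.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed example of a publicly available code LLM; any causal code model works.
MODEL = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def is_memorised(sample: str, prefix_tokens: int = 32) -> bool:
    # Split the sample into a prompt prefix and the ground-truth suffix.
    ids = tokenizer(sample, return_tensors="pt").input_ids[0]
    prefix, suffix = ids[:prefix_tokens], ids[prefix_tokens:]
    # Greedy decoding: ask for the model's single most likely continuation.
    out = model.generate(
        prefix.unsqueeze(0),
        max_new_tokens=len(suffix),
        do_sample=False,
    )
    continuation = out[0][prefix_tokens:]
    # The sample counts as memorised if the continuation matches verbatim.
    return torch.equal(continuation, suffix)

# Hypothetical snippet suspected of appearing in the training data.
snippet = (
    "def quicksort(arr):\n"
    "    if len(arr) <= 1:\n"
    "        return arr\n"
    "    pivot = arr[len(arr) // 2]\n"
    "    left = [x for x in arr if x < pivot]\n"
    "    right = [x for x in arr if x > pivot]\n"
    "    return quicksort(left) + [x for x in arr if x == pivot] + quicksort(right)\n"
)
print(is_memorised(snippet))

Greedy decoding is used here because verbatim reproduction under the model's most likely continuation is a common way to operationalise extractable memorisation.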