Towards Safe, Secure, and Usable LLMs4Code

Conference Paper (2024)
Author(s)
A. Al-Kaswan (TU Delft - Software Engineering)

Research Group
Software Engineering
DOI
https://doi.org/10.1145/3639478.3639803
Publication Year
2024
Language
English
Pages
258-260
ISBN (electronic)
9798400705021

Abstract

Large Language Models (LLMs) are gaining popularity in the field of Natural Language Processing (NLP) due to their remarkable accuracy on a variety of NLP tasks. LLMs designed for coding are trained on massive datasets, which enables them to learn the structure and syntax of programming languages. These datasets are scraped from the web, and LLMs memorise the information they contain. LLMs for code are also growing in size, which makes them more challenging to run locally and leaves users increasingly reliant on external infrastructure. We aim to explore the challenges faced by LLMs for code and propose techniques to measure and prevent memorisation. Additionally, we suggest methods to compress models so that they can run locally on consumer hardware.
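
To make the idea of measuring memorisation concrete, the sketch below probes a code LLM with the prefix of a candidate training sample and checks whether greedy decoding reproduces the remainder verbatim. This is a minimal illustrative sketch, not the method proposed in the paper; the model name, prefix length, and code snippet are assumptions made for the example.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed example of a publicly available code LLM; any causal code model works.
MODEL = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def is_memorised(sample: str, prefix_tokens: int = 32) -> bool:
    # Split the sample into a prompt prefix and the ground-truth suffix.
    ids = tokenizer(sample, return_tensors="pt").input_ids[0]
    prefix, suffix = ids[:prefix_tokens], ids[prefix_tokens:]
    # Greedy decoding: ask for the model's single most likely continuation.
    out = model.generate(
        prefix.unsqueeze(0),
        max_new_tokens=len(suffix),
        do_sample=False,
    )
    continuation = out[0][prefix_tokens:]
    # The sample counts as memorised if the continuation matches verbatim.
    return torch.equal(continuation, suffix)

# Hypothetical snippet suspected of appearing in the training data.
snippet = (
    "def quicksort(arr):\n"
    "    if len(arr) <= 1:\n"
    "        return arr\n"
    "    pivot = arr[len(arr) // 2]\n"
    "    left = [x for x in arr if x < pivot]\n"
    "    right = [x for x in arr if x > pivot]\n"
    "    return quicksort(left) + [x for x in arr if x == pivot] + quicksort(right)\n"
)
print(is_memorised(snippet))

Greedy decoding is used here because verbatim reproduction under the model's most likely continuation is a common way to operationalise extractable memorisation.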