Exploring the Generation and Detection of Weaknesses in LLM Generated Code

LLMs can not be trusted to produce secure code, but they can detect it

Bachelor Thesis (2024)
Author(s)

I. Vasiliauskas (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Ali Al-Kaswan – Mentor (TU Delft - Software Engineering)

A. van Deursen – Graduation committee member (TU Delft - Software Engineering)

Maliheh Izadi – Graduation committee member (TU Delft - Software Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
28-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Large Language Models (LLMs) have become popular for code generation in recent years, and developers might use LLM-generated code in projects where software security matters. A relevant question is therefore: how prevalent are code weaknesses in LLM-generated code, and can LLMs be used to detect them? In this research, we generate prompts based on a taxonomy of code weaknesses and run them on multiple LLMs with varying properties. We evaluate the responses for insecurities both manually and with the LLMs themselves. We conclude that even when LLMs are not provoked and are given benign, realistic requests, they often generate code containing known software weaknesses. We find a correlation between model parameter size and the percentage of secure answers. However, the LLMs are exceptionally successful at recognizing these insecurities themselves. Future work should cover a wider set of models and a larger set of prompts to gather more results on this subject.
