Using local LLMs in constrained environments for increasing mutation score

Bachelor Thesis (2024)

Author(s)

R.R.L. van der Geest (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Panichella – Mentor (TU Delft - Software Engineering)

Mitchell Olsthoorn – Mentor (TU Delft - Software Engineering)

C.B. Bach Poulsen – Graduation committee member (TU Delft - Programming Languages)

Faculty

Electrical Engineering, Mathematics and Computer Science

Large Language Models, Mutation Testing, Test Generation

To reference this document use:

https://resolver.tudelft.nl/uuid:fc580043-efd2-4c72-8a60-5d2c0e2a6e4f

Publication Year

2024

Language

English

Graduation Date

26-06-2024

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Mutation testing measures how effectively a test suite catches bugs in a given piece of code. Manually writing tests that achieve a high mutation score can be cumbersome and time-consuming. Automated tools can generate such tests, but their output is often very hard for humans to understand and is therefore rarely used as an actual test suite for software programs. Because large language models (LLMs) have been shown to generate programs that humans can understand more easily, we ask whether LLMs can be used to improve or generate tests for the purpose of mutation testing. Some LLMs run in the cloud, while others run locally. Cloud-based LLMs such as ChatGPT or Copilot do not require users to own hardware, but are not always an option because of privacy concerns, speed, or regulations. Local LLMs avoid these privacy concerns, but sometimes require large amounts of hardware. This paper focuses on local LLMs that can run in a computationally restricted environment. We present an automated approach that uses a local LLM to improve the mutation score of existing test suites. We compare three models (DeepSeek Coder, Code Llama and Codestral), evaluated on publicly available datasets. Using this approach, we generated unit tests that, combined with the existing manually written tests, increase the mutation score in roughly one third to one half of cases, depending on the model.
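
To make the notion of a mutation score concrete, the following is a minimal sketch, not the thesis's tooling. It assumes the usual definition of the score as the fraction of mutants that at least one test detects ("kills"). The function `is_adult`, the hand-written mutants, and the test cases are all hypothetical and chosen only for illustration. Real mutation testing tools generate mutants automatically.

```python
from typing import Callable, List, Tuple

Impl = Callable[[int], bool]
TestCase = Tuple[int, bool]  # (input, expected output)


def is_adult(age: int) -> bool:
    """Hypothetical code under test."""
    return age >= 18


# Hand-written mutants of is_adult, each a small change to the original.
MUTANTS: List[Impl] = [
    lambda age: age > 18,  # boundary mutant: >= replaced by >
    lambda age: age < 18,  # operator mutant: >= replaced by <
    lambda age: True,      # constant mutant: condition replaced by True
]


def kills(tests: List[TestCase], mutant: Impl) -> bool:
    """A mutant is killed if at least one test observes a wrong result."""
    return any(mutant(value) != expected for value, expected in tests)


def mutation_score(tests: List[TestCase], mutants: List[Impl]) -> float:
    """Fraction of mutants killed by the test suite."""
    killed = sum(1 for mutant in mutants if kills(tests, mutant))
    return killed / len(mutants)


# An existing suite that never exercises the boundary value 18.
existing_tests: List[TestCase] = [(30, True), (5, False)]

# One additional test, standing in for a newly generated test.
added_test: TestCase = (18, True)

before = mutation_score(existing_tests, MUTANTS)
after = mutation_score(existing_tests + [added_test], MUTANTS)
print(f"before: {before:.2f}, after: {after:.2f}")
```

In this sketch the existing suite leaves the boundary mutant alive, and the single added test kills it. This mirrors the setting described in the abstract, where generated tests are combined with existing manually written tests to raise the score.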

Files

R._van_der_Geest_2024_-_Using_... (pdf)

(pdf | 1.02 Mb)

License info not available