Exploring Test Suite Coverage of Large Language Model–Enhanced Unit Test Generation

A Study on the Ability of Large Language Models to Improve the Understandability of Generated Unit Tests Without Compromising Coverage

Bachelor Thesis (2024)

Author(s)

A. Drăgoi (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A.E. Zaidman – Mentor (TU Delft - Software Technology)

A. Deljouyi – Mentor (TU Delft - Software Engineering)

A. Katsifodimos – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

https://resolver.tudelft.nl/uuid:3940725d-a367-418d-bf19-407e65d7b902

Publication Year

2024

Language

English

Graduation Date

25-06-2024

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Automated software testing is a widely studied topic in the literature. Search-based software testing tools, such as EvoSuite, use genetic algorithms to generate test suites without developer input. Large Language Models (LLMs) have recently attracted significant attention in software engineering for their potential to automate test generation. UTGen, a tool that integrates LLMs with EvoSuite, produces more understandable tests than EvoSuite alone; however, the generated tests suffer a drop in coverage.

To make bug detection easier for developers, we propose UTGenCov, a concept that focuses on improving the understandability of EvoSuite-generated tests without compromising coverage. It builds upon UTGen by analyzing the reasons behind the decrease in coverage and proposing an alternative approach.

Our investigation determined that the leading cause of coverage reduction in UTGen is LLM hallucination in the Understandability phase. UTGenCov aims to mitigate hallucinations by providing the LLM with the source code of the methods used in the test. However, our experimental results indicate inconsistent performance and a further 0.74% decrease in branch coverage compared to UTGen.

Files

Exploring_Test_Suite_Coverage_... (pdf)

(pdf | 0.454 Mb)

License info not available