Using LLM-Generated Summarizations to Improve the Understandability of Generated Unit Tests

Enhancing Unit Test Understandability: An Evaluation of LLM-Generated Summaries

Bachelor Thesis (2024)
Author(s)

N. Djajadi (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Andy Zaidman – Mentor (TU Delft - Software Technology)

A. Deljouyi – Mentor (TU Delft - Software Engineering)

A. Katsifodimos – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
25-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Since software testing is crucial, there has been considerable research on generating test cases automatically. The problem is that generated test cases can be hard to understand. Multiple factors influence understandability; one of them is test summarization, which provides an overview of what a test is exactly testing and sometimes highlights its key functionality. Tools already exist that generate test summaries using template-based summarization techniques. Limitations of such generated summaries are that they can be lengthy and redundant, and that they work best in combination with well-defined test names and variables. UTGen is a tool that combines EvoSuite and Large Language Models (LLMs) to improve understandability, including by improving test names and variables, but it does not yet offer summarization functionality. In this research, we extend UTGen with LLM-generated summaries. We investigate to what extent LLM-generated test summaries influence the understandability of a test case in terms of context, conciseness, and naturalness. To this end, we conduct a user evaluation with 11 participants with a software testing background, who judge LLM-generated summaries and compare them to the output of existing summarization tools. The LLM-generated summaries scored higher overall than the template-based summaries and were also preferred by the participants.
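For illustration, below is a minimal sketch of how a generated unit test could be summarized with an LLM, assuming an OpenAI-style chat API. The client, model name, prompt wording, and example test are illustrative assumptions and are not the setup prescribed by the thesis.

# Hypothetical sketch of LLM-based test summarization; the model,
# prompt, and example test are assumptions, not the thesis setup.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def summarize_test(test_source: str) -> str:
    """Ask the model for a short natural-language summary of a unit test."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Summarize the given Java unit test in at most two "
                        "concise sentences, naming the behavior under test."},
            {"role": "user", "content": test_source},
        ],
    )
    return response.choices[0].message.content.strip()

example_test = """
@Test
public void addReturnsSumOfTwoPositiveNumbers() {
    Calculator calculator = new Calculator();
    assertEquals(5, calculator.add(2, 3));
}
"""
print(summarize_test(example_test))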
