Unit test case generation aims to help software developers test programs. Evolutionary algorithms, which evolve candidate solutions over time, are among the most successful approaches to unit test case generation. Previous research on seeding, the use of previously available information to improve search performance, has shown positive results in unit test case generation. However, seeding cannot be applied in the absence of previously available information, such as existing unit test cases.
The recent increase in the availability of Large Language Models (LLMs), which are trained on diverse corpora of previously available data, provides an opportunity to address this seed absence problem. We devised an approach combining TestSpark and EvoSuite to study the impact of LLM-based seeding on unit test case generation. TestSpark, an IntelliJ plugin, uses ChatGPT-4o to generate test cases, which we then supply as seeds for EvoSuite's seeding strategies, such as cloning and carving. We evaluated our approach on 136 Java 11 projects from the GitBug-Java dataset with respect to line coverage, branch coverage, mutation score, and area under the curve.
Our results show that LLM-based seeding can improve EvoSuite's unit test case generation, provided that EvoSuite manages to extract information from the seed. In practice, our approach struggles to supply LLM-generated tests from which such information can be extracted. We lost 63% of the benchmark classes because the LLM did not generate functional tests in all experiment iterations, and a further 24% of the benchmarks were excluded because EvoSuite's seeding did not extract any information from them.