LLM Test Generation for Java Libraries in Low Context Settings

The Impact of Javadoc on LLM Test Generation Without Source Code

Bachelor Thesis (2026)

Author(s)

I. Raychaudhuri (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

S. Proksch – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

C.R. Paulsen – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

S.S. Chakraborty – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2026
Language
English
Graduation Date
22-06-2026
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Many software systems depend on external packages that evolve independently. Updating these packages is important for reliability and security, but an update is not guaranteed to remain compatible with the systems that depend on it. Testing can help verify compatibility, and LLMs show promise in test generation, but current approaches assume access to source code and additional context, which is unavailable for packages released only as compiled bytecode, possibly accompanied by documentation. The quality of LLM-generated tests in such settings is relatively unexplored, as is the amount of context needed to generate adequate tests. Javadoc provides contextual information that is not present in bytecode, which may offset the absence of source code. To investigate whether it does, I assessed the quality of test suites generated from three configurations: bytecode only, bytecode with Javadoc, and full source code. Evaluation metrics included the percentage of compiling test classes and of passing tests, along with line, branch, and mutation coverage. Results showed that adding Javadoc improved almost all metrics over bytecode alone, noticeably reducing the gap to source code. While this combination consumes more tokens than either bytecode alone or source code, the difference is small enough to be outweighed by the quality gains. Thus, I conclude that bytecode with Javadoc is an effective substitute for source code and a promising option for verifying dependency compatibility when source code is unavailable.
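
As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch of how such percentages could be derived from raw counts. It is not taken from the thesis: the class name SuiteMetrics, the method names, and the map layout are hypothetical, and it assumes the common definition of each metric as a simple ratio (for example, mutation coverage as killed mutants over total mutants).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not the thesis tooling: turns raw counts into percentages.
public class SuiteMetrics {

    // Returns part/total as a percentage; 0.0 when the denominator is empty,
    // so a configuration with no generated tests does not produce NaN.
    static double percent(long part, long total) {
        return total <= 0 ? 0.0 : 100.0 * part / total;
    }

    // counts maps a metric name to {numerator, denominator}, e.g.
    // "line coverage" -> {coveredLines, totalLines} (assumed layout).
    static Map<String, Double> summarize(Map<String, long[]> counts) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, long[]> entry : counts.entrySet()) {
            long[] pair = entry.getValue();
            result.put(entry.getKey(), percent(pair[0], pair[1]));
        }
        return result;
    }
}
```

Calling summarize once per configuration (bytecode only, bytecode with Javadoc, full source code) would give one row of percentages per configuration that can then be compared side by side. Any counts passed in are placeholders supplied by the caller, not results from the thesis.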

Files

Final_paper.pdf
(pdf | 0.255 Mb)
License info not available