PyTestGuard: An IDE-Integrated Tool for Supporting Developers with LLM-Generated Unit Tests

Master Thesis (2025)
Author(s)

N. Mouman (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Carolin Brandt – Mentor (TU Delft - Software Engineering)

A. Panichella – Mentor (TU Delft - Software Engineering)

W.P. Brinkman – Graduation committee member (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
12-09-2025
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Unit testing is an important step in the software development workflow to detect bugs and ensure system correctness. Recently, Large Language Models (LLMs) have been explored to automate unit test generation and have demonstrated promising results. However, the generated tests are not always reliable, as they may contain syntax errors, hallucinations, test smells, or failing assertions. We conjecture that providing developers with feedback on such issues will increase the adoption of LLMs in real-world workflows. To address this, we propose PyTestGuard, a PyCharm plugin that allows developers to generate and refine unit tests directly within the Integrated Development Environment (IDE). Beyond test generation, PyTestGuard helps users evaluate test quality by detecting test smells and reporting issues such as missing arguments or references to non-existing objects. We conducted a user study with nine participants to assess PyTestGuard’s usefulness as a testing assistant and to identify areas for improvement. Participants reported that the tool’s feedback on test quality, along with its summarised error messages and coverage information, supported them while writing unit tests. However, they also encountered challenges and suggested improvements that would need to be addressed before they would fully trust LLM-based test generation in their development workflow. Based on these findings, we highlight several design recommendations for future tools that aim to integrate LLMs into software testing workflows.
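
To illustrate the kind of test-quality feedback the abstract refers to, the sketch below shows one simplified check: flagging generated pytest-style test functions that contain no assertion, a common test smell. This is a minimal, hypothetical example for illustration only, not PyTestGuard's actual implementation; the function name and the sample generated code are assumptions.

import ast

def find_assertionless_tests(source: str) -> list[str]:
    """Return names of pytest-style test functions without any assert statement.

    A simplified stand-in for one kind of quality check a tool like
    PyTestGuard could perform on LLM-generated tests.
    """
    tree = ast.parse(source)
    smelly = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_"):
            # A test body with no assert statement cannot verify behaviour.
            if not any(isinstance(n, ast.Assert) for n in ast.walk(node)):
                smelly.append(node.name)
    return smelly

if __name__ == "__main__":
    generated = '''
def test_addition():
    assert 1 + 1 == 2

def test_logging():
    print("no assertion here")  # missing assertion: a common test smell
'''
    print(find_assertionless_tests(generated))  # ['test_logging']

In practice, such a check would run alongside others (e.g. resolving referenced names against the project under test to catch calls to non-existing objects) before the generated tests are presented to the developer.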
