Closing the Gap: Java Test Assertion Generation via Knowledge Distillation with Trident Loss

Bachelor Thesis (2025)
Author(s)

M.D. Chu (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Panichella – Mentor (TU Delft - Software Engineering)

M.J.G. Olsthoorn – Mentor (TU Delft - Software Engineering)

P. Kellnhofer – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
24-06-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Software testing is crucial in the software development process to ensure quality. However, automating test assertion generation remains a significant challenge in software engineering because it requires both precise syntactic structure and semantic correctness. While large language models (LLMs) have shown impressive capabilities in generating test assertions, their high computational demands make them less practical for developers working in resource-constrained environments where cloud services are not a viable option. We present a knowledge distillation approach that trains a smaller student model (220M parameters) to mimic the behavior of a larger teacher model (770M parameters) through a novel Trident multi-component loss. Trident combines (1) a focal loss that focuses training on hard-to-predict tokens, (2) a Jensen-Shannon Divergence (JSD) term that aligns the student with the teacher’s output distribution, and (3) a semantic similarity loss that preserves meaning, together with dynamic weight scheduling to balance these objectives. While knowledge distillation itself is well established, its application to the nuanced task of generating test assertions is underexplored. Our experimental evaluation on 7,000 Java unit tests shows that the distilled student model achieves 90% of the teacher’s Code Quality Score while requiring 71% less memory. This reduction in resource requirements makes powerful LLM capabilities more accessible, particularly to developers for whom cloud-based inference is not viable.
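To make the loss composition described above concrete, the sketch below shows one plausible way to combine the three terms in PyTorch. It is an illustrative assumption based only on the abstract: the function names, the focal-loss gamma, and the linear weight schedule are hypothetical and are not taken from the thesis itself.

import torch
import torch.nn.functional as F


def focal_loss(student_logits, targets, gamma=2.0, ignore_index=-100):
    """Token-level cross-entropy scaled by (1 - p)^gamma so hard tokens dominate."""
    ce = F.cross_entropy(student_logits, targets, reduction="none",
                         ignore_index=ignore_index)
    pt = torch.exp(-ce)  # model confidence on the correct token
    return ((1.0 - pt) ** gamma * ce).mean()


def jsd_loss(student_logits, teacher_logits):
    """Jensen-Shannon divergence between student and teacher token distributions."""
    p = F.softmax(student_logits, dim=-1)
    q = F.softmax(teacher_logits, dim=-1)
    m = 0.5 * (p + q)
    kl = lambda a, b: (a * (a.clamp_min(1e-8).log() - b.clamp_min(1e-8).log())).sum(-1)
    return (0.5 * kl(p, m) + 0.5 * kl(q, m)).mean()


def semantic_loss(student_emb, teacher_emb):
    """Penalize divergence of pooled sequence embeddings (1 - cosine similarity)."""
    return (1.0 - F.cosine_similarity(student_emb, teacher_emb, dim=-1)).mean()


def trident_loss(student_logits, teacher_logits, targets,
                 student_emb, teacher_emb, step, total_steps):
    """Weighted sum of the three components with an assumed linear schedule that
    gradually shifts emphasis from distribution matching to semantic similarity."""
    t = step / max(total_steps, 1)
    w_focal, w_jsd, w_sem = 1.0, 1.0 - 0.5 * t, 0.5 + 0.5 * t
    return (w_focal * focal_loss(student_logits.flatten(0, 1), targets.flatten())
            + w_jsd * jsd_loss(student_logits, teacher_logits)
            + w_sem * semantic_loss(student_emb, teacher_emb))

In this reading, the focal term anchors the student to the ground-truth assertion tokens, the JSD term transfers the teacher's output distribution, and the semantic term keeps the generated assertion's pooled representation close to the teacher's; the schedule shown is only one possible way to realize the "dynamic weight scheduling" the abstract mentions.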

Files

Closing_the_gap.pdf
(pdf | 0.608 MB)
License info not available