Deliberate Code Coverage

Master Thesis (2026)
Author(s)

R.M. de Britto Heemskerk (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

S. Proksch – Mentor (TU Delft - Software Engineering)

C.E. Brandt – Mentor (TU Delft - Software Engineering)

M.A. Migut – Graduation committee member (TU Delft - Web Information Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
More Info
Publication Year
2026
Language
English
Graduation Date
22-01-2026
Awarding Institution
Delft University of Technology
Programme
Computer Science | Software Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Testing is a major part of software development, and coverage requirements are often used as a quality-assurance tool. However, it is not clear which code should be covered to meet such a requirement. To address this, we suggest using historical data to make these decisions more deliberate. In other words, we want to use machine learning to predict coverage.
Building upon previous research, we investigate how different approaches affect the performance of decision tree models, using data from the Mozilla Firefox codebase and focusing in particular on its C/C++ code. Naively splitting the data into training and test sets and representing coverage per line leads to the best performance. Our analysis showed that grouping coverage data by basic block slightly reduced the predictive performance of the model. Meanwhile, splitting the data between the training set and the test set by file appears to remove all predictive performance.
This study provides a new dataset for developer coverage prediction. It also introduces a new way of representing coverage data for developer coverage prediction, namely basic-block coverage. Finally, it gives insights into the effects of different coverage representations on decision trees.
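
The difference between the two splitting strategies named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the record layout, the file names, and the helper functions (make_records, naive_split, file_split, shared_files) are all hypothetical and chosen only for illustration. A naive split shuffles individual lines, so lines from one file can appear in both sets, whereas a file-based split keeps every file entirely on one side.

```python
import random
from collections import defaultdict


def make_records():
    """Build toy per-line records: (file, line number, covered?). Illustrative data only."""
    files = ["a.cpp", "b.cpp", "c.cpp", "d.cpp", "e.cpp"]  # hypothetical file names
    return [(f, n, (n + i) % 2 == 0)
            for i, f in enumerate(files)
            for n in range(1, 21)]


def naive_split(records, test_fraction=0.2, seed=0):
    """Shuffle individual lines, so one file can land in both sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def file_split(records, test_fraction=0.2, seed=0):
    """Assign whole files to one side, so no file is shared between sets."""
    by_file = defaultdict(list)
    for rec in records:
        by_file[rec[0]].append(rec)
    files = sorted(by_file)
    rng = random.Random(seed)
    rng.shuffle(files)
    n_test = max(1, round(len(files) * test_fraction))
    test_files = set(files[:n_test])
    train = [r for f in files if f not in test_files for r in by_file[f]]
    test = [r for f in files if f in test_files for r in by_file[f]]
    return train, test


def shared_files(train, test):
    """Return the files that contribute lines to both sets."""
    return {r[0] for r in train} & {r[0] for r in test}
```

Calling shared_files on the output of file_split always yields an empty set, while the naive split will usually share files between training and test data. For real experiments, scikit-learn's GroupShuffleSplit offers a comparable group-aware split.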

Files

Deliberate_Code_Coverage.pdf
(pdf | 0.636 Mb)
License info not available