Towards Refined Code Coverage

A New Predictive Problem in Software Testing

Conference Paper (2025)
Author(s)

Carolin E. Brandt (TU Delft - Software Engineering)

Aurora Ramírez (University of Córdoba)

Research Group
Software Engineering
DOI related publication
https://doi.org/10.1109/ICST62969.2025.10989028
Publication Year
2025
Language
English
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository as part of the Taverne amendment. More information about this copyright law amendment can be found at https://www.openaccess.nl. Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Pages (from-to)
613-617
ISBN (print)
979-8-3315-0815-9
ISBN (electronic)
979-8-3315-0814-2
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

To measure and improve the strength of test suites, software projects and their developers commonly use code coverage and aim for a threshold of around 80%. But which 80% of the source code should be covered? To prepare for the development of new, more refined code coverage criteria, we introduce a novel predictive problem in software testing: whether a code line is, or should be, covered by the test suite. In this short paper, we propose collecting coverage information, source code metrics, and abstract syntax tree data, and we explore whether they are relevant for predicting whether a code line is exercised by the test suite. We present a preliminary experiment using four machine learning (ML) algorithms and an open source Java project. We observe that ML classifiers can achieve high accuracy (up to 90%) on this novel predictive problem. We also apply an explainability method to better understand the characteristics of code lines that make them more “appealing” to cover. Our work opens a line of research worth investigating further, in which the focus of the prediction is the code to be tested. This approach contrasts with most predictive problems in software testing, which aim to predict the probability of test case failure.

Files

License info not available

File under embargo until 20-11-2025