Training a Machine-Learning Model for Optimal Fitness Function Selection with the Aim of Finding Bugs

Abstract

Software testing is essential for a successful development process; however, writing tests manually is time-consuming and error-prone. EvoSuite is a test case generation tool developed to address this [18]. It can generate test cases for different test criteria, such as Line Coverage, Branch Coverage, and Input Diversity. Branch Coverage focuses on covering the branches in the code, whilst Input Diversity focuses on using diverse inputs as parameters in the test cases. The downside is that the user must select the best-suited test criterion for the classes under test out of the many that EvoSuite provides, and finding the optimal one manually is not feasible. This paper examines the effectiveness of combining Input Diversity and Branch Coverage as a test criterion. It presents a machine learning technique that automatically selects the best combination of test generation objectives based on static metrics. The model we chose for this task is a decision tree, as it directly provides a pattern: a combination of conditions that the static metrics need to satisfy for the chosen test criterion to be effective. The evaluation was carried out on a benchmark of 346 classes taken from the SF-110 Corpus of Classes [9] and Apache Commons. To evaluate the effectiveness of Input Diversity in combination with Branch Coverage, we compared it against two other test criteria: Branch Coverage alone and the default coverage criteria used in EvoSuite. The resulting decision tree models achieve an accuracy upwards of 90% in the best case and identify metrics such as wmc, dit, and fanin, among others, as crucial for the effectiveness of Input Diversity in combination with Branch Coverage.
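
To make the notion of a "pattern" concrete, the following minimal Python sketch shows how one root-to-leaf path of a decision tree can be read as a conjunction of threshold conditions on static metrics. It is an illustration only and not the models from the study: the metric names wmc, dit, and fanin come from the abstract, but the thresholds, the `Condition` class, the helper functions, and the returned labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Condition:
    """One threshold test on a static metric, as found on a decision-tree edge."""
    metric: str
    op: str          # either "<=" or ">"
    threshold: float

    def holds(self, metrics: Dict[str, float]) -> bool:
        value = metrics[self.metric]
        return value <= self.threshold if self.op == "<=" else value > self.threshold


# HYPOTHETICAL pattern: the thresholds below are invented for illustration
# and are not results reported by the study.
PATTERN: List[Condition] = [
    Condition("wmc", ">", 20.0),    # weighted methods per class
    Condition("dit", "<=", 2.0),    # depth of inheritance tree
    Condition("fanin", "<=", 5.0),  # number of incoming dependencies
]


def matches_pattern(metrics: Dict[str, float], pattern: List[Condition] = PATTERN) -> bool:
    """A pattern holds only if every condition along the path holds."""
    return all(cond.holds(metrics) for cond in pattern)


def select_criterion(metrics: Dict[str, float]) -> str:
    """Return an illustrative label for the criterion to use (labels are assumptions)."""
    return "branch+input_diversity" if matches_pattern(metrics) else "default"


if __name__ == "__main__":
    print(select_criterion({"wmc": 35, "dit": 1, "fanin": 3}))
    print(select_criterion({"wmc": 10, "dit": 1, "fanin": 3}))
```

Reading the tree this way is what makes it attractive here: each path is a human-readable rule over the static metrics of a class, so the selection can be inspected rather than treated as a black box.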
