Machine-Learning for Optimal Fitness Function Selection in Automated Testing

Bachelor Thesis (2022)

Author(s)

D. Toader (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Annibale Panichella – Mentor (TU Delft - Software Engineering)

Pouria Derakhshanfar – Mentor (TU Delft - Software Engineering)

Mitchell Olsthoorn – Mentor (TU Delft - Software Engineering)

T. Höllt – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

https://resolver.tudelft.nl/uuid:7b1ae1e0-d598-479b-adb1-8dfd545f7c2d

Publication Year

2022

Language

English

Graduation Date

22-06-2022

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The persistent demand for higher-quality software has motivated software engineers to build automated tools that ease and improve software testing. EvoSuite is a state-of-the-art tool that generates test cases automatically, using a genetic algorithm guided by a given set of search targets. Previous studies have analysed the performance of individual targets or of combinations of targets, but have not compared different combinations with one another. In this research, we compare EvoSuite's Weak Mutation + Branch setting to Branch alone and to Default (a combination of eight separate targets), and report how they differ in branch coverage and mutation score. Moreover, we discuss machine-learning models that predict which combination achieves the highest score (branch coverage or mutation score) from characteristics of the class under test, such as its number of lines of code. Our results show that Weak Mutation + Branch outperforms Branch on mutation score and Default on branch coverage. Conversely, it is outperformed by Branch on branch coverage and by Default on mutation score. As for the models, we conclude that the Random Forest and Decision Tree classifiers produce the best results and are feasible options for predicting the best of the analysed combinations. Finally, static code metrics such as 'wmc', 'loc', and 'mathOperationsQty' often appear as relevant features for our models, and we use our Decision Trees to visualise how they influence the most suitable combination of criteria.

Files

ML_for_fitness_function_select... (pdf)

(pdf | 0.714 Mb)

License info not available