Defect prediction as a multiobjective optimization problem

Journal Article (2015)
Authors

Gerardo Canfora (University of Sannio)

Andrea De Lucia (University of Salerno)

Massimiliano Di Penta (University of Sannio)

Rocco Oliveto (University of Molise)

Annibale Panichella (University of Salerno, TU Delft - Software Engineering)

Sebastiano Panichella (Universitat Zurich, University of Sannio)

Research Group
Software Engineering
To reference this document use:
https://doi.org/10.1002/stvr.1570
More Info
Publication Year
2015
Language
English
Issue number
4
Volume number
25
Pages (from-to)
426-459

Abstract

In this paper, we formalize the defect-prediction problem as a multiobjective optimization problem. Specifically, we propose an approach, called multiobjective defect predictor (MODEP), based on multiobjective forms of machine learning techniques (specifically, logistic regression and decision trees) trained using a genetic algorithm. The multiobjective approach allows software engineers to choose predictors that achieve a specific compromise between effectiveness (the number of likely defect-prone classes, or the number of defects, that the analysis would likely discover) and cost (the lines of code to be analysed or tested, which can be considered a proxy for the cost of code inspection). Results of an empirical evaluation on 10 datasets from the PROMISE repository indicate the quantitative superiority of MODEP over single-objective predictors and over a trivial baseline that ranks classes by size in ascending or descending order. MODEP also outperforms an alternative approach for cross-project prediction based on local prediction upon clusters of similar classes.
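
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the candidate coefficient vectors, and the function names (`flags`, `objectives`, `dominates`, `pareto_front`) are all hypothetical, and the genetic-algorithm search is replaced here by a fixed, hand-picked list of candidate logistic-regression coefficients. The sketch only shows how each candidate predictor maps to a (cost, effectiveness) pair and how the non-dominated compromises are selected.

```python
def flags(coeffs, metrics):
    """Logistic model with intercept coeffs[0]; flag a class when p > 0.5.

    p > 0.5 is equivalent to the linear score z being positive.
    """
    z = coeffs[0] + sum(c * m for c, m in zip(coeffs[1:], metrics))
    return z > 0


def objectives(coeffs, classes):
    """Return (cost, effectiveness) for one candidate predictor.

    cost          = total LOC of the classes flagged as defect-prone
    effectiveness = number of truly defective classes among those flagged
    """
    cost, effectiveness = 0, 0
    for metrics, loc, defective in classes:
        if flags(coeffs, metrics):
            cost += loc
            effectiveness += int(defective)
    return cost, effectiveness


def dominates(a, b):
    """a dominates b: no worse on both objectives and different from b."""
    return a[0] <= b[0] and a[1] >= b[1] and a != b


def pareto_front(points):
    """Keep only the non-dominated (cost, effectiveness) pairs."""
    return sorted({p for p in points if not any(dominates(q, p) for q in points)})


# Hypothetical toy data: (metric values, lines of code, is defective).
classes = [
    ((1.0,), 100, False),
    ((2.0,), 300, True),
    ((3.0,), 200, True),
    ((4.0,), 1000, True),
]

# Hypothetical candidates (intercept, weight); a genetic algorithm would
# evolve such vectors instead of taking them from a fixed list.
candidates = [
    (-10.0, 1.0), (-3.5, 1.0), (-2.5, 1.0), (-1.5, 1.0),
    (-0.5, 1.0), (2.5, -1.0), (3.5, -1.0),
]

front = pareto_front([objectives(c, classes) for c in candidates])
print(front)  # [(0, 0), (400, 1), (600, 2), (1500, 3)]
```

Each pair on the resulting front is one compromise an engineer could pick: for example, inspecting 600 lines of code to cover two defective classes, or 1500 lines to cover all three.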
