Machine Learning for Software Refactoring: a Large-Scale Empirical Study

Abstract

Refactorings tackle the architectural degradation of object-oriented software projects by improving their internal structure without changing their behavior. Applied correctly, refactorings improve software quality and maintainability. However, identifying refactoring opportunities is a challenging problem for developers and researchers alike. In recent work, machine learning algorithms have shown great potential to solve this problem. In this thesis, we used RefactoringMiner to detect refactorings in open-source Java projects and computed code metrics by static analysis. We defined refactoring opportunity detection as a binary classification problem and applied machine learning algorithms to solve it. The models distinguish between a specific refactoring type and a stable class, using the metrics as features. Multiple machine learning experiments were designed based on the results of an empirical study of the refactorings. For this work, we created the largest data set of refactorings in Java source code to date, covering 92,800 open-source projects from GitHub with a total of 33.67 million refactoring samples. The data analysis revealed that Class- and Package-Level refactorings occur most frequently in the early development stages of a class, whereas Method- and Variable-Level refactorings are applied uniformly throughout the development of a class. The machine learning models achieve a total average accuracy of 80% to 89% on unseen projects across different configurations of the refactoring opportunity prediction problem. Selecting a high Stable Commit Threshold (K) significantly improves the recall of the models, but also strongly reduces their generalizability. The Random Forest (RF) classifier shows great potential for refactoring opportunity detection: it adapts to various configurations of the problem, identifies a large variety of relevant metrics in the data, and is able to distinguish different refactoring types.
This work shows that solving the refactoring opportunity detection problem requires a large variety of metrics, as a small set of metrics cannot represent the complexity of the problem.
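
The binary classification framing described in the abstract can be illustrated with a minimal sketch. It is not the thesis pipeline: the records, the metric names (`loc`, `wmc`), the helper names, and the 25% hold-out fraction are hypothetical and chosen only for illustration. The sketch labels instances of one refactoring type as 1 and stable instances as 0, uses the metrics as the feature vector, and holds out whole projects so that evaluation happens on unseen projects.

```python
import random

# Hypothetical records: (project, kind, metrics). All values are made up.
RECORDS = [
    ("proj-a", "Extract Method", {"loc": 120, "wmc": 14}),
    ("proj-a", "stable", {"loc": 30, "wmc": 3}),
    ("proj-b", "Extract Method", {"loc": 95, "wmc": 11}),
    ("proj-b", "Move Class", {"loc": 60, "wmc": 5}),
    ("proj-c", "stable", {"loc": 25, "wmc": 2}),
    ("proj-c", "Extract Method", {"loc": 150, "wmc": 20}),
    ("proj-d", "stable", {"loc": 40, "wmc": 4}),
]


def build_dataset(records, refactoring_type, feature_names):
    """Keep the target refactoring type and stable instances; label them 1 and 0."""
    rows = []
    for project, kind, metrics in records:
        if kind not in (refactoring_type, "stable"):
            continue  # other refactoring types form their own binary problems
        features = [metrics[name] for name in feature_names]
        label = 1 if kind == refactoring_type else 0
        rows.append((project, features, label))
    return rows


def split_by_project(rows, test_fraction=0.25, seed=0):
    """Hold out whole projects so the test set contains only unseen projects."""
    projects = sorted({project for project, _, _ in rows})
    random.Random(seed).shuffle(projects)
    n_test = max(1, round(len(projects) * test_fraction))
    test_projects = set(projects[:n_test])
    train = [row for row in rows if row[0] not in test_projects]
    test = [row for row in rows if row[0] in test_projects]
    return train, test


rows = build_dataset(RECORDS, "Extract Method", ["loc", "wmc"])
train, test = split_by_project(rows)
```

The feature vectors and labels in `train` could then be passed to any binary classifier, such as a Random Forest, with `test` reserved for measuring accuracy and recall on projects the model has never seen.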