Guided Metamorphic Transformations for Testing the Robustness of Trained Code2Vec Models

Master Thesis (2022)
Author(s)

R.J. Marang (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Annibale Panichella – Mentor (TU Delft - Software Engineering)

Leonhard Herbert Applis – Mentor (TU Delft - Software Engineering)

A. van Deursen – Graduation committee member (TU Delft - Software Technology)

Zekeriya Erkin – Graduation committee member (TU Delft - Cyber Security)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Ruben Marang
Publication Year
2022
Language
English
Graduation Date
30-08-2022
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Machine learning models are increasingly used within software engineering, and their predictive performance continues to improve. This thesis focuses on method name prediction, where the goal is a model that can accurately predict method names. Such a model could power a tool that suggests method names to software developers, helping to improve the quality of their projects.
This research aims to gain insight into the robustness vulnerabilities of a method name prediction model, using a genetic search algorithm to look for robustness problems. The main question this thesis tries to answer is to what extent performance metrics are affected by applying metamorphic transformations to the test set of a trained code2vec model. In addition, this thesis proposes an alternative metric, called percentage MRR, which may better reflect the robustness of a model. The main idea behind this metric is that it penalizes the prediction certainty of a model instead of penalizing the prediction rank.
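The standard mean reciprocal rank (MRR) metric, and the certainty-based idea described above, can be sketched as follows. The exact definition of percentage MRR is given in the thesis itself; the `certainty_score` function below is only a hypothetical illustration of scoring by prediction certainty rather than by rank:

```python
def mean_reciprocal_rank(ranks):
    """Standard MRR: mean of 1/rank, where rank is the 1-based position
    of the correct method name in the prediction list (None = absent)."""
    return sum(1.0 / r if r else 0.0 for r in ranks) / len(ranks)

def certainty_score(confidences):
    """Hypothetical sketch of the idea behind percentage MRR: score each
    prediction by the probability mass the model assigns to the correct
    name (0.0 when absent), penalizing certainty instead of rank."""
    return sum(confidences) / len(confidences)

# Correct name ranked 1st, 2nd, and missing entirely:
print(mean_reciprocal_rank([1, 2, None]))  # 0.5
# Model probabilities assigned to the correct name:
print(certainty_score([0.9, 0.4, 0.0]))
```

Under a rank-based view the second prediction scores 0.5 regardless of how confident the model was; a certainty-based view distinguishes a barely-wrong model from a confidently-wrong one.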
To answer this research question, we built a tool that runs a genetic algorithm applying these metamorphic transformations to a dataset on which a trained model is then evaluated. With this tool, we conducted 22 genetic search experiments on primary metrics and combinations of metrics to examine the trade-offs in the Pareto fronts. The guided search of applying metamorphic transformations to the test set results in an average performance decrease of around 19%. This thesis also compares this drop in performance to the decrease a random search algorithm would produce. Notably, for every transformer added, the average decrease in performance becomes smaller, and some transformations, e.g., the if-false-else transformation, have a bigger effect than others. This thesis concludes that the trained model is not robust against metamorphic transformations and suffers a significant performance drop.
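As an illustration of one such metamorphic transformation, an if-false-else wrapper places the original statements in the else-branch of a guard that can never fire, leaving the program's behavior unchanged while altering its syntax tree (and hence the paths code2vec extracts). The thesis applies these transformations to source code of the model's input language; the sketch below mimics the idea in Python syntax:

```python
def if_false_else(body_lines, indent="    "):
    """Wrap a method body in the else-branch of an always-false guard.
    The transformed code behaves identically to the original, but its
    AST differs, which can shift a code model's predictions."""
    wrapped = ["if False:", indent + "pass", "else:"]
    wrapped += [indent + line for line in body_lines]
    return wrapped

original = ["total = a + b", "return total"]
for line in if_false_else(original):
    print(line)
```

A robust model should predict the same method name for the original and the transformed body, since the two are semantically equivalent.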
