Deep Just-in-Time Defect Prediction at Adyen

Master Thesis (2021)

Author(s)

N. van der Laan (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Maurício Aniche – Mentor (TU Delft - Software Engineering)

A. van Deursen – Graduation committee member (TU Delft - Software Technology)

Sicco Verwer – Graduation committee member (TU Delft - Cyber Security)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Deep learning, Machine learning, Defect prediction, Just-in-time

To reference this document use:

https://resolver.tudelft.nl/uuid:a8166502-093f-43fb-bbab-265f6fb6e8f9

More Info

Publication Year

2021

Language

English

Graduation Date

25-08-2021

Awarding Institution

Delft University of Technology

Programme

Computer Science | Software Technology

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Finding defects in proposed changes is one of the biggest motivations for, and expected outcomes of, code review, yet code review finds defects less often than expected. Just-in-time (JIT) defect prediction focuses on predicting bug-introducing changes, which can help allocate inspection time efficiently according to the defect-proneness of the changed software parts. Despite the promising results achieved by DeepJIT and CC2Vec, two deep learning-based JIT defect prediction models, industry-based JIT defect prediction studies have not yet applied deep models. The goal of this work is to build and evaluate several JIT defect prediction models that can help Adyen developers spot defective changes during code review. To construct a new dataset with a large enough set of labels, we identify four sources of potential bug-fixing commits by analysing Adyen's way of working. We make several practical adaptations to DeepJIT and CC2Vec and compare their performance with that of three traditional metric-based models when making predictions at both commit level and file level. Our results indicate that deep models are able to outperform the metric-based models across all three datasets. All models performed slightly worse on Adyen data than in an open-source setting, but both deep models still achieved respectable performance and significantly outperformed the metric-based models. When evaluated in a real-world setting on bugs manually collected by Adyen developers, DeepJIT performed consistently with earlier findings at commit level, but its performance dropped at file level. Lastly, we find that although including each bug source generally does not lead to worse performance, whether it leads to better performance depends on both the type of model used and the granularity at which predictions are made.

Files

Master_Thesis_Niek_van_der_Laa... (pdf)

(pdf | 1.84 Mb)

License info not available