Learning Off-By-One Mistakes

An Empirical Study

Conference Paper (2021)
Author(s)

Hendrig Sellik (Student TU Delft)

Onno van Paridon (Adyen B.V.)

Georgios Gousios (TU Delft - Software Engineering)

Maurício Aniche (TU Delft - Software Engineering)

Research Group
Software Engineering
Copyright
© 2021 Hendrig Sellik, Onno van Paridon, G. Gousios, Maurício Aniche
DOI related publication
https://doi.org/10.1109/MSR52588.2021.00019
Publication Year
2021
Language
English
Bibliographical Note
Accepted author manuscript
Pages (from-to)
58-67
ISBN (print)
978-1-6654-2985-6
ISBN (electronic)
978-1-7281-8710-5
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Mistakes in binary conditions are a source of error in many software systems. They happen when developers use, e.g., < or > instead of <= or >=. These boundary mistakes are hard to find and impose manual, labor-intensive work on software developers. While previous research has proposed solutions to identify errors in boundary conditions, the problem remains open. In this paper, we explore the effectiveness of deep learning models in learning and predicting mistakes in boundary conditions. We train different models on approximately 1.6M examples with faults in different boundary conditions. We achieve a precision of 85% and a recall of 84% on a balanced dataset, but lower numbers on an imbalanced dataset. We also perform tests on 41 real-world boundary condition bugs found on GitHub, where the model shows only modest performance. Finally, we test the model on a large-scale Java code base from Adyen, our industrial partner. The model reported 36 buggy methods, but none of them were confirmed by developers.
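To make the kind of boundary mistake described in the abstract concrete, the following is a minimal Java sketch. It is not taken from the paper or its dataset: the class name, the method names, and the free-shipping rule with a threshold of 50 are hypothetical and chosen only for illustration. The buggy variant uses > where the intended rule calls for >=, so the two methods disagree on exactly one input, the boundary value itself.

```java
public class BoundaryExample {
    // Hypothetical business rule (an assumption for this sketch):
    // an order qualifies for free shipping when its total is 50 or more.
    static final int FREE_SHIPPING_THRESHOLD = 50;

    // Buggy variant: '>' excludes the boundary value, so a total of
    // exactly 50 is wrongly rejected.
    static boolean qualifiesBuggy(int total) {
        return total > FREE_SHIPPING_THRESHOLD;
    }

    // Fixed variant: '>=' includes the boundary value.
    static boolean qualifiesFixed(int total) {
        return total >= FREE_SHIPPING_THRESHOLD;
    }

    public static void main(String[] args) {
        // The two variants agree everywhere except at the boundary.
        for (int total : new int[] {49, 50, 51}) {
            System.out.println(total
                    + ": buggy=" + qualifiesBuggy(total)
                    + ", fixed=" + qualifiesFixed(total));
        }
    }
}
```

Because the two variants differ on a single input, a test suite that never exercises the boundary value passes for both, which is one reason such mistakes are hard to find by hand.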
