Learning Off-By-One Mistakes

An Empirical Study

Conference Paper (2021)

Author(s)

Hendrig Sellik (Student TU Delft)

Onno van Paridon (Adyen B.V.)

Georgios Gousios (TU Delft - Software Engineering)

Maurício Aniche (TU Delft - Software Engineering)

Research Group

Software Engineering

DOI related publication

https://doi.org/10.1109/MSR52588.2021.00019

Software testing; Boundary testing; Deep learning for software engineering; Machine learning for software engineering

To reference this document use:

https://resolver.tudelft.nl/uuid:fb773461-fa1b-41e3-87da-028b5bff9d8a

More Info

Publication Year

2021

Language

English

Bibliographical Note

Accepted author manuscript

Pages (from-to)

58-67

ISBN (print)

978-1-6654-2985-6

ISBN (electronic)

978-1-7281-8710-5

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Mistakes in binary conditions are a source of error in many software systems. They happen when developers use, e.g., < or > instead of <= or >=. These boundary mistakes are hard to find and impose manual, labor-intensive work on software developers. While previous research has proposed solutions to identify errors in boundary conditions, the problem remains open. In this paper, we explore the effectiveness of deep learning models in learning and predicting mistakes in boundary conditions. We train different models on approximately 1.6M examples with faults in different boundary conditions. We achieve a precision of 85% and a recall of 84% on a balanced dataset, but lower numbers on an imbalanced dataset. We also test the model on 41 real-world boundary condition bugs found on GitHub, where it shows only modest performance. Finally, we test the model on a large-scale Java code base from Adyen, our industrial partner. The model reported 36 buggy methods, but none of them were confirmed by developers.
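To make the kind of mistake described in the abstract concrete, the following is a minimal illustrative sketch and is not taken from the paper or its dataset. The class name, the threshold rule, and the method names are all hypothetical. It shows how using > where >= was intended changes behavior only at the boundary value itself, which is why such faults are easy to miss.

```java
// Illustrative only: names and the business rule are hypothetical,
// not drawn from the paper, its dataset, or the Adyen code base.
class BoundaryExample {
    // Assumed rule: a discount applies to orders of 100 items or more.
    static final int THRESHOLD = 100;

    // Off-by-one version: '>' excludes the boundary value itself.
    static boolean qualifiesBuggy(int items) {
        return items > THRESHOLD;
    }

    // Intended version: '>=' includes the boundary value.
    static boolean qualifiesFixed(int items) {
        return items >= THRESHOLD;
    }

    public static void main(String[] args) {
        // The two versions disagree only when items == THRESHOLD.
        for (int items : new int[] {99, 100, 101}) {
            System.out.println(items + ": buggy=" + qualifiesBuggy(items)
                    + ", fixed=" + qualifiesFixed(items));
        }
    }
}
```

A test that only checks values well inside or outside the range (say, 50 and 500) passes for both versions; only a test at exactly 100 distinguishes them.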

Files

Msr2021_learning_off_by_one.pd... (pdf)

(pdf | 0.394 MB)

License info not available