OffSide

Learning to Identify Mistakes in Boundary Conditions

Conference Paper (2020)
Author(s)

Jón Arnar Briem (Student TU Delft)

Jordi Smit (Student TU Delft)

Hendrig Sellik (Student TU Delft)

Pavel Rapoport (Student TU Delft)

G. Gousios (TU Delft - Software Engineering)

Maurício Aniche (TU Delft - Software Engineering)

Research Group
Software Engineering
Copyright
© 2020 Jón Arnar Briem, Jordi Smit, Hendrig Sellik, Pavel Rapoport, G. Gousios, Maurício Aniche
DOI related publication
https://doi.org/10.1145/3387940.3391464
Publication Year
2020
Language
English
Pages (from-to)
203-208
ISBN (print)
978-1-4503-7963-2
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Mistakes in boundary conditions are the cause of many bugs in software. These mistakes happen when, e.g., developers use '<' or '>' in cases where they should have used '<=' or '>='. Mistakes in boundary conditions are often hard to find, and manually detecting them can be very time-consuming for developers. While researchers have long been proposing techniques to cope with mistakes in boundaries, the automated detection of such bugs still remains a challenge. We conjecture that, for a tool to precisely identify mistakes in boundary conditions, it should be able to capture the overall context of the source code under analysis. In this work, we propose a deep learning model that learns mistakes in boundary conditions and is later able to identify them in unseen code snippets. We train and test the model on over 1.5 million code snippets, with and without mistakes in different boundary conditions. Our model achieves an accuracy ranging from 55% to 87%. The model is also able to detect 24 out of 41 real-world bugs, although with a high false positive rate. Existing state-of-the-practice linter tools are not able to detect any of these bugs. We hope this paper paves the way towards deep learning models that can support developers in detecting mistakes in boundary conditions.
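To make the target bug pattern concrete, consider the following Java sketch (a hypothetical illustration, not an example from the paper's dataset): a bounds check that uses '<=' where '<' is required silently admits an out-of-range index.

public class BoundaryExample {

    // Buggy: when index == values.length, the check still passes,
    // so values[index] later throws ArrayIndexOutOfBoundsException.
    static boolean isValidIndexBuggy(int[] values, int index) {
        return index >= 0 && index <= values.length;
    }

    // Fixed: the upper bound must be strict ('<' instead of '<=').
    static boolean isValidIndex(int[] values, int index) {
        return index >= 0 && index < values.length;
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30};
        System.out.println(isValidIndexBuggy(values, 3)); // true  (wrong)
        System.out.println(isValidIndex(values, 3));      // false (correct)
    }
}

A single-character difference like this is easy to miss in code review, which is why the paper argues for learning-based detection that captures surrounding context rather than purely syntactic linting.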

Files

Deeptest_2020.pdf
(PDF | 0.686 MB)
License info not available