Predicting True Vulnerabilities from Static Analyzer Warnings in Industry

An Attempt at Releasing Software Faster in Industry


Abstract

An increasingly digital world comes with many benefits, but unfortunately also with drawbacks. As the digital world grows, so do the amounts of data and software. Developing more software also means a higher probability of vulnerabilities, which adversaries can exploit. Adversaries take advantage of users and software vulnerabilities by stealing data, causing harm, stealing money, and so on. This makes the digital world a dangerous environment.
To ensure their software contains as few vulnerabilities as possible, companies invest in tools and experts to check their software for vulnerabilities. One such company is ING, the largest bank in the Netherlands. ING uses Fortify, a static analyzer. The problem with this tool is that it produces many false positives; pentesters and developers therefore have to check all warnings given by Fortify manually, which takes a lot of time and slows down the whole software development process. In this study, we propose using supervised machine learning techniques to predict true vulnerabilities from static analyzer warnings. Using ING's data from Fortify, two highly imbalanced datasets with code metrics are created, one at class level and one at method level. Various classifiers and sampling techniques are compared to determine which perform best. We also compare performance across the two levels of granularity. Finally, we investigate whether a dataset containing multiple vulnerability types performs better than a dataset consisting of a single vulnerability type. Our study shows that Bagging combined with ClassBalancer gives the best f-measure (0.618) for the class-level dataset, which we consider moderately good. Random Forest with SMOTE gives the best f-measure (0.412) for the method-level dataset, which we consider weak. Depending on the vulnerability type, performance can benefit from a dataset per vulnerability type. Overall, the results of this study are modestly promising for using Fortify in combination with supervised machine learning, especially compared to using Fortify alone.
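The pipeline described above can be sketched in a minimal, illustrative form. This is not the thesis code: ING's Fortify data is not public, so the snippet below uses a synthetic, highly imbalanced dataset as a stand-in, implements a from-scratch SMOTE-style oversampler, and evaluates a Random Forest with the f-measure. All names, parameters, and numbers here are assumptions chosen for illustration.

```python
# Illustrative sketch only: a SMOTE-style oversampler combined with Random
# Forest on a synthetic imbalanced dataset, evaluated by f-measure.
# The real study used ING's Fortify warnings with code metrics (not public).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors

def smote(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating each chosen
    sample with one of its k nearest minority-class neighbors (SMOTE)."""
    rng = np.random.default_rng(seed)
    # Neighbor index 0 is the sample itself, so ask for k + 1 neighbors.
    idx = NearestNeighbors(n_neighbors=k + 1).fit(X_min).kneighbors(X_min)[1]
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))           # pick a minority sample
        j = idx[i][rng.integers(1, k + 1)]     # pick one of its k neighbors
        out.append(X_min[i] + rng.random() * (X_min[j] - X_min[i]))
    return np.asarray(out)

# Synthetic stand-in: ~5% positives mimics a highly imbalanced dataset.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample only the training set, so the test set keeps its true imbalance.
X_min = X_tr[y_tr == 1]
n_new = int((y_tr == 0).sum()) - len(X_min)    # bring classes to parity
X_bal = np.vstack([X_tr, smote(X_min, n_new)])
y_bal = np.concatenate([y_tr, np.ones(n_new, dtype=int)])

clf = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)
print(f"f-measure on minority class: {f1_score(y_te, clf.predict(X_te)):.3f}")
```

Applying the oversampler only to the training split is the important design choice here: rebalancing the test set would inflate the f-measure, since real warning streams keep their skewed class distribution.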