Non-invasive Characterization of Out-Of-Bounds Write Vulnerabilities

Abstract

As more and more aspects of our society and economy rely on software, security vulnerabilities in programs have become an increasingly significant threat. One such class of vulnerabilities is out-of-bounds writes, which remain among the most widespread and dangerous types of security bugs in memory-unsafe programming languages, five decades after they were first described.
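To make the class of bugs in question concrete, the following hypothetical C fragment (an illustration only, not taken from the programs evaluated in this work) contains a textbook out-of-bounds write:

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical example: `buf` holds 16 bytes, but strcpy copies the
     * entire attacker-controlled string, so any input longer than 15
     * characters writes past the end of `buf` and, depending on the stack
     * layout chosen by the compiler, corrupts neighbouring objects such as
     * `is_admin` or the saved return address. */
    static void greet(const char *name) {
        int is_admin = 0;      /* adjacent object an overflow may clobber */
        char buf[16];
        strcpy(buf, name);     /* no bounds check: out-of-bounds write */
        printf("hello %s (admin=%d)\n", buf, is_admin);
    }

    int main(void) {
        greet("this input is considerably longer than sixteen bytes");
        return 0;
    }

Which neighbouring objects such an overflow can reach, here possibly `is_admin` or the saved return address, is exactly the kind of exploitability-relevant characteristic discussed below.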
Automated discovery methods such as fuzzing, however, have gained substantial traction in recent years and can now uncover such bugs in large numbers with minimal human intervention. In contrast, the analysis that follows the initial discovery - triaging, root cause analysis, and patching - still largely requires human expertise and intuition. Because this manual effort delays the assessment of a vulnerability's security implications and, consequently, the prioritization of patches, partially automating the triaging process is essential. A promising approach is the automatic extraction of vulnerability characteristics that provide vital clues about a bug's exploitability. Such characteristics, which typically take substantial effort to collect manually, can aid a human analyst during triaging, speeding up the process while also making it more accessible.
In this work, we investigate how the set of affected source-code-level objects, a decisive indicator of an out-of-bounds write vulnerability's exploitability, can be automatically distilled from a program and an input suspected of triggering an out-of-bounds write. As this poses unique challenges with regard to invasiveness, we propose a novel approach to out-of-bounds write detection that monitors a compiled program for spatial memory safety violations without the need for instrumentation. We implement a prototype of our design and evaluate it on benchmarks and real-world vulnerabilities, showing that its detection performance is largely on par with state-of-the-art instrumentation-based approaches and even surpasses them in several scenarios, at the cost of increased runtime overhead and a higher false-positive rate.
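For contrast with the instrumentation-based detectors the evaluation compares against, the sketch below shows, in simplified form, the bounds check that a tool such as AddressSanitizer compiles into a program before every write. The shadow-memory encoding and the SHADOW_OFFSET constant follow AddressSanitizer's publicly documented Linux x86-64 defaults; they illustrate the instrumentation-based baseline, not the non-invasive approach proposed here.

    #include <stdint.h>
    #include <stddef.h>

    /* Simplified sketch of an AddressSanitizer-style check for writes of at
     * most 8 bytes. Each shadow byte describes an 8-byte granule of
     * application memory:
     *   0        -> all 8 bytes addressable
     *   1..7     -> only the first k bytes are addressable
     *   negative -> redzone or freed memory, every access is invalid
     * The check only works inside a process whose runtime has mapped the
     * shadow region; it is shown here purely to illustrate what
     * "instrumentation" means in this context. */
    #define SHADOW_OFFSET 0x7fff8000UL   /* Linux x86-64 default */

    static inline int write_is_out_of_bounds(uintptr_t addr, size_t size) {
        int8_t shadow = *(int8_t *)((addr >> 3) + SHADOW_OFFSET);
        if (shadow == 0)
            return 0;                          /* whole granule is valid */
        int last = (int)(addr & 7) + (int)size - 1;
        return last >= shadow;                 /* beyond addressable prefix */
    }

The approach investigated in this work aims to flag the same spatial violations without compiling such checks into the target, which is what monitoring a compiled program without instrumentation refers to above.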