FATE
Fuzzing for Adversarial examples in Tree Ensembles
C.J.H. Bilstra (TU Delft - Electrical Engineering, Mathematics and Computer Science)
S.E. Verwer – Mentor (TU Delft - Cyber Security)
C.B. Poulsen – Graduation committee member (TU Delft - Programming Languages)
D.A. Vos – Coach (TU Delft - Cyber Security)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Machine learning models are increasingly popular and are now used in a wide range of critical applications in fields such as automotive, aviation, and medicine. Among machine learning models, tree ensembles are a popular choice because of their competitive performance and high degree of explainability. Like most machine learning models, however, they are vulnerable to adversarial examples: slightly perturbed inputs for which the model makes an unexpected prediction. These can be seen as bugs in the model, and in critical applications such a bug may have a high impact. We investigate whether fuzzers, a popular and effective tool for identifying bugs in software, can also be used to find bugs (adversarial examples) in tree ensemble models.
We introduce FATE, a tool based on grey-box fuzzers that is able to find adversarial examples on a multitude of datasets. FATE uses a custom mutator that leverages domain information, model-specific information such as splitting thresholds, and dataset-specific information such as training samples. With this mutator, FATE finds good adversarial examples: for non-image classification models they are within 1 percentage point of the examples generated by the state of the art (Zhang et al., 2020). However, the coverage guidance of grey-box fuzzers actually limits FATE's performance: running FATE's mutator as a (1+1) Evolutionary Algorithm yields performance competitive with the state of the art, and even outperforms it on some datasets.
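To make the idea of a threshold-aware mutator driven by a (1+1) Evolutionary Algorithm more concrete, the following is a minimal sketch and not FATE's actual implementation. Everything in it is an assumption made for illustration: the toy ensemble `STUMPS` of decision stumps, the helper names (`score`, `predict`, `linf`, `mutate`, `fitness`, `one_plus_one_ea`), the L-infinity distance, and the lexicographic fitness. The mutator either moves a feature just across one of the model's splitting thresholds or resets it to the original value, and a child replaces the parent whenever it is at least as fit.

```python
import random

# Hypothetical toy ensemble (an assumption, not taken from the thesis):
# each stump is (feature, threshold, left_score, right_score).
# The ensemble predicts class 1 when the summed score is positive, else 0.
STUMPS = [
    (0, 0.5, -1.0, 1.0),
    (1, 0.3, -1.0, 1.0),
    (0, 0.8, -0.5, 0.5),
    (2, 0.6, -1.0, 1.0),
]
EPS = 1e-6  # small step used to land just above a threshold


def score(x):
    """Sum the leaf scores of all stumps for input x."""
    return sum(left if x[f] <= t else right for f, t, left, right in STUMPS)


def predict(x):
    return 1 if score(x) > 0 else 0


def linf(a, b):
    """L-infinity distance between two inputs."""
    return max(abs(p - q) for p, q in zip(a, b))


def mutate(x, original, rng):
    """Threshold-aware mutation: reset one feature to its original value,
    or move one feature to the other side of a randomly chosen split."""
    child = list(x)
    if rng.random() < 0.3:
        i = rng.randrange(len(child))
        child[i] = original[i]
    else:
        f, t, _, _ = rng.choice(STUMPS)
        child[f] = t + EPS if child[f] <= t else t
    return child


def fitness(x, original, label):
    """Lexicographic fitness, lower is better.
    Adversarial inputs rank first and are ordered by distance to the original;
    non-adversarial inputs are ordered by the margin toward the original label."""
    if predict(x) != label:
        return (0, linf(x, original))
    margin = score(x) if label == 1 else -score(x)
    return (1, margin)


def one_plus_one_ea(original, iterations=2000, seed=0):
    """(1+1) EA: one parent, one child per step, keep the child if not worse."""
    rng = random.Random(seed)
    label = predict(original)
    parent = list(original)
    best = fitness(parent, original, label)
    for _ in range(iterations):
        child = mutate(parent, original, rng)
        child_fitness = fitness(child, original, label)
        if child_fitness <= best:
            parent, best = child, child_fitness
    return parent, best


original = [0.9, 0.9, 0.9]
adversarial, (status, distance) = one_plus_one_ea(original)
print(predict(original), predict(adversarial), status, round(distance, 3))
```

Accepting children of equal fitness lets the search drift across plateaus, which in this toy setting allows it to swap a large perturbation on one feature for smaller ones on others. A coverage-guided fuzzer would instead keep a corpus of inputs selected by code coverage rather than a single best candidate.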