Adversarially Robust Decision Trees Against User-Specified Threat Models

Abstract

Machine learning is increasingly used for sensitive tasks that require models to be both understandable and robust. Although traditional models such as decision trees are understandable, they are vulnerable to adversarial attacks. When a decision tree is used to differentiate between a user's benign and malicious behavior, an adversarial attack allows the user to evade the model by perturbing the inputs it receives. Algorithms that take adversarial attacks into account during training can fit more robust trees. In this work we propose an algorithm that is two orders of magnitude faster than the state-of-the-art and scores 4.3% higher accuracy against adversaries that can move all samples, while accepting an intuitive and more permissive threat model. Where previous threat models were limited to distance norms, we allow each feature to be perturbed with a user-specified threat model specifying either a maximum distance or constraints on the direction of perturbation. Additionally, we introduce two hyperparameters, rho and phi, that control the trade-offs between accuracy and robustness, and between accuracy and fairness, respectively. Using these hyperparameters we can train models with less than a 5% difference in false positive rate between population groups while scoring on average 2.4% higher accuracy against adversarial attacks. Lastly, we show that our decision trees perform similarly to more complex random forests of fair and robust decision trees.
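
To illustrate the kind of per-feature threat model described above, the Python sketch below shows one hypothetical way to encode a maximum perturbation distance and directional constraints per feature, and how such a specification determines which sides of a decision-tree split an adversary could push a sample to. The FeatureThreat class and the reachable_sides helper are illustrative assumptions, not the paper's actual implementation or API.

from dataclasses import dataclass
from typing import Optional

@dataclass
class FeatureThreat:
    # Hypothetical per-feature perturbation budget (illustrative only):
    #   radius   - maximum absolute perturbation, or None if the feature
    #              value cannot be moved at all
    #   increase - whether the attacker may increase the feature value
    #   decrease - whether the attacker may decrease the feature value
    radius: Optional[float] = None
    increase: bool = True
    decrease: bool = True

def reachable_sides(value, threshold, threat):
    # Return which sides of a split (value <= threshold goes left) an
    # adversary can reach by perturbing `value` within the threat model.
    lo = value - (threat.radius if threat.radius and threat.decrease else 0.0)
    hi = value + (threat.radius if threat.radius and threat.increase else 0.0)
    sides = set()
    if lo <= threshold:
        sides.add("left")
    if hi > threshold:
        sides.add("right")
    return sides

# Feature 0: perturbable by at most 0.1 in either direction.
# Feature 1: perturbable by at most 0.3, but only upwards.
threat_model = [FeatureThreat(radius=0.1),
                FeatureThreat(radius=0.3, decrease=False)]

print(reachable_sides(0.95, 1.0, threat_model[0]))  # left and right reachable
print(reachable_sides(1.20, 1.0, threat_model[0]))  # only right reachable
print(reachable_sides(0.80, 1.0, threat_model[1]))  # left and right reachable

A robust tree learner of the kind described above could then score candidate splits using these reachable sides, rather than only the side each unperturbed sample falls on.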