A deep dive into the robustness of AdaBoost Ensembling combined with Adversarial Training

Abstract

Adversarial training and its variants have become the standard defense against adversarial attacks, which use perturbed inputs designed to fool a model. Boosting techniques such as AdaBoost have been successful for binary classification problems; however, there is limited research on applying them to provide adversarial robustness. In this work, we explore the question: How can AdaBoost ensemble learning provide adversarial robustness to white-box attacks when the "weak" learners are neural networks trained adversarially? We design an extension of AdaBoost that supports adversarial training in a multiclass setting, and name it Adven. To answer the question, we systematically study the effect of six variables of Adven’s training procedure on adversarial robustness. From a theoretical standpoint, our experiments show that known characteristics of adversarial training and ensemble learning carry over to the combined setting. Empirically, we demonstrate that an Adven ensemble is more robust than a single learner in every scenario. Using the best values found for the six tested variables, we derive an Adven ensemble that defends against 91.88% of PGD attacks and obtains 96.72% accuracy on the MNIST dataset.
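The abstract does not spell out how Adven extends AdaBoost to the multiclass case. As a rough illustration only, the sketch below shows one round of SAMME, a standard multiclass AdaBoost variant, followed by a weighted vote over ensemble members. Using SAMME here is an assumption, not a description of Adven itself, and the function names `samme_round` and `ensemble_predict` are hypothetical. The abstract also does not say whether a learner's predictions would be taken on clean or adversarially perturbed inputs, so the sketch accepts plain label arrays.

```python
import numpy as np

def samme_round(sample_weights, y_true, y_pred, n_classes):
    """One SAMME boosting round (assumed stand-in for Adven's update).

    Returns the learner's vote weight alpha and the re-normalized
    sample weights for the next learner.
    """
    miss = (y_pred != y_true)
    # Weighted error of the current learner, clipped to avoid log(0).
    err = np.sum(sample_weights * miss) / np.sum(sample_weights)
    err = np.clip(err, 1e-12, 1 - 1e-12)
    # The log(K - 1) term is what distinguishes SAMME from binary AdaBoost.
    alpha = np.log((1 - err) / err) + np.log(n_classes - 1)
    # Up-weight the misclassified samples, then normalize.
    new_w = sample_weights * np.exp(alpha * miss)
    return alpha, new_w / new_w.sum()

def ensemble_predict(alphas, member_preds, n_classes):
    """Alpha-weighted vote; member_preds has shape (n_members, n_samples)."""
    votes = np.zeros((member_preds.shape[1], n_classes))
    for alpha, preds in zip(alphas, member_preds):
        votes[np.arange(preds.size), preds] += alpha
    return votes.argmax(axis=1)

# Toy usage: 4 samples, 3 classes, one mistake on the last sample.
w0 = np.full(4, 0.25)
y = np.array([0, 1, 2, 1])
pred = np.array([0, 1, 2, 0])
alpha, w1 = samme_round(w0, y, pred, n_classes=3)
```

In this toy run the misclassified sample ends up carrying most of the weight, so the next learner would concentrate on it. In an adversarially trained ensemble, each learner would presumably be fit on perturbed inputs before this reweighting step, but that detail is not given in the abstract.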