Retrieval-Augmentation for Adversarial Robust Visual Classification
To retrieve or not to retrieve
O.J. Braakman (TU Delft - Electrical Engineering, Mathematics and Computer Science)
N.M. Gürel – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
J.C. Gemert – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
S. Dumančić – Graduation committee member (TU Delft - Algorithmics)
S. van Rooij – Mentor (TNO)
G. Burghouts – Mentor (TNO)
Abstract
State-of-the-art models are susceptible to adversarial attacks, which can cause catastrophic misclassifications in settings where robustness is required. With the increasing popularity of the retrieval-augmentation paradigm in deep learning, we adopt it as a fully differentiable framework for adversarial robustness. We evaluate our method on three visual classification datasets, including ImageNet, and attack our model with two white-box attacks and one black-box attack under various L2 and L∞ norms. The results indicate that a robust classifier emerges when the model relies fully on retrieved examples. We find that we can already obtain a PGD-robust ImageNet classifier with 80.1% clean and 64.7% adversarial accuracy using only one or two examples per class from the training data in the memory set. Unlike other adversarial defense mechanisms, our method works directly on top of pre-trained models and remains robust as PGD attacks increase in strength, where other defenses start to degrade.
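To make the abstract's idea of "relying fully on retrieved examples" concrete, below is a minimal, hypothetical sketch of a retrieval-augmented classification head, not the authors' implementation: a frozen pre-trained encoder embeds the query, cosine similarity against a small labelled memory set (e.g. one or two examples per class) yields differentiable attention weights, and class scores are the weighted sum of the memory labels. All function names, shapes, and the temperature value are illustrative assumptions.

```python
# Hypothetical sketch of retrieval-augmented classification over a labelled
# memory set; illustrative only, not the thesis' exact method.
import torch
import torch.nn.functional as F


def retrieval_augmented_scores(query_emb, memory_emb, memory_labels,
                               num_classes, temperature=0.1):
    """Classify by softly attending over retrieved memory examples.

    query_emb:     (B, D) embeddings of the inputs to classify
    memory_emb:    (M, D) embeddings of the memory set
    memory_labels: (M,)   integer class labels of the memory examples
    """
    # Cosine similarity between each query and every memory entry.
    q = F.normalize(query_emb, dim=-1)
    m = F.normalize(memory_emb, dim=-1)
    sims = q @ m.t()                                    # (B, M)

    # Differentiable "soft retrieval": attention weights over the memory.
    weights = F.softmax(sims / temperature, dim=-1)     # (B, M)

    # Aggregate one-hot memory labels into per-class scores.
    one_hot = F.one_hot(memory_labels, num_classes).float()  # (M, C)
    return weights @ one_hot                            # (B, C)


if __name__ == "__main__":
    B, D, C = 4, 512, 10                      # batch, embed dim, classes
    memory_labels = torch.arange(C).repeat_interleave(2)  # 2 examples/class
    memory_emb = torch.randn(len(memory_labels), D)
    query_emb = torch.randn(B, D)
    scores = retrieval_augmented_scores(query_emb, memory_emb,
                                        memory_labels, C)
    print(scores.shape)                       # torch.Size([4, 10])
```

Because every step is differentiable, such a head can be attacked end-to-end with gradient-based methods like PGD, which is consistent with the white-box evaluation described in the abstract.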