MSID: A Multi-Scale Diffusion-Based Inpainting Defense Against Adversarial Attacks

Abstract

This thesis addresses the vulnerability of deep neural networks (DNNs) to adversarial attacks. We introduce Multi-Scale Inpainting Defense (MSID), a novel adversarial purification method that leverages a pre-trained denoising diffusion probabilistic model (DDPM) for targeted perturbation removal. MSID employs a four-step process: (1) multi-scale superpixel segmentation; (2) generation of occlusion sensitivity maps at multiple scales to identify regions important for inference; (3) targeted inpainting of those regions using the DDPM; and (4) artifact removal via variance-preserving (VP) sampling. We investigate the effectiveness of diffusion-based inpainting as a robust defense, the impact of multi-scale occlusion sensitivity mapping, and the robustness of MSID against a set of adversarial attacks, including color-based attacks. Our experiments demonstrate that MSID outperforms existing adversarial purification methods, improving robust accuracy by up to 5.42% on CIFAR-10 and 10.75% on ImageNet against AutoAttack, with further gains against PGD and unseen attacks, while maintaining high standard accuracy. To the best of our knowledge, this is the first work to apply DDPM inpainting to targeted adversarial purification, and we demonstrate its effectiveness against a range of adversarial attacks.
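The four-step pipeline described above can be sketched in code. The following is a minimal illustration, not the paper's implementation: superpixel segmentation is replaced by a toy grid partition, the classifier is abstracted as a user-supplied `score_fn`, and the DDPM inpainting of step (3) is stubbed out as a mean fill; step (4), VP-sampling artifact removal, is omitted. All function names (`grid_superpixels`, `occlusion_sensitivity`, `msid_purify`) are hypothetical.

```python
import numpy as np

def grid_superpixels(img, cell):
    """Toy stand-in for superpixel segmentation: partition the image
    into square cells of side `cell` and return an integer label map."""
    h, w = img.shape[:2]
    ys = np.arange(h) // cell
    xs = np.arange(w) // cell
    return ys[:, None] * (w // cell + 1) + xs[None, :]

def occlusion_sensitivity(img, labels, score_fn):
    """Score each segment by the drop in `score_fn` when that segment
    is occluded (replaced with the image mean): the larger the drop,
    the more the model relies on the region."""
    base = score_fn(img)
    sensitivity = {}
    for lab in np.unique(labels):
        occluded = img.copy()
        occluded[labels == lab] = img.mean()
        sensitivity[lab] = base - score_fn(occluded)
    return sensitivity

def msid_purify(img, score_fn, scales=(4, 8), top_k=2):
    """MSID-style sketch: at each scale, segment, rank segments by
    occlusion sensitivity, and regenerate the most sensitive ones.
    Here regeneration is a mean fill; the paper uses DDPM inpainting."""
    out = img.copy()
    for cell in scales:
        labels = grid_superpixels(out, cell)
        sens = occlusion_sensitivity(out, labels, score_fn)
        worst = sorted(sens, key=sens.get, reverse=True)[:top_k]
        for lab in worst:
            out[labels == lab] = out.mean()  # placeholder for DDPM inpainting
    return out
```

As a usage sketch, `score_fn` would be the target classifier's confidence in its predicted class; here any scalar image statistic (e.g. mean intensity) suffices to exercise the control flow.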
