Abstract
Patch attacks, which introduce a perceptible but localized change to the input image, have gained significant momentum in recent years. In this paper, we present a unified framework to analyze certified patch defense tasks, including both certified detection and certified recovery, leveraging the recently emerged Vision Transformers (ViTs). In addition to the existing patch defense setting where only one patch is considered, we provide the very first study on developing certified detection against the dual patch attack, in which the attacker is allowed to adversarially manipulate pixels in two different regions.
By building upon the latest progress in self-supervised ViTs with masked image modeling (i.e., masked autoencoder (MAE)), our method achieves state-of-the-art performance in both certified detection and certified recovery of adversarial patches. Regarding certified detection, we improve the performance by up to \(\sim 16\%\) on ImageNet without training on a single adversarial patch, and for the first time, can also tackle the more challenging dual patch setting. Our method largely closes the gap between detection-based certified robustness and clean image accuracy. Regarding certified recovery, our approach improves certified accuracy by \(\sim 2\%\) on ImageNet across all attack sizes, attaining the new state-of-the-art performance.
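As a rough illustration of what certified detection means in mask-based patch defenses in general, the sketch below sweeps a square mask over the image and compares the predictions. It is not this paper's procedure, which builds on an MAE-pretrained ViT; the function names, the zero-fill masking, and the `classify` callback are all hypothetical. The guarantee assumes that every possible patch location is fully covered by at least one mask, which for a square patch of side `p` holds when `mask_size >= p + stride - 1`.

```python
import numpy as np

def mask_positions(image_size, mask_size, stride):
    """Top-left corners of square masks swept over a square image."""
    last = image_size - mask_size
    coords = list(range(0, last + 1, stride))
    if coords[-1] != last:
        coords.append(last)  # make sure the image border is covered too
    return [(r, c) for r in coords for c in coords]

def masked_predictions(classify, image, mask_size, stride):
    """Run the (hypothetical) classifier once per mask position."""
    preds = []
    for r, c in mask_positions(image.shape[0], mask_size, stride):
        masked = image.copy()
        masked[r:r + mask_size, c:c + mask_size] = 0.0  # assumption: zero-fill
        preds.append(classify(masked))
    return preds

def detect(classify, image, mask_size, stride):
    """Flag the input as suspicious if the masked predictions disagree."""
    return len(set(masked_predictions(classify, image, mask_size, stride))) > 1

def certified_for_detection(classify, clean_image, label, mask_size, stride):
    """True if every masked prediction on the clean image equals the label.

    Under the coverage assumption above, a patch placed anywhere is hidden by
    some mask, so that masked prediction is still the label; an attack must
    then either be flagged by `detect` or leave the prediction correct.
    """
    preds = masked_predictions(classify, clean_image, mask_size, stride)
    return all(p == label for p in preds)

# Toy usage with a stand-in classifier that thresholds mean brightness.
bright = np.full((8, 8), 0.9)
by_mean = lambda img: int(img.mean() > 0.5)
print(certified_for_detection(by_mean, bright, 1, mask_size=4, stride=2))
```

For a dual patch setting, the same idea would presumably require masking pairs of positions, so the number of forward passes grows roughly quadratically in the number of mask positions; that cost estimate is an inference from the sketch, not a figure from the paper.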
Acknowledgment
This work is supported by a gift from Open Philanthropy, the TPU Research Cloud (TRC) program, and the Google Cloud Research Credits program.
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, J., Zhang, H., Xie, C. (2022). ViP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13685. Springer, Cham. https://doi.org/10.1007/978-3-031-19806-9_33
DOI: https://doi.org/10.1007/978-3-031-19806-9_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19805-2
Online ISBN: 978-3-031-19806-9
eBook Packages: Computer Science, Computer Science (R0)