Abstract
Evaluating the adversarial robustness of deep models is critical for training more robust models. However, few evaluation methods are both interpretable and quantifiable. Interpretable methods cannot quantify adversarial robustness, which makes their results subjective; quantifiable methods, on the other hand, are often unexplainable, making it difficult for evaluators to trust and trace the results. To address this issue, an adversarial robustness evaluation approach based on class activation mapping (ARE-CAM) is proposed. This approach uses CAM to generate heatmaps that visualize the regions the model attends to. By comparing the original example and the adversarial example in terms of both visual and statistical characteristics, the changes in the model's attention after an attack can be observed, which enhances the interpretability of the evaluation. In addition, four metrics are proposed to quantify adversarial robustness: the average coverage coincidence rate (ACCR), average high-activation coincidence rate (AHCR), average heat area difference (AHAD), and average heat difference (AHD). Comprehensive experiments on 14 deep models and several datasets verify ARE-CAM's efficiency. To the best of our knowledge, ARE-CAM is the first evaluation approach for adversarial robustness that is both quantifiable and interpretable.
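The abstract names four CAM-based statistics but does not reproduce their formulas here. The sketch below shows one plausible reading of them, assuming IoU-style coincidence rates over thresholded heatmaps and mean absolute differences for the heat statistics; the thresholds (`cover_t`, `high_t`) and the exact definitions are illustrative assumptions, not the paper's verified formulas.

```python
import numpy as np

def coverage(heatmap, thresh):
    """Binary mask of regions whose activation exceeds a threshold."""
    return heatmap >= thresh

def coincidence_rate(mask_a, mask_b):
    """Intersection-over-union of two binary attention masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 1.0

def are_cam_metrics(clean_maps, adv_maps, cover_t=0.2, high_t=0.8):
    """Average the four per-example statistics over pairs of normalized
    CAM heatmaps (clean example vs. its adversarial counterpart).

    Assumed definitions: ACCR/AHCR are IoU of coverage / high-activation
    masks; AHAD is the absolute difference in heated-area fraction;
    AHD is the mean absolute pixel-wise heat difference.
    """
    ccr, hcr, had, hd = [], [], [], []
    for c, a in zip(clean_maps, adv_maps):
        ccr.append(coincidence_rate(coverage(c, cover_t), coverage(a, cover_t)))
        hcr.append(coincidence_rate(coverage(c, high_t), coverage(a, high_t)))
        had.append(abs(coverage(c, cover_t).mean() - coverage(a, cover_t).mean()))
        hd.append(np.abs(c - a).mean())
    return {"ACCR": float(np.mean(ccr)), "AHCR": float(np.mean(hcr)),
            "AHAD": float(np.mean(had)), "AHD": float(np.mean(hd))}
```

Under these assumed definitions, a model whose attention is unchanged by the attack yields ACCR = AHCR = 1 and AHAD = AHD = 0, while a large attention shift drives the coincidence rates toward 0.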
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China under Grants No. 71901212 and No. 72071206, and in part by Key Projects of the National Natural Science Foundation of China under Grant No. 72231011.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Z., Sun, J., Qin, Y., Ju, L., Yang, K. (2024). ARE-CAM: An Interpretable Approach to Quantitatively Evaluating the Adversarial Robustness of Deep Models Based on CAM. In: Rudinac, S., et al. MultiMedia Modeling. MMM 2024. Lecture Notes in Computer Science, vol 14554. Springer, Cham. https://doi.org/10.1007/978-3-031-53305-1_21
DOI: https://doi.org/10.1007/978-3-031-53305-1_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53304-4
Online ISBN: 978-3-031-53305-1
eBook Packages: Computer Science (R0)