Abstract
Poisoning or backdoor attacks are well-known attacks on image-classification neural networks, in which an attacker inserts a trigger into a subset of the training data so that the network learns to misclassify any input containing the trigger to a specific target label. We propose a set of runtime mitigation techniques, embodied in the tool AntidoteRT, which uses rules expressed as neuron patterns to detect and correct network behavior on poisoned inputs. The neuron patterns for correct and incorrect classifications are mined from the network by running it on a set of clean samples and, optionally, a set of poisoned samples with known ground-truth labels. AntidoteRT offers two methods for runtime correction: (i) pattern-based correction, which uses the patterns as oracles to estimate the ideal label, and (ii) input-based correction, which repairs the input image by localizing the trigger and resetting it to a neutral color. We demonstrate that our techniques outperform existing defenses such as NeuralCleanse and STRIP on popular benchmarks such as MNIST, CIFAR-10, and GTSRB, against the widely used BadNets attack and the more complex DFST attack.
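To make the input-based correction concrete, the sketch below shows the pixel-reset step in isolation, assuming the trigger region has already been localized (the paper's pipeline uses gradient-based localization in the spirit of Grad-CAM). The function name, the bounding-box representation, and the neutral value of 0.5 are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def reset_trigger_region(image: np.ndarray, box: tuple, neutral: float = 0.5) -> np.ndarray:
    """Return a copy of `image` with the suspected trigger region
    (row_start, row_end, col_start, col_end) reset to a neutral color,
    so the corrected image can be re-classified by the network."""
    r0, r1, c0, c1 = box
    cleaned = image.copy()          # leave the original input untouched
    cleaned[r0:r1, c0:c1, ...] = neutral
    return cleaned

# Example: an 8x8 grayscale image with a bright 2x2 patch simulating a trigger.
img = np.zeros((8, 8))
img[0:2, 0:2] = 1.0                 # simulated BadNets-style patch trigger
cleaned = reset_trigger_region(img, (0, 2, 0, 2))
```

After this step, the corrected image would be fed back through the classifier; the intuition is that removing the trigger restores the features the network relies on for the correct label.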
Change history
23 September 2022
In an older version of this paper, there was an error in Figure 3: panels (e) and (f) were incorrect. This has been corrected.
Notes
1. Code/data is available at https://github.com/muhammadusman93/AntidoteRT.
References
Chattopadhay, A., Sarkar, A., Howlader, P., Balasubramanian, V.N.: Grad-cam++: generalized gradient-based visual explanations for deep convolutional networks. In: WACV, pp. 839–847. IEEE (2018)
Chen, B., et al.: Detecting backdoor attacks on deep neural networks by activation clustering. In: SafeAI@AAAI (2019)
Chen, H., Fu, C., Zhao, J., Koushanfar, F.: DeepInspect: a black-box trojan detection and mitigation framework for deep neural networks. In: IJCAI, pp. 4658–4664 (2019)
Cheng, S., Liu, Y., Ma, S., Zhang, X.: Deep feature space trojan attack of neural networks by controlled detoxification. In: AAAI, vol. 35, pp. 1148–1156 (2021)
Gao, Y., Xu, C., Wang, D., Chen, S., Ranasinghe, D.C., Nepal, S.: STRIP: a defense against trojan attacks on deep neural networks. In: Proceedings of the 35th Annual Computer Security Applications Conference, pp. 113–125 (2019)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
Gopinath, D., Converse, H., Pasareanu, C., Taly, A.: Property inference for deep neural networks. In: International Conference on Automated Software Engineering (ASE), pp. 797–809. IEEE (2019)
Gu, T., Liu, K., Dolan-Gavitt, B., Garg, S.: BadNets: evaluating backdooring attacks on deep neural networks. IEEE Access 7, 47230–47244 (2019)
Houben, S., Stallkamp, J., Salmen, J., Schlipsing, M., Igel, C.: Detection of traffic signs in real-world images: the German traffic sign detection benchmark. In: International Joint Conference on Neural Networks, no. 1288 (2013)
Huang, X., et al.: A survey of safety and trustworthiness of deep neural networks: verification, testing, adversarial attack and defence, and interpretability. Comput. Sci. Rev. 37, 100270 (2020)
Katz, G., et al.: The Marabou framework for verification and analysis of deep neural networks. In: Dillig, I., Tasiran, S. (eds.) CAV 2019. LNCS, vol. 11561, pp. 443–452. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25540-4_26
Kolouri, S., Saha, A., Pirsiavash, H., Hoffmann, H.: Universal litmus patterns: revealing backdoor attacks in CNNs. In: CVPR, pp. 301–310 (2020)
Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Li, Y., Lyu, X., Koren, N., Lyu, L., Li, B., Ma, X.: Neural attention distillation: erasing backdoor triggers from deep neural networks. In: International Conference on Learning Representations (2020)
Li, Y., Zhai, T., Wu, B., Jiang, Y., Li, Z., Xia, S.: Rethinking the trigger of backdoor attack. arXiv preprint arXiv:2004.04692 (2020)
Liu, K., Dolan-Gavitt, B., Garg, S.: Fine-pruning: defending against backdooring attacks on deep neural networks. In: Bailey, M., Holz, T., Stamatogiannakis, M., Ioannidis, S. (eds.) RAID 2018. LNCS, vol. 11050, pp. 273–294. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00470-5_13
Liu, X., Li, F., Wen, B., Li, Q.: Removing backdoor-based watermarks in neural networks with limited data. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 10149–10156. IEEE (2021)
Liu, Y., et al.: Trojaning attack on neural networks. In: 25th Annual Network and Distributed System Security Symposium, NDSS. The Internet Society (2018)
Liu, Y., Ma, X., Bailey, J., Lu, F.: Reflection backdoor: a natural backdoor attack on deep neural networks. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12355, pp. 182–199. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58607-2_11
Liu, Y., Xie, Y., Srivastava, A.: Neural trojans. In: International Conference on Computer Design (ICCD), pp. 45–48. IEEE (2017)
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: ICCV, pp. 618–626 (2017)
Steinhardt, J., Koh, P.W., Liang, P.: Certified defenses for data poisoning attacks. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 3520–3532 (2017)
Tran, B., Li, J., Madry, A.: Spectral signatures in backdoor attacks. In: Advances in Neural Information Processing Systems, no. 31 (2018)
Turner, A., Tsipras, D., Madry, A.: Clean-label backdoor attacks (2018)
Udeshi, S., Peng, S., Woo, G., Loh, L., Rawshan, L., Chattopadhyay, S.: Model agnostic defence against backdoor attacks in machine learning. arXiv preprint arXiv:1908.02203 (2019)
Wang, B., et al.: Neural cleanse: identifying and mitigating backdoor attacks in neural networks. In: S&P, pp. 707–723. IEEE (2019)
Wang, R., Zhang, G., Liu, S., Chen, P.-Y., Xiong, J., Wang, M.: Practical detection of trojan neural networks: data-limited and data-free cases. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12368, pp. 222–238. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58592-1_14
Xu, X., Wang, Q., Li, H., Borisov, N., Gunter, C.A., Li, B.: Detecting AI trojans using meta neural analysis. In: S&P, pp. 103–120. IEEE (2021)
Yao, Y., Li, H., Zheng, H., Zhao, B.Y.: Latent backdoor attacks on deep neural networks. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 2041–2055 (2019)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Usman, M., Gopinath, D., Sun, Y., Păsăreanu, C.S. (2022). Rule-Based Runtime Mitigation Against Poison Attacks on Neural Networks. In: Dang, T., Stolz, V. (eds) Runtime Verification. RV 2022. Lecture Notes in Computer Science, vol 13498. Springer, Cham. https://doi.org/10.1007/978-3-031-17196-3_4
DOI: https://doi.org/10.1007/978-3-031-17196-3_4
Print ISBN: 978-3-031-17195-6
Online ISBN: 978-3-031-17196-3