Structural Adversarial Attack for Code Representation Models

  • Conference paper
  • First Online:
Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2023)

Abstract

As code intelligence and collaborative computing advances, code representation models (CRMs) have demonstrated exceptional performance in tasks such as code prediction and collaborative code development by leveraging distributed computing resources and shared datasets. Nonetheless, CRMs are often considered unreliable due to their vulnerability to adversarial attacks, failing to make correct predictions when faced with inputs containing perturbations. Several adversarial attack methods have been proposed to evaluate the robustness of CRMs and ensure their reliable in application. However, these methods rely primarily on code’s textual features, without fully exploiting its crucial structural features. To address this limitation, we propose STRUCK, a novel adversarial attack method that thoroughly exploits code’s structural features. The key idea of STRUCK lies in integrating multiple global and local perturbation methods and effectively selecting them by leveraging the structural features of the input code during the generation of adversarial examples for CRMs. We conduct comprehensive evaluations of seven basic or advanced CRMs using two prevalent code classification tasks, demonstrating STRUCK’s effectiveness, efficiency, and imperceptibility. Finally, we show that STRUCK enables a more precise assessment of CRMs’ robustness and increases their resistance to structural attacks through adversarial training.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 79.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 99.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free ship** worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    https://github.com/features/copilot.

  2. 2.

    https://openai.com/blog/chatgpt.

References

  1. Codechef (2022). https://codechef.com/

  2. Ahmad, W.U., Chakraborty, S., Ray, B., Chang, K.W.: A transformer-based approach for source code summarization. ar**v preprint ar**v:2005.00653 (2020)

  3. Akhtar, N., Mian, A., Kardan, N., Shah, M.: Advances in adversarial attacks and defenses in computer vision: a survey. IEEE Access 9, 155161–155196 (2021)

    Article  Google Scholar 

  4. Allamanis, M., Brockschmidt, M., Khademi, M.: Learning to represent programs with graphs. In: ICLR 2018 - Conference Track Proceedings (2018)

    Google Scholar 

  5. Ben-Nun, T., Hoefler, T.: Demystifying parallel and distributed deep learning: An in-depth concurrency analysis. ar**v: Learning (2018)

  6. Cao, H., et al.: Prevention of gan-based privacy inferring attacks towards federated learning. In: Collaborative Computing: Networking, Applications and Worksharing (2022)

    Google Scholar 

  7. Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: IEEE Symposium on Security and Privacy (2017)

    Google Scholar 

  8. Christlein, V., Riess, C., Jordan, J., Riess, C., Angelopoulou, E.: An evaluation of popular copy-move forgery detection approaches. IEEE Trans. Inf. Forensics Secur. 7(6), 1841–1854 (2012)

    Article  Google Scholar 

  9. Dai, H., Li, H., Tian, T., Huang, X., Wang, L., Zhu, J., Song, L.: Adversarial attack on graph structured data, pp. 1115–1124. PMLR (2018)

    Google Scholar 

  10. Dong, S., Wang, P., Abbas, K.: A survey on deep learning and its applications. Comput. Sci. Rev. 40, 100379 (2021)

    Article  MathSciNet  Google Scholar 

  11. Feng, Z., et al.: Codebert: a pre-trained model for programming and natural languages. In: Empirical Methods in Natural Language Processing (2020)

    Google Scholar 

  12. Fernandes, P., Allamanis, M., Brockschmidt, M.: Structured neural summarization. In: ICLR (2019)

    Google Scholar 

  13. Gao, S., Gao, C., Wang, C., Sun, J., Lo, D.: Carbon: a counterfactual reasoning based framework for neural code comprehension debiasing (2022)

    Google Scholar 

  14. Guo, D., Lu, S., Duan, N., Wang, Y., Zhou, M., Yin, J.: Unixcoder: unified cross-modal pre-training for code representation. In: ACL 2022, Dublin, Ireland (2022)

    Google Scholar 

  15. Guo, D., et al.: Graphcodebert: pre-training code representations with data flow. In: Learning (2020)

    Google Scholar 

  16. Hellendoorn, V.J., Sutton, C., Singh, R., Maniatis, P., Bieber, D.: Global relational models of source code. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020 (2020)

    Google Scholar 

  17. Iyer, S., Konstas, I., Cheung, A., Zettlemoyer, L.: Summarizing source code using a neural attention model. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 2073–2083 (2016)

    Google Scholar 

  18. Kipf, T., Welling, M.: Semi-supervised classification with graph convolutional networks. ar**v: Learning (2016)

    Google Scholar 

  19. Li, J., Peng, J., Chen, L., Zheng, Z., Liang, T., Ling, Q.: Spectral adversarial training for robust graph neural network. IEEE Trans. Knowl. Data Eng. 35, 9240–9253 (2022)

    Article  Google Scholar 

  20. Li, J., **e, T., Chen, L., **e, F., He, X., Zheng, Z.: Adversarial attack on large scale graph. IEEE Trans. Knowl. Data Eng. 35(1), 82–95 (2021)

    Google Scholar 

  21. Li, S., Zheng, X., Zhang, X., Chen, X., Li, W.: Facial expression recognition based on deep spatio-temporal attention network. In: Collaborative Computing: Networking, Applications and Worksharing (2022)

    Google Scholar 

  22. Mou, L., Li, G., Zhang, L., Wang, T., **, Z.: Convolutional neural networks over tree structures for programming language processing. In: National Conference on Artificial Intelligence (2014)

    Google Scholar 

  23. Pour, M., Li, Z., Ma, L., Hemmati, H.: A search-based testing framework for deep neural networks of source code embedding. In: ICST (2021)

    Google Scholar 

  24. Rabin, M.R.I., Bui, N.D.Q., Wang, K., Yu, Y., Jiang, L., Alipour, M.A.: On the generalizability of Neural Program Models with respect to semantic-preserving program transformations. Inf. Softw. Technol. 135, 106552 (2021)

    Article  Google Scholar 

  25. Ramakrishnan, G., Albarghouthi, A.: Backdoors in neural models of source code. ar**v: Learning (2020)

    Google Scholar 

  26. Ramakrishnan, G., Henkel, J., Wang, Z., Albarghouthi, A., Jha, S., Reps, T.: Semantic robustness of models of source code. ar**v: Learning (2020)

    Google Scholar 

  27. Schuster, R., Song, C., Tromer, E., Shmatikov, V.: You autocomplete me: poisoning vulnerabilities in neural code completion. In: Usenix Security Symposium (2021)

    Google Scholar 

  28. Sun, X., Tong, M.: Hindom: a robust malicious domain detection system based on heterogeneous information network with transductive classification. Ar**v (2019)

    Google Scholar 

  29. Szegedy, C., et al.: Intriguing properties of neural networks. ar**v preprint ar**v:1312.6199 (2013)

  30. Vasic, M., Kanade, A., Maniatis, P., Bieber, D., Singh, R.: Neural program repair by jointly learning to localize and repair. ar**v: Learning (2019)

    Google Scholar 

  31. Wu, R., Zhang, Y., Peng, Q., Chen, L., Zheng, Z.: A survey of deep learning models for structural code understanding. ar**v preprint ar**v:2205.01293 (2022)

  32. Wu, X.: Blackbox adversarial attacks and explanations for automatic speech recognition. In: ESEC/FSE 2022 (2022)

    Google Scholar 

  33. Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Philip, S.Y.: A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32(1), 4–24 (2020)

    Article  MathSciNet  Google Scholar 

  34. Yang, Z., Shi, J., He, J., Lo, D.: Natural attack for pre-trained models of code. In: ICSE 2022, New York, NY, USA (2022)

    Google Scholar 

  35. Yefet, N., Alon, U., Yahav, E.: Adversarial examples for models of code. In: Proceedings of the ACM on Programming Languages, vol. 4, no. OOPSLA, pp. 1–30 (2020)

    Google Scholar 

  36. Zhang, H., et al.: Towards robustness of deep program processing models-detection, estimation, and enhancement. TOSEM 31(3), 1–40 (2022)

    Article  MathSciNet  Google Scholar 

  37. Zhang, H., Li, Z., Li, G., Ma, L., Liu, Y., **, Z.: Generating adversarial examples for holding robustness of source code processing models. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 1169–1176 (2020)

    Google Scholar 

  38. Zhang, J., Wang, X., Zhang, H., Sun, H., Wang, K., Liu, X.: A novel neural source code representation based on abstract syntax tree. In: ICSE. IEEE (2019)

    Google Scholar 

  39. Zhang, W.E., Sheng, Q.Z., Alhazmi, A., Li, C.: Adversarial attacks on deep-learning models in natural language processing: a survey. ACM Trans. Intell. Syst. Technol. (TIST) 11(3), 1–41 (2020)

    Google Scholar 

  40. Zhang, W., Guo, S., Zhang, H., Sui, Y., Xue, Y., Xu, Y.: Challenging machine learning-based clone detectors via semantic-preserving code transformations. IEEE Trans. Softw. Eng. 49(5), 3052–3070 (2023)

    Article  Google Scholar 

  41. Zhou, Y., Shi, D., Yang, H., Hu, H., Yang, S., Zhang, Y.: Deep reinforcement learning for multi-UAV exploration under energy constraints. In: Collaborative Computing: Networking, Applications and Worksharing (2022)

    Google Scholar 

Download references

Acknowledgment

The research is supported by the National Key R&D Program of China under grant No. 2022YFF0902500, the Guangdong Basic and Applied Basic Research Foundation, China (No. 2023A1515011050). Liang Chen is the corresponding author.The research is supported by the National Key R&D Program of China under grant No. 2022YFF0902500, the Guangdong Basic and Applied Basic Research Foundation, China (No. 2023A1515011050). Liang Chen is the corresponding author.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Liang Chen .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2024 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Zhang, Y., Wu, R., Liao, J., Chen, L. (2024). Structural Adversarial Attack for Code Representation Models. In: Gao, H., Wang, X., Voros, N. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 562. Springer, Cham. https://doi.org/10.1007/978-3-031-54528-3_22

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-54528-3_22

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54527-6

  • Online ISBN: 978-3-031-54528-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Navigation