Transferable universal adversarial perturbations against speaker recognition systems

World Wide Web

Abstract

Deep neural networks (DNNs) exhibit powerful feature extraction capabilities, which makes them highly effective in numerous tasks, and DNN-based techniques have been widely adopted in speaker recognition. However, imperceptible adversarial perturbations can severely disrupt the decisions made by DNNs. In addition, researchers have identified universal adversarial perturbations, single input-agnostic perturbations that can attack deep neural networks efficiently and effectively. In this paper, we propose an algorithm for conducting effective universal adversarial attacks by investigating the dominant features in the speaker recognition task. Through experiments in various scenarios, we find that our perturbations are not only more effective and less detectable but also exhibit a certain degree of transferability across different datasets and models.
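
To make the notion of a universal perturbation concrete, the following is a minimal sketch and not the algorithm proposed in this paper, which is not reproduced here. It assumes a toy linear softmax classifier with random weights standing in for a speaker recognition model, random vectors standing in for utterance features, and hypothetical names and hyperparameters (`universal_perturbation`, `eps`, `step`, `iters`). A single perturbation `delta` is shared by all inputs and updated by sign-gradient ascent on the mean cross-entropy loss, with an L-infinity bound `eps`.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def mean_loss(W, X, y, delta):
    # Mean cross-entropy of the toy model when the SAME delta is added to every input.
    p = softmax((X + delta) @ W)
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

def universal_perturbation(W, X, y, eps=0.05, step=0.01, iters=50):
    # One perturbation vector shared across all inputs (assumed toy setting).
    delta = np.zeros(X.shape[1])
    onehot = np.eye(W.shape[1])[y]
    for _ in range(iters):
        p = softmax((X + delta) @ W)
        # Gradient of the mean cross-entropy with respect to the shared delta.
        grad = ((p - onehot) @ W.T).mean(axis=0)
        # Sign-gradient ascent step, then projection onto the L-infinity ball.
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return delta

# Hypothetical toy data: random features and a random linear "speaker classifier".
rng = np.random.default_rng(0)
n_utts, n_feats, n_speakers = 200, 16, 4
W = rng.normal(size=(n_feats, n_speakers))
X = rng.normal(size=(n_utts, n_feats))
y = np.argmax(X @ W, axis=1)  # labels are the model's own clean predictions

eps = 0.05
delta = universal_perturbation(W, X, y, eps=eps)
clean_loss = mean_loss(W, X, y, np.zeros(n_feats))
adv_loss = mean_loss(W, X, y, delta)
# Fraction of inputs whose predicted speaker changes under the shared delta.
fooling_rate = np.mean(np.argmax((X + delta) @ W, axis=1) != y)

print(f"clean loss: {clean_loss:.4f}, perturbed loss: {adv_loss:.4f}")
print(f"fooling rate: {fooling_rate:.2%}")
```

The fooling rate printed at the end is the fraction of inputs whose prediction changes when the same `delta` is added to each of them. Real attacks on speaker recognition operate on waveforms and deep models, so this sketch only illustrates the shared-perturbation idea.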

Availability of data and materials

All datasets can be accessed through their respective official websites or mirrors. The VCTK dataset is available at https://datashare.ed.ac.uk/handle/10283/3443, the TIMIT dataset at https://catalog.ldc.upenn.edu/LDC93S1, and the LibriSpeech dataset at https://www.openslr.org/12.

Corresponding authors

Correspondence to Aiping Li or Zhaoquan Gu.

Ethics declarations

Ethics approval

Not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, X., Tan, H., Zhang, J. et al. Transferable universal adversarial perturbations against speaker recognition systems. World Wide Web 27, 33 (2024). https://doi.org/10.1007/s11280-024-01274-3

  • DOI: https://doi.org/10.1007/s11280-024-01274-3
