Data Poisoning Attacks and Mitigation Strategies on Federated Support Vector Machines

  • Original Research
  • Published:
SN Computer Science

Abstract

Federated learning is a machine learning approach in which multiple edge devices, each holding local data samples, send locally trained models to a central server, which aggregates them using a specific aggregation rule. The distributed nature of federated learning exposes these devices to potential poisoning attacks, especially during the training phase. This paper presents a systematic study of the effect of data poisoning attacks against SVM classifiers in a federated setting (F-SVM). In particular, we implement two widely recognized data poisoning attacks against SVMs, the Label-Flipping and Optimal-Poisoning attacks, and evaluate their impact on global F-SVM accuracy using the MNIST, FashionMNIST, CIFAR-10, and IJCNN1 datasets. Our results reveal significant reductions in accuracy, highlighting the susceptibility of F-SVMs to such attacks: when 30% of the edge devices are compromised, accuracy drops by 15%, and when the fraction of compromised devices rises to 35%, accuracy drops by 32%. We also evaluate the impact of varying the ratio of poisoned points and of data that is not independently and identically distributed (non-IID) across edge devices. In addition, we investigate preliminary defense mechanisms against poisoning attacks on F-SVMs, assessing the efficacy of three popular unsupervised outlier detection methods: the k-nearest neighbor algorithm, histogram-based outlier detection, and copula-based outlier detection. All our source code is written in Python and is open source.
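
As a concrete illustration of the setting described above, the following sketch simulates one federated training round: edge devices fit local linear SVMs, compromised devices flip their training labels, and the server averages the received models. An unsupervised k-nearest-neighbor outlier detector from the PyOD library can optionally be used to discard suspicious client updates before aggregation. This is not the authors' released code (see Code Availability); the synthetic data, the number of devices, the use of SGDClassifier with hinge loss as the linear SVM, and the choice to apply the detector to client updates at the server are illustrative assumptions rather than the paper's exact protocol.

    import numpy as np
    from pyod.models.knn import KNN                        # unsupervised KNN outlier detector
    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier         # hinge loss ~ linear SVM

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=6000, n_features=20, random_state=0)
    clients = np.array_split(rng.permutation(len(y)), 10)  # 10 edge devices
    malicious = set(range(3))                              # 30% of devices compromised

    def local_update(Xc, yc, flip=False):
        """Train a local linear SVM; a compromised device flips its binary labels."""
        if flip:
            yc = 1 - yc                                    # label-flipping attack
        svm = SGDClassifier(loss="hinge", alpha=1e-4, random_state=0).fit(Xc, yc)
        return np.concatenate([svm.coef_.ravel(), svm.intercept_])

    def aggregate(updates, defend=False):
        """FedAvg-style mean of client weight vectors, optionally KNN-filtered."""
        W = np.vstack(updates)
        if defend:
            detector = KNN(n_neighbors=3, contamination=0.3).fit(W)
            W = W[detector.labels_ == 0]                   # keep only inlier updates
        return W.mean(axis=0)

    def global_accuracy(w):
        """Accuracy of the aggregated linear model w = [coefficients..., intercept]."""
        pred = (X @ w[:-1] + w[-1] > 0).astype(int)
        return float(np.mean(pred == y))

    updates = [local_update(X[idx], y[idx], flip=(i in malicious))
               for i, idx in enumerate(clients)]
    for defend in (False, True):
        acc = global_accuracy(aggregate(updates, defend))
        print(f"defence={defend}: global accuracy = {acc:.3f}")

The histogram-based (pyod.models.hbos.HBOS) and copula-based (pyod.models.copod.COPOD) detectors mentioned in the abstract expose the same fit/labels_ interface and can be swapped in for the KNN detector in this sketch.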

Data availability

This study exclusively utilized datasets that are publicly available. All datasets analyzed during this research are accessible from their respective public domain sources. Full references to these datasets are provided within the article.

Notes

  1. https://github.com/pkse-searcher/fsvm-pois-attack-defense/.

References

  1. Anisetti M, Ardagna CA, Balestrucci A, et al. On the robustness of ensemble-based machine learning against data poisoning. Preprint arXiv:2209.14013; 2022.

  2. Anisetti M, Ardagna CA, Bena N, et al. Rethinking certification for trustworthy machine-learning-based applications. IEEE Internet Comput. 2023;27(6):22–8. https://doi.org/10.1109/MIC.2023.3322327.

  3. Bagdasaryan E, Veit A, Hua Y, et al. How to backdoor federated learning. In: International conference on artificial intelligence and statistics, PMLR; 2020. p. 2938–48.

  4. Barreno M, Nelson B, Joseph AD, et al. The security of machine learning. Mach Learn. 2010;81(2):121–48.

  5. Bhagoji AN, Chakraborty S, Mittal P, et al. Analyzing federated learning through an adversarial lens. In: International conference on machine learning, PMLR; 2019. p. 634–43.

  6. Biggio B, Roli F. Wild patterns: ten years after the rise of adversarial machine learning. Pattern Recogn. 2018;84:317–31.

  7. Biggio B, Nelson B, Laskov P. Support vector machines under adversarial label noise. In: Asian conference on machine learning, PMLR; 2011. p. 97–112.

  8. Biggio B, Nelson B, Laskov P. Poisoning attacks against support vector machines. In: Proceedings of the 29th international conference on international conference on machine learning. Omnipress, Madison, WI, USA, ICML’12; 2012. p. 1467–74.

  9. Blanchard P, El Mhamdi EM, Guerraoui R, et al. Machine learning with adversaries: Byzantine tolerant gradient descent. Adv Neural Inf Process Syst. 2017;2017:30.

  10. Bovenzi G, Foggia A, Santella S, et al. Data poisoning attacks against autoencoder-based anomaly detection models: a robustness analysis. In: ICC 2022-IEEE international conference on communications; 2022. p. 5427–32. https://doi.org/10.1109/ICC45855.2022.9838942.

  11. Cao X, Fang M, Liu J, et al. FLTrust: Byzantine-robust federated learning via trust bootstrapping. In: 28th annual network and distributed system security symposium, NDSS 2021, virtually, February 21–25, 2021. The Internet Society. https://www.ndss-symposium.org/ndss-paper/fltrust-byzantine-robust-federated-learning-via-trust-bootstrapping/; 2021.

  12. Chen M, Yang Z, Saad W, et al. A joint learning and communications framework for federated learning over wireless networks. IEEE Trans Wirel Commun. 2020;20(1):269–83.

  13. Dalvi N, Domingos P, Sanghai S, et al. Adversarial classification. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining; 2004. p. 99–108.

  14. Demontis A, Melis M, Pintor M, et al. Why do adversarial attacks transfer? Explaining transferability of evasion and poisoning attacks. In: 28th USENIX security symposium (USENIX Security 19); 2019. p. 321–38.

  15. Ding H, Yang F, Huang J. Defending SVMs against poisoning attacks: the hardness and DBSCAN approach. In: de Campos C, Maathuis MH, editors. Proceedings of the thirty-seventh conference on uncertainty in artificial intelligence, proceedings of machine learning research, PMLR, vol. 161; 2021. p. 268–78. https://proceedings.mlr.press/v161/ding21b.html.

  16. Doku R, Rawat DB. Mitigating data poisoning attacks on a federated learning-edge computing network. In: 2021 IEEE 18th annual consumer communications and networking conference (CCNC). IEEE; 2021. p. 1–6.

  17. Fang M, Cao X, Jia J, et al. Local model poisoning attacks to Byzantine-robust federated learning. In: 29th USENIX security symposium (USENIX Security 20); 2020. p. 1605–22.

  18. Faqeh R, Fetzer C, Hermanns H, et al. Towards dynamic dependable systems through evidence-based continuous certification. In: Margaria T, Steffen B, et al., editors. Leveraging applications of formal methods, verification and validation: engineering principles. Cham: Springer; 2020. p. 416–39.

  19. Gehr T, Mirman M, Drachsler-Cohen D, et al. AI2: safety and robustness certification of neural networks with abstract interpretation. In: 2018 IEEE symposium on security and privacy (SP). IEEE; 2018. p. 3–18.

  20. Goldstein M, Dengel A. Histogram-based outlier score (HBOS): a fast unsupervised anomaly detection algorithm. In: KI-2012: poster and demo track; 2012. p. 59–63.

  21. Hsu RH, Wang YC, Fan CI, et al. A privacy-preserving federated learning system for android malware detection based on edge computing. In: 2020 15th Asia joint conference on information security (AsiaJCIS). IEEE; 2020. p. 128–36.

  22. Huang C, Huang J, Liu X. Cross-silo federated learning: challenges and opportunities. Preprint arXiv:2206.12949 [cs.LG]; 2022.

  23. Huang Y, Chu L, Zhou Z, et al. Personalized cross-silo federated learning on non-iid data. In: Proceedings of the AAAI conference on artificial intelligence, vol. 35(9); 2021. p. 7865–73. https://doi.org/10.1609/aaai.v35i9.16960. https://ojs.aaai.org/index.php/AAAI/article/view/16960.

  24. Mouri IJ, Ridowan M, Adnan MA. Towards poisoning of federated support vector machines with data poisoning attacks. In: Proceedings of the 13th international conference on cloud computing and services science-CLOSER, INSTICC. SciTePress; 2023. p. 24–33.

  25. Jagielski M, Oprea A, Biggio B, et al. Manipulating machine learning: Poisoning attacks and countermeasures for regression learning. In: 2018 IEEE symposium on security and privacy (SP). IEEE; 2018. p. 19–35.

  26. Kabir T, Adnan MA. A scalable algorithm for multi-class support vector machine on geo-distributed datasets. In: 2019 IEEE international conference on big data (big data). IEEE; 2019. p. 637–42.

  27. Karimireddy SP, Jaggi M, Kale S, et al. Breaking the centralized barrier for cross-device federated learning. Adv Neural Inf Process Syst. 2021;34:28663–76.

  28. Konečný J, McMahan HB, Ramage D, et al. Federated optimization: distributed machine learning for on-device intelligence. CoRR; 2016.

  29. Krizhevsky A, Hinton G, et al. Learning multiple layers of features from tiny images; 2009.

  30. Laishram R, Phoha VV. Curie: a method for protecting SVM classifier from poisoning attack. Preprint arXiv:1606.01584; 2016.

  31. LeCun Y. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/; 1998.

  32. Li Z, Zhao Y, Botta N, et al. COPOD: copula-based outlier detection. In: 2020 IEEE international conference on data mining (ICDM). IEEE; 2020. p. 1118–23.

  33. Manoharan P, Walia R, Iwendi C, et al. SVM-based generative adversarial networks for federated learning and edge computing attack model and outpoising. Expert Syst. 2022. https://doi.org/10.1111/exsy.13072.

  34. McMahan B, Moore E, Ramage D, et al. Communication-efficient learning of deep networks from decentralized data. In: Singh A, Zhu J, editors. Proceedings of the 20th international conference on artificial intelligence and statistics, proceedings of machine learning research, PMLR, vol. 54; 2017. p. 1273–82. https://proceedings.mlr.press/v54/mcmahan17a.html.

  35. Mei S, Zhu X. Using machine teaching to identify optimal training-set attacks on machine learners. In: Twenty-ninth AAAI conference on artificial intelligence; 2015.

  36. Melis M, Demontis A, Pintor M, et al. SECML: a Python library for secure and explainable machine learning. Preprint arXiv:1912.10013; 2019.

  37. Muñoz-González L, Biggio B, Demontis A, et al. Towards poisoning of deep learning algorithms with back-gradient optimization. In: Proceedings of the 10th ACM workshop on artificial intelligence and security; 2017. p. 27–38.

  38. Nair DG, Aswartha Narayana CV, Jaideep Reddy K, et al. Exploring SVM for federated machine learning applications. In: Rout RR, Ghosh SK, Jana PK, et al., editors. Advances in distributed computing and machine learning. Singapore: Springer; 2022. p. 295–305.

  39. Navia-Vázquez A, Díaz-Morales R, Fernández-Díaz M. Budget distributed support vector machine for non-id federated learning scenarios. ACM Trans Intell Syst Technol (TIST). 2022;13(6):1–25.

  40. Paudice A, Muñoz-González L, Gyorgy A, et al. Detection of adversarial training examples in poisoning attacks through anomaly detection. Preprint arXiv:1802.03041; 2018.

  41. Paudice A, Muñoz-González L, Lupu EC. Label sanitization against label flipping poisoning attacks. In: Alzate C, Monreale A, Assem H, et al., editors. ECML PKDD 2018 workshops. Cham: Springer; 2019. p. 5–15.

  42. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–30.

  43. Peri N, Gupta N, Huang WR, et al. Deep k-NN defense against clean-label data poisoning attacks. In: Bartoli A, Fusiello A, et al., editors. Computer vision-ECCV 2020 workshops. Cham: Springer; 2020. p. 55–70.

  44. Pitropakis N, Panaousis E, Giannetsos T, et al. A taxonomy and survey of attacks against machine learning. Comput Sci Rev. 2019;34:100199. https://doi.org/10.1016/j.cosrev.2019.100199. www.sciencedirect.com/science/article/pii/S1574013718303289.

  45. Prokhorov D. IJCNN 2001 neural network competition. Slide Present IJCNN. 2001;1(97):38.

  46. Radford BJ, Apolonio LM, Trias AJ, et al. Network traffic anomaly detection using recurrent neural networks. Preprint arXiv:1803.10769; 2018.

  47. Ramaswamy S, Rastogi R, Shim K. Efficient algorithms for mining outliers from large data sets. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data; 2000. p. 427–38.

  48. Rehman MHu, Dirir AM, Salah K, et al. TrustFed: a framework for fair and trustworthy cross-device federated learning in IIoT. IEEE Trans Ind Inform. 2021;17(12):8485–94. https://doi.org/10.1109/TII.2021.3075706.

  49. Shejwalkar V, Houmansadr A, Kairouz P, et al. Back to the drawing board: a critical evaluation of poisoning attacks on production federated learning. In: IEEE symposium on security and privacy; 2022.

  50. Steinhardt J, Koh PW, Liang P. Certified defenses for data poisoning attacks. In: Proceedings of the 31st international conference on neural information processing systems; 2017. p. 3520–32.

  51. Sun G, Cong Y, Dong J, et al. Data poisoning attacks on federated machine learning. IEEE Internet Things J. 2021;2021:1.

  52. Tolpegin V, Truex S, Gursoy ME, et al. Data poisoning attacks against federated learning systems. In: European symposium on research in computer security. London: Springer; 2020. p. 480–501.

  53. Wang S, Chen M, Saad W, et al. Federated learning for energy-efficient task computing in wireless networks. In: ICC 2020-2020 IEEE international conference on communications (ICC). IEEE; 2020. p. 1–6.

  54. Xiao H, Biggio B, Brown G, et al. Is feature selection secure against training data poisoning? In: International conference on machine learning, PMLR; 2015. p. 1689–98.

  55. Xiao H, Rasul K, Vollgraf R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint arXiv:1708.07747; 2017.

  56. Yin D, Chen Y, Kannan R, et al. Byzantine-robust distributed learning: towards optimal statistical rates. In: International conference on machine learning, PMLR; 2018. p. 5650–9.

  57. Zhang R, Zhu Q. A game-theoretic defense against data poisoning attacks in distributed support vector machines. In: 2017 IEEE 56th annual conference on decision and control (CDC). IEEE; 2017. p. 4582–7.

  58. Zhao Y, Nasrullah Z, Li Z. PyOD: a Python toolbox for scalable outlier detection. J Mach Learn Res. 2019;20(96):1–7. http://jmlr.org/papers/v20/19-011.html.

  59. Zhou Y, Kantarcioglu M, Thuraisingham B, et al. Adversarial support vector machine learning. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining; 2012. p. 1059–67.

  60. Zhu Y, Cui L, Ding Z, et al. Black box attack and network intrusion detection using machine learning for malicious traffic. Comput Secur. 2022;123:102922.

Acknowledgements

This research work is carried out under the RISE (Research and Innovation Centre for Science and Engineering) Internal Research Grant from Bangladesh University of Engineering and Technology (BUET).

Funding

This study has been carried out in the context of the project titled “Securing Federated Learning from Poisoning Attacks” with Application ID: 2022-01-019, awarded under the RISE (Research and Innovation Centre for Science and Engineering) Internal Research Grant (Call-ID: 2022-01).

Author information

Corresponding author

Correspondence to Israt Jahan Mouri.

Ethics declarations

Conflict of Interest

On behalf of all the authors, the corresponding author states that there is no conflict of interest.

Code Availability

Our code is open source and available at the following URL: https://github.com/pkse-searcher/fsvm-pois-attack-defense/.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Recent Trends on Cloud Computing and Services Science” guest edited by Claus Pahl and Maarten van Steen.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mouri, I.J., Ridowan, M. & Adnan, M.A. Data Poisoning Attacks and Mitigation Strategies on Federated Support Vector Machines. SN COMPUT. SCI. 5, 241 (2024). https://doi.org/10.1007/s42979-023-02556-9

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s42979-023-02556-9

Keywords
