MIM: A multiple integration model for intrusion detection on imbalanced samples

World Wide Web

Abstract

Normal samples typically far outnumber malicious samples, which makes network security data imbalanced. When dealing with imbalanced samples, a classification model requires careful sampling and attribute selection to counter its bias towards the majority classes. Simple data sampling methods and incomplete feature selection techniques cannot improve the accuracy of intrusion detection models. In addition, a single intrusion detection model cannot accurately classify all attack types in the face of massive imbalanced security data. Moreover, existing model integration methods based on stacking or voting suffer from high coupling, which undermines their stability and reliability. To address these issues, we propose a Multiple Integration Model (MIM) to implement feature selection and attack classification. First, MIM uses random Oversampling, random Undersampling and Washing Methods (OUWM) to reconstruct the data. Then, a modified simulated annealing algorithm is employed to generate candidate features. Finally, an integrated model based on Light Gradient Boosting Machine (LightGBM), eXtreme Gradient Boosting (XGBoost) and gradient Boosting with Categorical features support (CatBoost) is designed to achieve intrusion detection and attack classification. MIM leverages a Rule-based and Priority-based Ensemble Strategy (RPES) to combine the high accuracy of the first with the high effectiveness of the latter two, improving the stability and reliability of the integrated model. We evaluate our approach on two publicly available intrusion detection datasets: one created by researchers from the University of New Brunswick and another collected by the Australian Center for Cyber Security. In our experiments, MIM significantly outperforms several existing intrusion detection models in terms of accuracy. Specifically, compared with two recently proposed methods, namely the reinforcement learning method based on the adaptive sample distribution dual-experience replay pool mechanism (ASD2ER) and the method that combines Auto Encoder, Principal Component Analysis, and Long Short-Term Memory (AE+PCA+LSTM), MIM improves intrusion detection accuracy by 1.35% and 1.16%, respectively.
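The abstract names the stages of MIM but does not give their details, so the following is only a minimal sketch of two of them under stated assumptions. `resample_to_target` illustrates plain random oversampling and undersampling towards a common per-class count; the washing step of OUWM is not described here and is omitted. `priority_vote` illustrates one possible rule-plus-priority combination of three classifiers' labels (majority first, then a fixed model priority on ties); it is a hypothetical stand-in, not the RPES defined in the paper. The function names, the `target` parameter, and the priority order are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def resample_to_target(samples, labels, target, seed=0):
    """Randomly resample every class to exactly `target` samples.

    Classes larger than `target` are undersampled without replacement;
    smaller classes keep all samples and are topped up with random copies.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    out_x, out_y = [], []
    for y, xs in by_class.items():
        if len(xs) >= target:
            chosen = rng.sample(xs, target)  # undersample
        else:
            chosen = xs + rng.choices(xs, k=target - len(xs))  # oversample
        out_x.extend(chosen)
        out_y.extend([y] * target)
    return out_x, out_y


def priority_vote(predictions, priority):
    """Combine one label per model (hypothetical rule, not the paper's RPES).

    Rule: a label predicted by a strict majority of models wins.
    Priority: otherwise, the first model in `priority` whose label is among
    the tied labels decides.
    """
    counts = Counter(predictions.values())
    best = max(counts.values())
    winners = {label for label, c in counts.items() if c == best}
    if len(winners) == 1:
        return winners.pop()
    for name in priority:
        if predictions[name] in winners:
            return predictions[name]
    raise ValueError("priority list does not cover the tied models")


if __name__ == "__main__":
    X = list(range(12))
    y = ["normal"] * 10 + ["attack"] * 2
    rx, ry = resample_to_target(X, y, target=5)
    print(Counter(ry))

    order = ["lightgbm", "xgboost", "catboost"]  # assumed priority order
    print(priority_vote(
        {"lightgbm": "dos", "xgboost": "probe", "catboost": "normal"}, order))
```

In this sketch the model names are only dictionary keys; in practice the labels would come from trained LightGBM, XGBoost and CatBoost classifiers, and the priority order would be chosen from validation accuracy rather than fixed by hand.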

Availability of Data and Materials

All datasets utilized in this article were obtained from publicly available sources as indicated in references [43] and [44].

References

  1. Yan, J., Zhaoquan, G., Zhihao, J., Cuiyun, G., Jianye, Y.: Persistent graph stream summarization for real-time graph analytics. World Wide Web 26, 2647–2667 (2023)

  2. Uno, F., Jianxin, L., Naveed, A., Man, L., Yan, J.: GoMIC: Multi-view image clustering via self-supervised contrastive heterogeneous graph co-learning. World Wide Web 26, 1667–1683 (2023)

  3. Abhilash, S., Seyed, M.H.M., Jaiprakash, N.: F-TLBO-ID: Fuzzy fed teaching learning based optimisation algorithm to predict the number of k-barriers for intrusion detection. Appl. Soft Comput. 151, 111163 (2024)

  4. Bhawana, S., Lokesh, S., Chhagan, L., Satyabrata, R.: Explainable artificial intelligence for intrusion detection in IoT networks: A deep learning based approach. Expert Syst. Appl. 238, 121751 (2024)

  5. Zhiqiang, Z., Le, W., Guangyao, C., Zhaoquan, G., Zhihong, T., **, L., Jiezhou, H., Wuxia, Z., Tianyu, M., Zhaohui, T., Jean, P.N., Weihua, G.: ANID-SEoKELM: Adaptive network intrusion detection based on selective ensemble of kernel ELMs with random features. Knowl-Based Syst. 177(1), 104–116 (2019)

  6. Ying, Z., Thomas, M., Shahram, S.: M-AdaBoost-A Based Ensemble System for Network Intrusion Detection. Expert Syst. Appl. 162, 113864 (2020)

  7. Saikat, D., Mohammad, A., Frederick, T.S., Sajjan, S.: Network Intrusion Detection using Natural Language Processing and Ensemble Machine Learning. In: Proceed. 2020 IEEE Symp. Ser. Comput. Intell. (SSCI), 829–835 (2020)

  8. Enkhtur, T., Monowar, H.B., Yuzo, T., Doudou, F., Khishigjargal, G., Erik, E., Youki, K.: DeL-IoT: A Deep Ensemble Learning Approach to Uncover Anomalies in IoT. Internet of Things 14, 100391 (2021)

  9. Prabhat, K., Govind, G., Rakesh, T.: An ensemble learning and fog-cloud architecture-driven cyber-attack detection framework for IoMT networks. Comput. Commun. 166, 110–124 (2021)

  10. Mahbod, T., Ebrahim, B., Wei, L., Ali, A.G.: A detailed analysis of the KDD CUP 99 data set. In: Proceed. 2009 IEEE Symp. Comput. Intell. Sec. Def. Appl. 1–6 (2009)

  11. Nour, M., Jill, S.: UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: Proceed. 2015 Mil. Commun. Inform. Syst. Conf. (MilCIS), 1–6 (2015)

  12. Saharon, R., Aron, I.: KDD-cup 99: knowledge discovery in a charitable organization’s donor database. ACM SIGKDD Explor. Newsl. 1, 85–90 (2000)

  13. Hongyu, Y., Renyun, Z., Guangquan, X., Liang, Z.: A network security situation assessment method based on adversarial deep learning. Appl. Soft Comput. 102, 107096 (2021)

  14. Al, Y., Wathiq, L., Ali, K.I., Faezah, H.A.: Wrapper feature selection method based differential evolution and extreme learning machine for intrusion detection system. Patt. Recog. 132, 108912 (2022)

  15. Haonan, T., Le, W., Dong, Z., Jianyu, D.: Intrusion Detection Based on Adaptive Sample Distribution Dual-Experience Replay Reinforcement Learning. Math. 12(7), 948 (2024)

  16. Thakkar, A., Nandish, K., Rebakah, G.: Fusion of linear and non-linear dimensionality reduction techniques for feature reduction in LSTM-based Intrusion Detection System. Appl. Soft Comput. 154, 111378 (2024)

  17. Jianlei, G., Senchun, C., Baihai, Z., Yuanqing, X.: Research on Network Intrusion Detection Based on Incremental Extreme Learning Machine and Adaptive Principal Component Analysis. Energ. 12(7), 1207–1223 (2019)

  18. Earum, M., Aneela, Z., Muhammad, U., Asima, A.A.: A two-stage intrusion detection system with auto-encoder and LSTMs. Appl. Soft Comput. 121, 108768 (2022)

  19. Hooshmand, M.K., Doreswamy, H.: Network anomaly detection using deep learning techniques. CAAI Trans. Intell. Tech. 7(2), 228–243 (2022)

Funding

This work is supported by the Guangdong Basic and Applied Basic Research Foundation (2023A1515011698), the Major Key Project of PCL (PCL2022A03), the Guangdong High-level University Foundation Program (SL2022A03J00918), and the National Natural Science Foundation of China (Grant No. 62372137).

Author information

Contributions

Zhiqiang Zhang and Le Wang wrote the main manuscript text. Zhiqiang Zhang proposed the main technical ideas for the methods in the manuscript and conducted the experiments. Junyi Zhu and Dong Zhu performed research and analyzed data. Zhaoquan Gu and Yanchun Zhang prepared all figures and tables in the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Le Wang.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Z., Wang, L., Zhu, J. et al. MIM: A multiple integration model for intrusion detection on imbalanced samples. World Wide Web 27, 47 (2024). https://doi.org/10.1007/s11280-024-01285-0
