Detecting Web Application DAST Attacks in Large-Scale Event Data

Artificial Intelligence for Security

Abstract

This chapter proposes data-centric machine learning to protect web applications from dynamic application security testing (DAST) attacks. DAST scanning consists of automated penetration testing against web applications to find exploitable vulnerabilities. Malicious actors often use such scans in a brute-force manner for attack reconnaissance, with a view to eventual compromise. Traditionally, defensive cybersecurity systems have used threshold-based methods to detect such malicious events and behaviour. Thresholding has inherent challenges, however, not least the arguably arbitrary and brittle nature of selecting and applying a threshold in a production environment. Given these drawbacks, we present a machine learning method that uses random forests and aggregated event data, collected from our proprietary web application firewall, to detect DAST reconnaissance attacks. Using a dataset of over 40 million real-world events, we demonstrate that our method detects DAST attacks effectively, achieving an F1 score of 0.94 with a low miss rate of 6%. This approach provides important insights into the development of accurate and reliable detection systems that minimise manual tuning, which is essential in safeguarding against evolving cyber threats.
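
As a rough illustration of two ideas named in the abstract, the sketch below aggregates raw events into per-client features and computes the F1 score and miss rate from binary labels. It is not the chapter's implementation: the event tuple layout (client, path, status), the three features, and both function names are hypothetical, and the miss rate is assumed here to mean the false negative rate, FN / (TP + FN). The random forest classifier itself is omitted.

```python
from collections import defaultdict


def aggregate_events(events):
    """Group raw events by client and derive simple per-client features.

    Each event is assumed to be a (client, path, status) tuple; this
    layout is hypothetical and not taken from the chapter.
    """
    stats = defaultdict(lambda: {"requests": 0, "errors": 0, "paths": set()})
    for client, path, status in events:
        entry = stats[client]
        entry["requests"] += 1
        if status >= 400:  # count client and server error responses
            entry["errors"] += 1
        entry["paths"].add(path)
    return {
        client: {
            "requests": entry["requests"],
            "errors": entry["errors"],
            "distinct_paths": len(entry["paths"]),
        }
        for client, entry in stats.items()
    }


def f1_and_miss_rate(y_true, y_pred):
    """Return (F1, miss rate) for binary labels, where 1 marks an attack.

    The miss rate is assumed to be the false negative rate.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    miss_rate = fn / (tp + fn) if (tp + fn) else 0.0
    return f1, miss_rate
```

In a pipeline of this shape, the aggregated per-client rows would be the input to a classifier such as a random forest, and the two metrics would be computed on held-out labelled clients.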

Corresponding author

Correspondence to Pojan Shahrivar.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Shahrivar, P., Millar, S. (2024). Detecting Web Application DAST Attacks in Large-Scale Event Data. In: Sipola, T., Alatalo, J., Wolfmayr, M., Kokkonen, T. (eds) Artificial Intelligence for Security. Springer, Cham. https://doi.org/10.1007/978-3-031-57452-8_14

  • DOI: https://doi.org/10.1007/978-3-031-57452-8_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-57451-1

  • Online ISBN: 978-3-031-57452-8

  • eBook Packages: Computer Science, Computer Science (R0)
