Abstract
Groundwater is among the most essential renewable resources for life on Earth. Assessing water quality is critical for ecosystem stability and conservation, and overall water quality has a significant effect on human health and environmental preservation. Water has numerous uses, including industrial, agricultural, and domestic consumption. The water quality index (WQI) is an essential metric for assessing the effectiveness of water management. Water quality, characterized by biological, physical, and chemical features, determines whether water is suitable for a specific application. Water quality analysis has become a major concern because of industrialization, farming practices, and human behavior. Water quality has traditionally been examined using expensive laboratory facilities and numerical procedures, which makes real-time monitoring impractical. Poor groundwater quality therefore calls for a more feasible and affordable solution. Machine learning-based classification appears promising for the rapid identification and estimation of water quality, and machine learning algorithms have been used effectively to predict it. Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without being explicitly programmed. The major benefit of ML models is that, once an algorithm has learned from data, it can perform its function independently. This work comprehensively examines three major machine learning techniques: Decision Tree, Regression Model, and Support Vector Machine. Features including total coliform, electrical conductivity, biological oxygen demand, pH, dissolved oxygen, and nitrate determine the water quality.
In this paper, many prior studies that employed machine learning techniques to determine water quality in diverse regions are examined. A comparison of past research involving these algorithms, their assessment methodologies, and the outcomes obtained is presented. We performed a thorough analysis of state-of-the-art ML algorithms used to predict groundwater quality. As part of our methodology, we analysed a wide range of studies and examined the use of conventional and state-of-the-art ML techniques, pre-processing techniques, feature selection techniques, and data augmentation methods. The findings of this study will help with groundwater development planning and will advance machine learning applications for improving groundwater quality. Our analysis demonstrates the adaptability of ML techniques in predicting groundwater quality. We found that ML models, such as deep learning, ensemble approaches, neural networks, support vector machines, and linear regression, have been successfully used to predict groundwater quality, identify the sources of contamination, and optimise remediation techniques. We also highlight how important data availability and quality are to model success.
Data Availability
All relevant data and material are presented in the main paper.
Acknowledgements
The authors are grateful to the Department of Chemical Engineering, School of Energy Technology, Pandit Deendayal Energy University, for permission to publish this research.
Funding
Not Applicable.
Contributions
All authors made substantial contributions to this manuscript. HP, KJ, and MS participated in drafting the manuscript and wrote the main manuscript. All authors discussed the results and implications of the manuscript at all stages.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Ethical Approval
Not applicable.
Informed Consent
Not applicable.
Consent for Publication
Not applicable.
About this article
Cite this article
Pandya, H., Jaiswal, K. & Shah, M. A Comprehensive Review of Machine Learning Algorithms and Its Application in Groundwater Quality Prediction. Arch Computat Methods Eng (2024). https://doi.org/10.1007/s11831-024-10126-2
DOI: https://doi.org/10.1007/s11831-024-10126-2