Prediction of groundwater nitrate concentration in a semiarid region using hybrid Bayesian artificial intelligence approaches

  • Research Article
Environmental Science and Pollution Research

Abstract

Nitrate is a major groundwater pollutant whose main sources are municipal wastewater and agricultural activities. In the present study, four Bayesian approaches, namely the Bayesian generalized linear model (BGLM), Bayesian regularized neural network (BRNN), Bayesian additive regression tree (BART), and Bayesian ridge regression (BRR), were used to model groundwater nitrate contamination in the semiarid Marvdasht watershed, Fars province, Iran. Eleven groundwater (GW) nitrate conditioning factors were used as input parameters for predictive modeling. The results showed that all of the Bayesian models used in this study were able to model groundwater nitrate, and that the BART model, with R² = 0.83, was more efficient than the other models. The variable importance analysis showed that potassium (K) has the highest importance in the models, followed by rainfall, altitude, groundwater depth, and distance from the residential area. The results of the study can support the decision-making process to control and reduce the sources of nitrate pollution.
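As an illustration of two ideas named in the abstract, the following minimal Python sketch computes the coefficient of determination (R²) used to compare the models and the closed-form posterior of a one-predictor Bayesian ridge regression. It is not the authors' code: the function names, the fixed prior and noise precisions, and the toy numbers are all assumptions made for illustration, and the study's own models were presumably fitted with dedicated statistical packages on the eleven conditioning factors.

```python
from typing import Sequence, Tuple


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def bayesian_ridge_1d(
    x: Sequence[float],
    y: Sequence[float],
    prior_precision: float = 1.0,   # assumed, fixed (not estimated from data)
    noise_precision: float = 1.0,   # assumed, fixed (not estimated from data)
) -> Tuple[float, float]:
    """Posterior mean and variance of the slope w in y = w * x + noise.

    Prior: w ~ N(0, 1 / prior_precision).
    Noise: N(0, 1 / noise_precision). No intercept, so x and y are
    expected to be centered beforehand.
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    post_precision = prior_precision + noise_precision * sxx
    post_mean = noise_precision * sxy / post_precision
    return post_mean, 1.0 / post_precision


# Synthetic toy data (hypothetical, not from the study).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.9]

slope_mean, slope_var = bayesian_ridge_1d(x, y)
predictions = [slope_mean * xi for xi in x]
print("posterior slope mean:", slope_mean)
print("posterior slope variance:", slope_var)
print("R^2 on the toy data:", r_squared(y, predictions))
```

The Gaussian prior shrinks the slope toward zero, which is the regularizing effect that gives ridge regression its name; a larger `prior_precision` shrinks it more. An R² of 1 means the predictions match the observations exactly, while 0 means they do no better than predicting the observed mean.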

Availability of data and materials

The data that support the findings of this study are available from the first author [Quoc Bao Pham, phambaoquoc@tdmu.edu.vn], upon reasonable request.

Author information

Contributions

Khalifa M. Alkindi, Kaustuv Mukherjee: conceptualization, writing—original draft, software, formal analysis, visualization. Manish Pandey, Aman Arora, Saeid Janizadeh: formal analysis, writing—original draft, visualization. Quoc Bao Pham, Kourosh Ahmadi: data curation, writing, review and editing. Duong Tran Anh: supervision, writing, review, editing.

Corresponding author

Correspondence to Duong Tran Anh.

Ethics declarations

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Responsible Editor: Marcus Schulz

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Alkindi, K.M., Mukherjee, K., Pandey, M. et al. Prediction of groundwater nitrate concentration in a semiarid region using hybrid Bayesian artificial intelligence approaches. Environ Sci Pollut Res 29, 20421–20436 (2022). https://doi.org/10.1007/s11356-021-17224-9

  • DOI: https://doi.org/10.1007/s11356-021-17224-9
