Application of machine learning ensemble models for rainfall prediction

  • Research Article - Hydrology
Acta Geophysica

Abstract

Rainfall information supports long-term water resources management planning, flood mitigation, and the design of irrigation systems. Because physical modeling requires large amounts of high-resolution data, this study proposes a new standalone sequential minimal optimization (SMO) regression model and develops ensembles of it using Dagging (DA), random committee (RC), and additive regression (AR) models (i.e., DA-SMO, RC-SMO, and AR-SMO) for rainfall prediction. First, 30 years of monthly data (1988 to 2018) were acquired from a synoptic station in Kermanshah, Iran, with evaporation, maximum and minimum temperature, maximum and minimum relative humidity, sunshine hours, and wind speed as inputs and rainfall as the output. Next, different input scenarios were formed based on the Pearson correlation coefficient (r-value) between the input and output variables. Then, the dataset was divided into three subsets: development (1988–2008), calibration (2009–2010), and validation (2011–2018). Finally, the performance of the developed algorithms was evaluated using visual (scatterplot and boxplot) and quantitative (percent bias, root mean square error, Nash–Sutcliffe efficiency, and mean absolute error) metrics. The results revealed that minimum relative humidity had the greatest effect on rainfall prediction. The most effective input scenario included all input variables except wind speed. The DA-SMO ensemble algorithm outperformed the other algorithms.
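The four quantitative metrics named in the abstract can be sketched as follows. This is a minimal illustration using the commonly cited definitions, not the authors' code: the function names, the sign convention for percent bias (observed minus simulated), and the sample rainfall values are all assumptions made for this example.

```python
import math

def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance

def pbias(obs, sim):
    """Percent bias; assumed convention: positive means underestimation."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

# Hypothetical monthly rainfall values in mm (not data from the study)
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 37.0]

for name, fn in [("RMSE", rmse), ("MAE", mae), ("NSE", nse), ("PBIAS", pbias)]:
    print(f"{name}: {fn(observed, simulated):.3f}")
```

Lower RMSE and MAE, an NSE closer to 1, and a percent bias closer to 0 indicate better agreement under these definitions.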

Availability of data and materials

Available from the corresponding author upon reasonable request.

Funding

The authors did not receive any funding for this paper.

Author information

Contributions

HSS was involved in conceptualization, methodology, software, and writing the original draft. HA contributed to supervision, review, and editing. BA was involved in supervision, review, and editing.

Corresponding author

Correspondence to Hasan Ahmadi.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest associated with this research or manuscript.

Additional information

Edited by Dr. Ankit Garg (ASSOCIATE EDITOR) / Dr. Michael Nones (CO-EDITOR-IN-CHIEF).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ahmadi, H., Aminnejad, B. & Sabatsany, H. Application of machine learning ensemble models for rainfall prediction. Acta Geophys. 71, 1775–1786 (2023). https://doi.org/10.1007/s11600-022-00952-y

  • DOI: https://doi.org/10.1007/s11600-022-00952-y
