Abstract
A lack of climatic data, caused by the scarcity of meteorological stations, makes it difficult to assess the non-stationary climate-runoff processes driven by climate change in the high mountains of northwest China. To address this challenge, this study developed an integrated method that combines climate downscaling based on reanalysis products, improved complete ensemble empirical mode decomposition with adaptive noise, and four machine learning algorithms (radial basis function artificial neural network, random forests, support vector regression, and eXtreme gradient boosting tree) to simulate streamflow. We tested the method in seven mountainous basins originating in the high Tienshan Mountains, and comparison with the observed runoff at hydrological stations indicated good performance: the Nash–Sutcliffe efficiency coefficient is above 0.7, and the mean absolute error and root mean square error between the simulations and observations are below 3 × 10⁸ m³. This study confirms the importance of multi-scale modeling in improving the accuracy of runoff simulations. On the inter-decadal scale, runoff shows a more significant non-linear relationship with climatic factors than on the inter-annual and seasonal scales. This study also indicates that runoff changes are mainly controlled by temperature in the high mountains, while precipitation contributes more to runoff changes in the lowlands. This research can support hydrological forecasting and water resources management in data-scarce high mountains.
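The three evaluation metrics named in the abstract (Nash–Sutcliffe efficiency, mean absolute error, and root mean square error) can be illustrated with a minimal sketch. The function names and the sample runoff values below are hypothetical and chosen only for illustration; they are not taken from the study's code or data.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared
    simulation error to the squared deviation of the observations
    from their own mean. A value of 1 means a perfect fit."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

def mae(obs, sim):
    """Mean absolute error, in the same units as the data."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def rmse(obs, sim):
    """Root mean square error, in the same units as the data."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

# Hypothetical annual runoff series in units of 10^8 m^3 (illustration only).
observed = [10.0, 12.0, 14.0, 16.0]
simulated = [11.0, 12.0, 13.0, 16.0]

print("NSE :", nse(observed, simulated))
print("MAE :", mae(observed, simulated))
print("RMSE:", rmse(observed, simulated))
```

Under these definitions, a simulation satisfying the thresholds reported in the abstract would return an NSE above 0.7 and MAE and RMSE values below 3 when runoff is expressed in 10⁸ m³.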
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00477-022-02231-0/MediaObjects/477_2022_2231_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00477-022-02231-0/MediaObjects/477_2022_2231_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00477-022-02231-0/MediaObjects/477_2022_2231_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00477-022-02231-0/MediaObjects/477_2022_2231_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00477-022-02231-0/MediaObjects/477_2022_2231_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00477-022-02231-0/MediaObjects/477_2022_2231_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00477-022-02231-0/MediaObjects/477_2022_2231_Fig7_HTML.png)
Data availability
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.
Code availability
The code used and analyzed during the current study is available from the corresponding author on reasonable request.
Funding
This work was supported by the National Natural Science Foundation of China (Grant Nos. 41871025, U1903208, 41630859).
Contributions
The first author collected the data, designed the research outline, and wrote the manuscript. The first and second authors both contributed to data processing and analysis, and to the interpretation and discussion of results. The third and fourth authors provided overall supervision of the work, gave feedback, and reviewed and edited the draft manuscript.
Ethics declarations
Conflict of interest
All authors declare no conflict of interest regarding the submitted manuscript. We warrant that the article is the authors’ original work.
Consent to participate
Not applicable, as this study does not involve any human subjects.
Consent for publication
All authors give their full consent to publication of the submitted manuscript.
Ethical approval
Not applicable, as this study does not involve any human or animal data.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Fan, M., Xu, J., Chen, Y. et al. Simulating the climate driven runoff in data-scarce mountains by machine learning and downscaling reanalysis data. Stoch Environ Res Risk Assess 36, 3819–3834 (2022). https://doi.org/10.1007/s00477-022-02231-0
DOI: https://doi.org/10.1007/s00477-022-02231-0