Identification of potential causal variables for statistical downscaling models: effectiveness of graphical modeling approach

Dutta, Riya; Maity, Rajib

doi:10.1007/s00704-020-03372-4

Original Paper
Published: 12 September 2020

Volume 142, pages 1255–1269, (2020)
Theoretical and Applied Climatology

Abstract

Selection of potential causal variables (PCVs) from a pool of many possibly associated variables is a critical issue since it can significantly affect the performance of any statistical downscaling model. Generally, the variable to be downscaled is associated with many other hydrologic and climatic (i.e., hydroclimatic) variables. Most of the existing approaches, such as correlation analysis (CA), partial correlation analysis (PaCA), and stepwise regression analysis (SRA), rely mostly on mutual association for the selection of PCVs. However, none of these approaches investigates the detailed dependence structure, which may help in eliminating unwanted information and efficiently selecting the PCVs for downscaling the target variable. In this study, the effectiveness of the graphical modeling (GM) approach is explored for the selection of the PCVs, as GM can effectively identify the detailed conditional independence structure among all the associated variables. For demonstration, downscaling of monthly precipitation is undertaken using the PCVs identified by CA, PaCA, SRA, and the proposed GM approach. Two different downscaling models, namely the statistical downscaling model (SDSM) and a support vector regression (SVR)–based downscaling model, are utilized. The results show that the PCVs identified through the proposed GM approach provide consistent as well as robust performance across different regions and seasons, owing to the ability of GM to capture the complete conditional independence structure among the variables. The downscaled monthly precipitation obtained using the proposed approach matches the observed data better in terms of the mean, the variance, and the probability distribution. Overall, this study recommends the GM approach for the identification of the PCVs for downscaling models.

Funding

This work was partially supported by the Department of Science and Technology, Climate Change Programme (SPLICE), Government of India (Ref No. DST/CCP/CoE/79/2017(G)), through a sponsored project.

Author information

Authors and Affiliations

Department of Civil Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, 721302, India
Riya Dutta & Rajib Maity

Corresponding author

Correspondence to Rajib Maity.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1. Mathematical details of correlation analysis, partial correlation analysis, and stepwise regression analysis

Appendix 1.1. Correlation analysis

Correlation analysis (CA) is the most commonly used approach for the selection of PCVs. A strong correlation between a candidate variable, drawn from a pool of possibly associated hydroclimatic variables, and the target variable is the most basic criterion for selection. In this approach, the selection is governed by the correlation coefficient between each associated variable and the target variable to be downscaled. A certain value of the correlation coefficient is taken as the threshold, and all associated variables with an equal or higher correlation are considered PCVs for downscaling. Pearson’s correlation coefficient is used in this study and can be expressed as follows,

$$ {r}_{xy}=\frac{\sum \limits_{i=1}^n\left({x}_i-\overline{x}\right)\left({y}_i-\overline{y}\right)}{\sqrt{\sum \limits_{i=1}^n{\left({x}_i-\overline{x}\right)}^2\sum \limits_{i=1}^n{\left({y}_i-\overline{y}\right)}^2}} $$

(5)

where r_xy is Pearson’s correlation coefficient between an associated variable (X) and the predictand (Y), n is the number of observations, x_i and y_i are the observations of X and Y respectively, and $ \overline{x} $ and $ \overline{y} $ are the means of X and Y respectively. The p value is evaluated at the 95% confidence level, considering the test statistic of the correlation coefficient to follow a t distribution with n − 2 degrees of freedom. The associated variables with a p value less than 0.05 are recommended for selection as the PCVs of the statistical downscaling model.
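As a minimal sketch of the screening rule in Eq. (5), the snippet below computes Pearson’s r from its definition and a two-sided p value from a t distribution with n − 2 degrees of freedom. The function name `pearson_screen`, the significance level `alpha`, and the synthetic data are illustrative assumptions and are not taken from the study.

```python
import numpy as np
from scipy import stats


def pearson_screen(x, y, alpha=0.05):
    """Return (r, p, keep) for one candidate variable x and predictand y.

    Hypothetical helper: r follows Eq. (5); p is the two-sided p value of
    t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    dx = x - x.mean()
    dy = y - y.mean()
    r = np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))
    t = r * np.sqrt((n - 2) / (1.0 - r**2))
    p = 2.0 * stats.t.sf(abs(t), df=n - 2)
    return r, p, p < alpha


# Synthetic data (assumption): y depends linearly on x plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=60)
y = 0.8 * x + rng.normal(scale=0.5, size=60)

r, p, keep = pearson_screen(x, y)
print(f"r = {r:.3f}, p = {p:.3g}, selected = {keep}")
```

The same r and p can be obtained from `scipy.stats.pearsonr`; the explicit form is shown only to mirror the equation term by term.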

Appendix 1.2. Partial correlation analysis

Partial correlation is a measure of association between two variables (a particular associated variable and the target variable) while controlling for the effect of the other associated variables. Partial correlation analysis (PaCA) can be used to identify the PCVs for downscaling, as it adjusts for the effect of the other associated variables. The partial correlation coefficient between two variables, controlling for a third variable, can be expressed as follows,

$$ {r}_{xy,z}=\frac{r_{xy}-{r}_{xz}{r}_{yz}}{\sqrt{\left(1-{r}_{xz}^2\right)\left(1-{r}_{yz}^2\right)}} $$

(6)

where r_{xy, z} is the partial correlation between two variables X and Y when the third variable Z is controlled, and r_xy, r_xz, and r_yz are the correlation coefficients between X and Y, X and Z, and Y and Z respectively. The p value is evaluated at the 95% confidence level, considering the test statistic of the partial correlation coefficient to follow a t distribution with n − 3 degrees of freedom. The associated variables with a p value less than 0.05 are recommended for selection as the PCVs of the statistical downscaling model.
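The first-order partial correlation of Eq. (6) can be sketched in the same way. In the illustrative example below, a common driver Z induces a strong marginal correlation between X and Y that largely disappears once Z is controlled. The function name `partial_corr` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np
from scipy import stats


def partial_corr(x, y, z):
    """Return (r_xy_z, p) for X and Y with a single control variable Z.

    Hypothetical helper: r_xy_z follows Eq. (6); p is the two-sided p value
    of t = r * sqrt((n - 3) / (1 - r^2)) with n - 3 degrees of freedom.
    """
    n = len(x)
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    r = (r_xy - r_xz * r_yz) / np.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))
    t = r * np.sqrt((n - 3) / (1.0 - r**2))
    p = 2.0 * stats.t.sf(abs(t), df=n - 3)
    return r, p


# Synthetic data (assumption): Z drives both X and Y; X has no direct link to Y.
rng = np.random.default_rng(1)
z = rng.normal(size=200)
x = z + 0.5 * rng.normal(size=200)
y = z + 0.5 * rng.normal(size=200)

r_marginal = np.corrcoef(x, y)[0, 1]
r_partial, p_partial = partial_corr(x, y, z)
print(f"marginal r = {r_marginal:.3f}, partial r = {r_partial:.3f}, p = {p_partial:.3g}")
```

With more than one control variable, Eq. (6) would have to be applied recursively or replaced by a residual-based or precision-matrix computation; the sketch covers only the single-control case stated in the text.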

Appendix 1.3. Stepwise regression analysis

Stepwise regression analysis (SRA) is a method of fitting a regression model by stepwise removal of the least significant variables until all the remaining variables are significant. This method is often used for the selection of PCVs when a large number of associated variables are available, and to deal with issues related to multicollinearity. In this technique, all the causal variables are initially considered in the model. At each step of the analysis, a variable is included in or excluded from the model, usually on the basis of a partial F-test. If F is greater than the critical F value, the causal variable can be included in the equation. The partial F statistic can be expressed as follows,

$$ F=\frac{\left({R}_q^2-{R}_{q-1}^2\right)\left(n-q-1\right)}{\left(1-{R}_q^2\right)} $$

(7)

where R_q is the multiple correlation coefficient between the criterion variable and the prediction equation with q causal variables (R_{q−1} is the same quantity with the variable under test left out), q is the number of causal variables in the equation, and n is as defined before. If the test statistic is less than the critical F value at the 95% confidence level with 1 and n − q − 1 degrees of freedom, the causal variable should be excluded from the equation. The causal variables with a p value less than 0.05 are recommended for selection as the PCVs of the statistical downscaling model.
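The partial F-test of Eq. (7) can be illustrated with ordinary least squares fits of the full model (q variables) and the reduced model (q − 1 variables). This is a sketch of a single test step, not of the complete stepwise procedure used in the study; the helper names `r_squared` and `partial_f` and the synthetic data are assumptions.

```python
import numpy as np
from scipy import stats


def r_squared(X, y):
    """Coefficient of determination of an OLS fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def partial_f(X_reduced, X_full, y, alpha=0.05):
    """Return (F, F_crit, p) for the one variable in X_full missing from X_reduced.

    Hypothetical helper: F follows Eq. (7), with 1 and n - q - 1 degrees of freedom.
    """
    n = len(y)
    q = X_full.shape[1]
    r2_q = r_squared(X_full, y)
    r2_q_minus_1 = r_squared(X_reduced, y)
    F = (r2_q - r2_q_minus_1) * (n - q - 1) / (1.0 - r2_q)
    F_crit = stats.f.ppf(1.0 - alpha, 1, n - q - 1)
    p = stats.f.sf(F, 1, n - q - 1)
    return F, F_crit, p


# Synthetic data (assumption): y depends on x1 only; x2 is unrelated noise.
rng = np.random.default_rng(2)
n = 100
X = rng.normal(size=(n, 2))
y = 2.0 * X[:, 0] + rng.normal(size=n)

F_x1, F_crit, p_x1 = partial_f(X[:, [1]], X, y)  # contribution of x1
F_x2, _, p_x2 = partial_f(X[:, [0]], X, y)       # contribution of x2
print(f"x1: F = {F_x1:.2f}, p = {p_x1:.3g}; x2: F = {F_x2:.2f}, p = {p_x2:.3g}; F_crit = {F_crit:.2f}")
```

Comparing F with the critical value and comparing p with 0.05 are equivalent decisions, which is why the text can state the rule either way.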

About this article

Cite this article

Dutta, R., Maity, R. Identification of potential causal variables for statistical downscaling models: effectiveness of graphical modeling approach. Theor Appl Climatol 142, 1255–1269 (2020). https://doi.org/10.1007/s00704-020-03372-4

Received: 03 August 2018
Accepted: 03 September 2020
Published: 12 September 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s00704-020-03372-4
