Log in

Assessing the effects of time series on precipitation forecasting performance from complexity perspective

  • Research
  • Published:
Theoretical and Applied Climatology Aims and scope Submit manuscript

Abstract

Affected by many factors, the precipitation time series have complex characteristics, which increases the difficulty of precipitation prediction. To reveal the impact of complexity on prediction performance, we propose an impact assessment framework. In this framework, firstly, the complexity results of four commonly used entropy models are compared, and the suitable entropy for precipitation complexity is selected from the perspective of stability and reliability. Secondly, the typical machine learning models (multiple linear regression, support vector machine regression, and BP artificial neural network) are used to predict precipitation. Three regions (including 44 meteorological stations) were selected for the study. The results show that (1) the (sample entropy) SE is higher than (wavelet entropy) WE, (permutation entropy) PE and (Fuzzy entropy) FE from the reliability and stability of the results; (2) there are differences in precipitation predictability at different complexity levels. For the high level of complexity, the prediction performance R2 is even close to 0. In addition, the difficulty of precipitation predictability in Southeast China is much higher than that in the other two regions. The findings of this research can help forecasters in determining the difficulty of regional precipitation predictability.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others

Data availability

The data used in this study can be requested by the corresponding authors.

References

  • Abdullahi J, Iravanian A, Nourani V et al (2020) Application of artificial intelligence based and multiple regression techniques for monthly precipitation modeling in coastal and inland stations. Desalin Water Treat 177:338–349

    Article  Google Scholar 

  • Aksoy H, Dahamsheh A (2018) Markov chain-incorporated and synthetic data-supported conditional artificial neural network models for forecasting monthly precipitation in arid regions. J. Hydrol 562:758–779

    Article  Google Scholar 

  • Aryal A, Shrestha S, Babel MS (2019) Quantifying the sources of uncertainty in an ensemble of hydrological climate-impact projections. Theor Appl Climatol 135:193–209

    Article  Google Scholar 

  • Bandt C, Pompe B (2002) Permutation Entropy: A Natural Complexity Measure for Time Series. Phys Rev Lett 88(17):174102

    Article  Google Scholar 

  • Campbell EP, Palmer MJ (2010) Modeling and forecasting climate variables using a physical-statistical approach. J Geophys Res-Atmos 115(D10). https://doi.org/10.1029/2009JD012030

  • Chen CA, Hsu HH, Liang HC et al (2022) Future change in extreme precipitation in East Asian spring and Mei-Yu seasons in two high-resolution AGCMs. Weather Clim Extremes 35:100408

    Article  Google Scholar 

  • Cho E, Choi MH (2014) Regional scale Spatio-temporal variability of soil moisture and its relationship with meteorological factors over the Korean peninsula. J Hydrol 516:317–329

    Article  Google Scholar 

  • Choubin B, Khalighi S, Malekian A et al (2016) Multiple linear regression, multi-layer perceptron network and adaptive neuro-fuzzy inference system for the prediction of precipitation based on large-scale climate signals. Hydrolog Sci J 61(6):1001–1009

    Article  Google Scholar 

  • Dhanya CT, Villarini G (2017) On the inherent predictability of precipitation across the united states. Theor Appl Climatol 133:1035–1050

    Article  Google Scholar 

  • Du L, Li X, Yang M et al (2022) Assessment of spatiotemporal variability of precipitation using entropy indexes: a case study of Bei**g, China. Stoch Environ Res Risk Assess 36:939–953

    Article  Google Scholar 

  • Ebtehaj I, Bonakdari H, Zeynoddin M et al (2019) Evaluation of preprocessing techniques for improving the accuracy of stochastic rainfall forecast models. Int J Environ Sci Technol 17:505–524

    Article  Google Scholar 

  • Faiz MA, Liu D, Fu Q et al (2018) Complexity and trends analysis of hydrometeorological time series for a river streamflow: A case study of Songhua River Basin, China. River Res Appl 34(2):101–111

    Article  Google Scholar 

  • Fan XF, Miao CY, Duan QY (2021) Future Climate Change Hotspots Under Different 21st Century Warming Scenarios. Earth's Future 9(6):e2021EF002027

    Article  Google Scholar 

  • Goodwell AE, Jiang P, Ruddell BL et al (2020) Debates—Does information theory provide a new paradigm for Earth science? Causality, interaction, and feedback. Water Resour. Res 56(2):e2019WR024940

    Article  Google Scholar 

  • Goutam K, Shih-Chieh K, Scott P et al (2020) Machine learning assisted hybrid models can improve streamflow simulation in diverse catchments across the conterminous US. Environ Res Lett 15:10

    Google Scholar 

  • Jenks GF, Caspall FC (1971) Error on choroplethic maps: definition, measurement, reduction. Ann Am Assoc Geogr 61(2):217–244

    Article  Google Scholar 

  • Ji HP, Chen YN, Fang GH et al (2021) Adaptability of machine learning methods and hydrological models to discharge simulations in datasparse glaciated watersheds. J Arid Land 13(6):19

    Article  Google Scholar 

  • Kazamias AP, Sapountzis M, Lagouvardos K (2022) Evaluation of GPM-IMERG rainfall estimates at multiple temporal and spatial scales over Greece. Atmos Res 269:106014

    Article  Google Scholar 

  • Khatakho R, Talchabhadel R, Thapa BR (2021) Evaluation of different precipitation inputs on streamflow simulation in Himalayan river basin. J. Hydrol 599:126390

    Article  Google Scholar 

  • Kim T, Shin JY, Kim H et al (2020) Ensemble-Based Neural Network Modeling for Hydrologic Forecasts: Addressing Uncertainty in the Model Structure and Input Variable Selection. Water Resour. Res 56:6

    Article  Google Scholar 

  • Li JJ, He X, Tao L (2022) Assessing multiscale variability and teleconnections of monthly precipitation in Yangtze river basin based on multiscale information theory method. Theor Appl Climatol 147(1-2):717–735

    Article  Google Scholar 

  • Li YG, Wang DG, Wang GL et al (2021a) A hybrid deep learning algorithm and its application to streamflow prediction. J Hydrol 601:126636

    Article  Google Scholar 

  • Li YJ, Xu B, Wang D et al (2021b) Deterministic and probabilistic evaluation of raw and post-processing monthly precipitation forecasts: a case study of China. J Hydroinformatics 23:914–934

    Article  Google Scholar 

  • Liang Z, Li Y, Hu Y (2018) A data-driven SVR model for long-term runoff prediction and uncertainty analysis based on the Bayesian framework. Theor Appl Climatol 133:137–149

    Article  Google Scholar 

  • Liu D, Liu C, Fu Q et al (2017) ELM evaluation model of regional groundwater quality based on the crow search algorithm. Ecol Ind 81:302–314

    Article  Google Scholar 

  • Liu D, Yan T, Ji Y (2021) Novel method for measuring regional precipitation complexity characteristics based on multiscale permutation entropy combined with CMFO-PPTTE model. J. Hydrol 592:125801

    Article  Google Scholar 

  • Liu MX, Xu XL, Sun A (2015) Decreasing spatial variability in precipitation extremes in southwestern China and the local/large-scale influencing factors. J Geophys Res-Atmos 120:13

    Article  Google Scholar 

  • Luca AD, Termini S (1972) A defnition of a Nonprobabilistic entropy in the setting of fuzzy sets theory. Inf Control 20:301–312

    Article  Google Scholar 

  • Mehr AD, Nourani V, Khosrowshahi VK et al (2019) A hybrid support vector regression–firefly model for monthly rainfall forecasting. Int J Environ Sci Technol 16:335–346

    Article  Google Scholar 

  • Meng EH, Huang SZ, Huang Q et al (2021) A hybrid VMD-SVM model for practical streamflow prediction using an innovative input selection framework. Water Resour Manage 35:1321–1337

    Article  Google Scholar 

  • Pakdaman M, Babaeian I, Javanshiri Z et al (2022) European Multi Model Ensemble (EMME): A New Approach for Monthly Forecast of Precipitation. Water Resour Manage 36:611–623

    Article  Google Scholar 

  • Richman JS, Randall MJ (2000) Physiological time-series analysis using approximate entropy and sample entropy. Am J Physiol Heart Circ Physiol 278(6):H2039

    Article  Google Scholar 

  • Rosso OA, Blanco S, Yordanova J et al (2001) Wavelet entropy: a new tool for analysis of short duration brain electrical signals. J Neuro Methods 105(1):65–75

    Article  Google Scholar 

  • Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27(3):379–423

    Article  Google Scholar 

  • Singh VP (1997) The use of entropy in hydrology and water resources. Hydrol Process 11:587–626

    Article  Google Scholar 

  • Sun P, Liu R, Yao R et al (2023) Responses of agricultural drought to meteorological drought under different climatic zones and vegetation types. J. Hydrol 619:129305

    Article  Google Scholar 

  • Sun QH, Miao CY, Duan QY et al (2017) A review of global precipitation data sets: data sources, estimation, and intercomparisons. Rev Geophys 56:1

    Google Scholar 

  • Tang L, Lv HL, Yang FM et al (2015) Complexity testing techniques for time series data: a comprehensive literature review. Chaos Solit Fractals 81:117–135

    Article  Google Scholar 

  • Tao L, He XG, Li JJ et al (2021) A multiscale long short-term memory model with attention mechanism for improving monthly precipitation prediction. J. Hydrol 602(3):126815

    Article  Google Scholar 

  • Wu QS, Zuo QT, Han CH et al (2022) Integrated assessment of variation characteristics and driving forces in precipitation and temperature under climate change: A case study of Upper Yellow River basin, China. Atmos Res 272:106156

    Article  Google Scholar 

  • Xavier SFA, daSilvaJale J, Stosic T, dos Santos CAC et al (2019) An application of sample entropy to precipitation in Paraíba State. Brazil. Theor. Appl. Climatol 136:429–440

    Article  Google Scholar 

  • **ang Z, Yan J, Demir I (2020) A rainfall-runoff model with LSTM-based sequence-to-sequence learning. Water Resour. Res 56(1):e2019WR025326

    Article  Google Scholar 

  • Xu L, Chen NC, Chen ZQ et al (2021) Spatiotemporal forecasting in earth system science: methods, uncertainties, predictability and future directions. Earth Sci Rev 222:103828

    Article  Google Scholar 

  • Xu TF, Longyang QQ, Tyson CT et al (2022) Hybrid Physically Based and Deep Learning Modeling of a Snow Dominated, Mountainous. Karst Watershed Water Resour Res 58:3

    Google Scholar 

  • Yang X (2022) Construction and application of integrated entropy model for measuring precipitation complexity. Earth Sci Inform 15:1597–1606

    Article  Google Scholar 

  • Yang X (2023) Evaluation of spatial variation of water resources carrying capacity using optimal method: a case study of Fujian, China. Environ Sci Pollut Res 30:1048–1059

    Article  Google Scholar 

  • Yang X, Chen Z, Qin M (2023) Joint probability analysis of streamflow and sediment load based on hybrid copula. Environ Sci Pollut Res 30:46489–46502

    Article  Google Scholar 

  • Yang X, Chen ZH (2023) An integrated index developed for measuring precipitation complexity: a case study of **sha River basin, China. Environ Sci Pollut Res 30(19):54885–54898

    Article  Google Scholar 

  • Yu L, Pan Y, Wu Y (2008) Two new indicators to compare different evaluation methods’ effect based on times Higher-QS world university rankings. J Nan**g Normal Univ (Nat Ed) 31(3):135–140 ((in Chinese with English abstract))

    Google Scholar 

  • Zhang L, Li T, Liu D et al (2020) Spatial variability and possible cause analysis of regional precipitation complexity based on optimized sample entropy. Q J Roy Meteor Soc 146(732):3384–3398

    Article  Google Scholar 

  • Zhang SW, Wang H, Jiang H et al (2021) Studies of the seasonal prediction of heavy late spring rainfall over southeastern china. Clim Dyn 57:1919–1931

    Article  Google Scholar 

  • Zhu S, Xu Z, Luo X et al (2020) Assessing coincidence probability for extreme precipitation events in the **sha River basin. Theor Appl Climatol 139:825–835

    Article  Google Scholar 

Download references

Funding

The research is financially supported by the National Natural Science Foundation of China (Grant No. 51879291).

Author information

Authors and Affiliations

Authors

Contributions

XY: Conceptualization, Methodology, Writing; ZHC: Supervision, Writing–original draft.

Corresponding author

Correspondence to Zhihe Chen.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

1.1 Permutation entropy

The main idea of PE is that the patterns within the time series do not have similar probability of occurrence (Bandt and Pompe 2002). The calculation process is as follows:

Given a time series z(i) , the phase space reconstruction is performed to obtain the reconstructed vector and form the phase space matrix:

$$Z(i)=\left[\begin{array}{cccc}z(1)& z\left(1+\tau \right)& \cdots & z\left(1+\left(m-1\right)\tau \right)\\ {}\vdots & \vdots & \cdots & \vdots \\ {}z(j)& z\left(j+\tau \right)& \cdots & z\left(j+\left(m-1\right)\tau \right)\\ {}\vdots & \vdots & \cdots & \vdots \\ {}z(G)& z\left(G+\tau \right)& \cdots & z\left(G+\left(m-1\right)\tau \right)\end{array}\right]$$
(A.1)

where, m denotes the embedding dimension, τ denotes the delay time, and G denotes the number of vectors in phase space reconstruction.

Each reconstruction vector in the Z(i) is arranged in ascending order to obtain the following formula:

$$\left\{z\left(i+\left({j}_1-1\right)\tau \right)\le z\left(i+\left({j}_2-1\right)\tau \right)\le \cdots \le z\left(i+\left({j}_m-1\right)\tau \right)\right\}$$
(A.2)

where, j1, j2, ⋯jm is the index of the reconstructed component in the column of each element.

For any reconstruction vector z(j) in the phase space matrix, a symbol sequence S(l) = {j1, j2, ⋯, jm} can be obtained after ascending, where l = 1, 2, …, g, g ≤ m!. S(l) is one of the arrangements, and then calculate the probability of each coincidence sequence Pg. The calculation formula of PE can be written as (Bandt and Pompe 2002):

$$PE(m)=-\sum_{l=1}^g{P}_l\ln {P}_l$$
(A.3)

1.2 Wavelet entropy

The WE is a kind of entropy value calculated by wavelet decomposition of signal sequence, which reflects a measure of the ordered or disordered state of signal spectral energy distribution in each subspace (Rosso et al. 2001). The specific calculation steps are as follows:

The signal uses discrete wavelet transform to separate the low frequency band of the signal from the high frequency band, and decomposes the signal into detail coefficients (Dj, k) and approximate coefficients (Aj, k):

$$\left\{\begin{array}{c}{D}_{j,k}=\sum_n\overline{g}\left(n-2k\right){D}_{j-1,n}\\ {}{A}_{j,k}=\sum_n\overline{h}\left(n-2k\right){A}_{j-1,n}\end{array}\right.$$
(A.4)

where, j is the decomposition level, k is the corresponding wavelet coefficient subscript; g(n) and h(n) are high-pass and low-pass filters, respectively.

The subband energy of the signal is calculated by the square sum of the subband wavelet coefficients:

$${E}_j=\sum_k{\left|{D}_{j,k}\right|}^2$$
(A.5)

Calculate the total wavelet energy and relative wavelet energy (pj):

$$\left\{\begin{array}{c}{E}_{total}=\sum_j{E}_j\\ {}{p}_j=\frac{E_j}{E_{total}}\end{array}\right.$$
(A.6)

Calculate the WE:

$$WE=-\sum_j{p}_j\ln {p}_j$$
(A.7)

1.3 Sample entropy

The SE is the probability of generating a new subsequence in the detection time series. The higher the SE, the more complex the sample sequence (Richman and Randall 2000). The calculation process is as follows:

For the time series {x(n)} = {x(1), x(2), …x(N)} composed of N data, it is composed of m-dimensional vectors Xm(1), ⋯Xm(N − m + 1).

$$\textrm{where},{X}_m(i)=\left\{x(i),x\left(i+1\right),\cdots x\left(i+m-1\right)\right\},\left(1\le i\le N-m+1\right).$$

Defines the maximum difference between the elements corresponding to the vectors Xm(i) and Xm(j).

Determine the SE:

$$\left\{\begin{array}{c}{A}^k(r)=\frac{1}{N-m}\sum\limits_{i=1}^{N-m}\frac{1}{N-m-1}{A}_i\\ {}{B}^m(r)=\frac{1}{N-m}\sum\limits_{i=1}^{N-m}\frac{1}{N-m-1}{B}_i\\ {} SE=-\ln \left\{\frac{A^m(r)}{B^m(r)}\right\}\end{array}\right.$$
(A.8)

Where, Ai is the number of Xm + 1(i) and Xm + 1(j)(1 ≤ j ≤ N = m, j ≠ i) distance not greater than r ; Bi is the number of Xm(i) and Xm(j) distance not greater than r.

1.4 Fuzzy entropy

The FE is used to characterize the complexity of time series. The greater the entropy, the greater the complexity. FE is a very important nonlinear analysis method, which is less sensitive to parameter changes and less dependent on data length (Luca and Termini 1972). The calculation process is as follows:

Given the time series {x(i), i = 1, 2, ⋯, N}, the similarity tolerance r is introduced to determine a degree m(m ≤ N − 2) that divides the length of the subsequence, and the original sequence is reconstructed to obtain (N − m + 1) subsequences:

$$\left\{\begin{array}{c}X(1),X(2),\cdots X\left(N-m+1\right)\\ {}X(i)=\left[x(i),x\left(i+1\right),\cdots, x\left(i+m-1\right)\right]-{x}_0(i)\\ {}{x}_0(i)=\frac{1}{m}\sum\limits_{j=0}^{m-1}x\left(i+j\right)\end{array}\right.$$
(A.9)

The fuzzy membership function is introduced:

$${A}_{ij}^m=\exp \left[-{\left(\frac{d_{ij}^m}{r}\right)}^n\right]$$
(A.10)

where, \({d}_{ij}^m=d\left[X(i),X(j)\right]\) represents the maximum absolute distance between two reconstructed vectors, determined by the maximum difference of corresponding position elements. n represents the fuzzy power (it is usually equal to 2).

Take the average for each i:

$${C}_i^m(r)=\frac{1}{N-m}\sum_{j=1,j\ne i}^{N-m+1}{A}_{ij}^m$$
(A.11)

Define the fuzzy entropy intermediate quantity:

$${\varPhi}^m(r)=\frac{1}{N-m+1}\sum_{i=1}^{N-m+1}{C}_i^m(r)$$
(A.12)

Repeat the above steps to obtain Φm +1(r) when m + 1, so the FE of time series are as follows:

$$FE=\ln {\varPhi}^{m+1}(r)-\ln {\varPhi}^m(r)$$
(A.13)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Yang, X., Chen, Z. Assessing the effects of time series on precipitation forecasting performance from complexity perspective. Theor Appl Climatol 154, 973–986 (2023). https://doi.org/10.1007/s00704-023-04616-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00704-023-04616-9

Keywords

Navigation