Abstract
Affected by many factors, the precipitation time series have complex characteristics, which increases the difficulty of precipitation prediction. To reveal the impact of complexity on prediction performance, we propose an impact assessment framework. In this framework, firstly, the complexity results of four commonly used entropy models are compared, and the suitable entropy for precipitation complexity is selected from the perspective of stability and reliability. Secondly, the typical machine learning models (multiple linear regression, support vector machine regression, and BP artificial neural network) are used to predict precipitation. Three regions (including 44 meteorological stations) were selected for the study. The results show that (1) the (sample entropy) SE is higher than (wavelet entropy) WE, (permutation entropy) PE and (Fuzzy entropy) FE from the reliability and stability of the results; (2) there are differences in precipitation predictability at different complexity levels. For the high level of complexity, the prediction performance R2 is even close to 0. In addition, the difficulty of precipitation predictability in Southeast China is much higher than that in the other two regions. The findings of this research can help forecasters in determining the difficulty of regional precipitation predictability.
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00704-023-04616-9/MediaObjects/704_2023_4616_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00704-023-04616-9/MediaObjects/704_2023_4616_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00704-023-04616-9/MediaObjects/704_2023_4616_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00704-023-04616-9/MediaObjects/704_2023_4616_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00704-023-04616-9/MediaObjects/704_2023_4616_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00704-023-04616-9/MediaObjects/704_2023_4616_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00704-023-04616-9/MediaObjects/704_2023_4616_Fig7_HTML.png)
Similar content being viewed by others
Data availability
The data used in this study can be requested by the corresponding authors.
References
Abdullahi J, Iravanian A, Nourani V et al (2020) Application of artificial intelligence based and multiple regression techniques for monthly precipitation modeling in coastal and inland stations. Desalin Water Treat 177:338–349
Aksoy H, Dahamsheh A (2018) Markov chain-incorporated and synthetic data-supported conditional artificial neural network models for forecasting monthly precipitation in arid regions. J. Hydrol 562:758–779
Aryal A, Shrestha S, Babel MS (2019) Quantifying the sources of uncertainty in an ensemble of hydrological climate-impact projections. Theor Appl Climatol 135:193–209
Bandt C, Pompe B (2002) Permutation Entropy: A Natural Complexity Measure for Time Series. Phys Rev Lett 88(17):174102
Campbell EP, Palmer MJ (2010) Modeling and forecasting climate variables using a physical-statistical approach. J Geophys Res-Atmos 115(D10). https://doi.org/10.1029/2009JD012030
Chen CA, Hsu HH, Liang HC et al (2022) Future change in extreme precipitation in East Asian spring and Mei-Yu seasons in two high-resolution AGCMs. Weather Clim Extremes 35:100408
Cho E, Choi MH (2014) Regional scale Spatio-temporal variability of soil moisture and its relationship with meteorological factors over the Korean peninsula. J Hydrol 516:317–329
Choubin B, Khalighi S, Malekian A et al (2016) Multiple linear regression, multi-layer perceptron network and adaptive neuro-fuzzy inference system for the prediction of precipitation based on large-scale climate signals. Hydrolog Sci J 61(6):1001–1009
Dhanya CT, Villarini G (2017) On the inherent predictability of precipitation across the united states. Theor Appl Climatol 133:1035–1050
Du L, Li X, Yang M et al (2022) Assessment of spatiotemporal variability of precipitation using entropy indexes: a case study of Bei**g, China. Stoch Environ Res Risk Assess 36:939–953
Ebtehaj I, Bonakdari H, Zeynoddin M et al (2019) Evaluation of preprocessing techniques for improving the accuracy of stochastic rainfall forecast models. Int J Environ Sci Technol 17:505–524
Faiz MA, Liu D, Fu Q et al (2018) Complexity and trends analysis of hydrometeorological time series for a river streamflow: A case study of Songhua River Basin, China. River Res Appl 34(2):101–111
Fan XF, Miao CY, Duan QY (2021) Future Climate Change Hotspots Under Different 21st Century Warming Scenarios. Earth's Future 9(6):e2021EF002027
Goodwell AE, Jiang P, Ruddell BL et al (2020) Debates—Does information theory provide a new paradigm for Earth science? Causality, interaction, and feedback. Water Resour. Res 56(2):e2019WR024940
Goutam K, Shih-Chieh K, Scott P et al (2020) Machine learning assisted hybrid models can improve streamflow simulation in diverse catchments across the conterminous US. Environ Res Lett 15:10
Jenks GF, Caspall FC (1971) Error on choroplethic maps: definition, measurement, reduction. Ann Am Assoc Geogr 61(2):217–244
Ji HP, Chen YN, Fang GH et al (2021) Adaptability of machine learning methods and hydrological models to discharge simulations in datasparse glaciated watersheds. J Arid Land 13(6):19
Kazamias AP, Sapountzis M, Lagouvardos K (2022) Evaluation of GPM-IMERG rainfall estimates at multiple temporal and spatial scales over Greece. Atmos Res 269:106014
Khatakho R, Talchabhadel R, Thapa BR (2021) Evaluation of different precipitation inputs on streamflow simulation in Himalayan river basin. J. Hydrol 599:126390
Kim T, Shin JY, Kim H et al (2020) Ensemble-Based Neural Network Modeling for Hydrologic Forecasts: Addressing Uncertainty in the Model Structure and Input Variable Selection. Water Resour. Res 56:6
Li JJ, He X, Tao L (2022) Assessing multiscale variability and teleconnections of monthly precipitation in Yangtze river basin based on multiscale information theory method. Theor Appl Climatol 147(1-2):717–735
Li YG, Wang DG, Wang GL et al (2021a) A hybrid deep learning algorithm and its application to streamflow prediction. J Hydrol 601:126636
Li YJ, Xu B, Wang D et al (2021b) Deterministic and probabilistic evaluation of raw and post-processing monthly precipitation forecasts: a case study of China. J Hydroinformatics 23:914–934
Liang Z, Li Y, Hu Y (2018) A data-driven SVR model for long-term runoff prediction and uncertainty analysis based on the Bayesian framework. Theor Appl Climatol 133:137–149
Liu D, Liu C, Fu Q et al (2017) ELM evaluation model of regional groundwater quality based on the crow search algorithm. Ecol Ind 81:302–314
Liu D, Yan T, Ji Y (2021) Novel method for measuring regional precipitation complexity characteristics based on multiscale permutation entropy combined with CMFO-PPTTE model. J. Hydrol 592:125801
Liu MX, Xu XL, Sun A (2015) Decreasing spatial variability in precipitation extremes in southwestern China and the local/large-scale influencing factors. J Geophys Res-Atmos 120:13
Luca AD, Termini S (1972) A defnition of a Nonprobabilistic entropy in the setting of fuzzy sets theory. Inf Control 20:301–312
Mehr AD, Nourani V, Khosrowshahi VK et al (2019) A hybrid support vector regression–firefly model for monthly rainfall forecasting. Int J Environ Sci Technol 16:335–346
Meng EH, Huang SZ, Huang Q et al (2021) A hybrid VMD-SVM model for practical streamflow prediction using an innovative input selection framework. Water Resour Manage 35:1321–1337
Pakdaman M, Babaeian I, Javanshiri Z et al (2022) European Multi Model Ensemble (EMME): A New Approach for Monthly Forecast of Precipitation. Water Resour Manage 36:611–623
Richman JS, Randall MJ (2000) Physiological time-series analysis using approximate entropy and sample entropy. Am J Physiol Heart Circ Physiol 278(6):H2039
Rosso OA, Blanco S, Yordanova J et al (2001) Wavelet entropy: a new tool for analysis of short duration brain electrical signals. J Neuro Methods 105(1):65–75
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27(3):379–423
Singh VP (1997) The use of entropy in hydrology and water resources. Hydrol Process 11:587–626
Sun P, Liu R, Yao R et al (2023) Responses of agricultural drought to meteorological drought under different climatic zones and vegetation types. J. Hydrol 619:129305
Sun QH, Miao CY, Duan QY et al (2017) A review of global precipitation data sets: data sources, estimation, and intercomparisons. Rev Geophys 56:1
Tang L, Lv HL, Yang FM et al (2015) Complexity testing techniques for time series data: a comprehensive literature review. Chaos Solit Fractals 81:117–135
Tao L, He XG, Li JJ et al (2021) A multiscale long short-term memory model with attention mechanism for improving monthly precipitation prediction. J. Hydrol 602(3):126815
Wu QS, Zuo QT, Han CH et al (2022) Integrated assessment of variation characteristics and driving forces in precipitation and temperature under climate change: A case study of Upper Yellow River basin, China. Atmos Res 272:106156
Xavier SFA, daSilvaJale J, Stosic T, dos Santos CAC et al (2019) An application of sample entropy to precipitation in Paraíba State. Brazil. Theor. Appl. Climatol 136:429–440
**ang Z, Yan J, Demir I (2020) A rainfall-runoff model with LSTM-based sequence-to-sequence learning. Water Resour. Res 56(1):e2019WR025326
Xu L, Chen NC, Chen ZQ et al (2021) Spatiotemporal forecasting in earth system science: methods, uncertainties, predictability and future directions. Earth Sci Rev 222:103828
Xu TF, Longyang QQ, Tyson CT et al (2022) Hybrid Physically Based and Deep Learning Modeling of a Snow Dominated, Mountainous. Karst Watershed Water Resour Res 58:3
Yang X (2022) Construction and application of integrated entropy model for measuring precipitation complexity. Earth Sci Inform 15:1597–1606
Yang X (2023) Evaluation of spatial variation of water resources carrying capacity using optimal method: a case study of Fujian, China. Environ Sci Pollut Res 30:1048–1059
Yang X, Chen Z, Qin M (2023) Joint probability analysis of streamflow and sediment load based on hybrid copula. Environ Sci Pollut Res 30:46489–46502
Yang X, Chen ZH (2023) An integrated index developed for measuring precipitation complexity: a case study of **sha River basin, China. Environ Sci Pollut Res 30(19):54885–54898
Yu L, Pan Y, Wu Y (2008) Two new indicators to compare different evaluation methods’ effect based on times Higher-QS world university rankings. J Nan**g Normal Univ (Nat Ed) 31(3):135–140 ((in Chinese with English abstract))
Zhang L, Li T, Liu D et al (2020) Spatial variability and possible cause analysis of regional precipitation complexity based on optimized sample entropy. Q J Roy Meteor Soc 146(732):3384–3398
Zhang SW, Wang H, Jiang H et al (2021) Studies of the seasonal prediction of heavy late spring rainfall over southeastern china. Clim Dyn 57:1919–1931
Zhu S, Xu Z, Luo X et al (2020) Assessing coincidence probability for extreme precipitation events in the **sha River basin. Theor Appl Climatol 139:825–835
Funding
The research is financially supported by the National Natural Science Foundation of China (Grant No. 51879291).
Author information
Authors and Affiliations
Contributions
XY: Conceptualization, Methodology, Writing; ZHC: Supervision, Writing–original draft.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Appendix
1.1 Permutation entropy
The main idea of PE is that the patterns within the time series do not have similar probability of occurrence (Bandt and Pompe 2002). The calculation process is as follows:
Given a time series z(i) , the phase space reconstruction is performed to obtain the reconstructed vector and form the phase space matrix:
where, m denotes the embedding dimension, τ denotes the delay time, and G denotes the number of vectors in phase space reconstruction.
Each reconstruction vector in the Z(i) is arranged in ascending order to obtain the following formula:
where, j1, j2, ⋯jm is the index of the reconstructed component in the column of each element.
For any reconstruction vector z(j) in the phase space matrix, a symbol sequence S(l) = {j1, j2, ⋯, jm} can be obtained after ascending, where l = 1, 2, …, g, g ≤ m!. S(l) is one of the arrangements, and then calculate the probability of each coincidence sequence Pg. The calculation formula of PE can be written as (Bandt and Pompe 2002):
1.2 Wavelet entropy
The WE is a kind of entropy value calculated by wavelet decomposition of signal sequence, which reflects a measure of the ordered or disordered state of signal spectral energy distribution in each subspace (Rosso et al. 2001). The specific calculation steps are as follows:
The signal uses discrete wavelet transform to separate the low frequency band of the signal from the high frequency band, and decomposes the signal into detail coefficients (Dj, k) and approximate coefficients (Aj, k):
where, j is the decomposition level, k is the corresponding wavelet coefficient subscript; g(n) and h(n) are high-pass and low-pass filters, respectively.
The subband energy of the signal is calculated by the square sum of the subband wavelet coefficients:
Calculate the total wavelet energy and relative wavelet energy (pj):
Calculate the WE:
1.3 Sample entropy
The SE is the probability of generating a new subsequence in the detection time series. The higher the SE, the more complex the sample sequence (Richman and Randall 2000). The calculation process is as follows:
For the time series {x(n)} = {x(1), x(2), …x(N)} composed of N data, it is composed of m-dimensional vectors Xm(1), ⋯Xm(N − m + 1).
Defines the maximum difference between the elements corresponding to the vectors Xm(i) and Xm(j).
Determine the SE:
Where, Ai is the number of Xm + 1(i) and Xm + 1(j)(1 ≤ j ≤ N = m, j ≠ i) distance not greater than r ; Bi is the number of Xm(i) and Xm(j) distance not greater than r.
1.4 Fuzzy entropy
The FE is used to characterize the complexity of time series. The greater the entropy, the greater the complexity. FE is a very important nonlinear analysis method, which is less sensitive to parameter changes and less dependent on data length (Luca and Termini 1972). The calculation process is as follows:
Given the time series {x(i), i = 1, 2, ⋯, N}, the similarity tolerance r is introduced to determine a degree m(m ≤ N − 2) that divides the length of the subsequence, and the original sequence is reconstructed to obtain (N − m + 1) subsequences:
The fuzzy membership function is introduced:
where, \({d}_{ij}^m=d\left[X(i),X(j)\right]\) represents the maximum absolute distance between two reconstructed vectors, determined by the maximum difference of corresponding position elements. n represents the fuzzy power (it is usually equal to 2).
Take the average for each i:
Define the fuzzy entropy intermediate quantity:
Repeat the above steps to obtain Φm +1(r) when m + 1, so the FE of time series are as follows:
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yang, X., Chen, Z. Assessing the effects of time series on precipitation forecasting performance from complexity perspective. Theor Appl Climatol 154, 973–986 (2023). https://doi.org/10.1007/s00704-023-04616-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00704-023-04616-9