Abstract
Stacked generalization is a widely used technique for improving predictive accuracy by combining several lower-level models through a higher-level model. This paper introduces a stacked generalization scheme specifically designed for nonlinear time series models. Instead of selecting a single model via traditional model selection criteria, the proposed approach stacks several nonlinear time series models from different classes and introduces a new generalization algorithm that minimizes prediction error. To this end, we use a feed-forward artificial neural network (FANN) as the combining model that generalizes the stacked nonlinear time series models, with the network parameters estimated by a backpropagation algorithm. We validate the proposed method using simulated examples and a real-data application. The results demonstrate that the proposed stacked FANN model achieves lower prediction error and better forecast accuracy than the existing nonlinear time series models, yielding an improved fit to the original time series data.
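To make the general workflow concrete, the sketch below illustrates stacked generalization for one-step-ahead time series forecasting with a small feed-forward network as the combining (higher-level) model. It is only an illustration of the scheme described above, not the authors' algorithm: the simulated threshold-type series, the decision-tree and nearest-neighbour stand-ins for the base nonlinear time series models, the lag order, and the MLPRegressor settings are all assumptions made for the example.

```python
# Minimal sketch of stacked generalization for one-step-ahead forecasting with a
# feed-forward neural network (FANN) as the combining model. Base learners, lag
# order, and hyperparameters are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.neural_network import MLPRegressor      # FANN trained by backpropagation
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Simulated threshold-type nonlinear AR(1) series as a stand-in data set.
n = 500
y = np.zeros(n)
for t in range(1, n):
    y[t] = (0.6 * y[t - 1] if y[t - 1] > 0 else -0.4 * y[t - 1]) + rng.normal(scale=0.5)

# Embed the series: predict y[t] from its previous p lags (column k holds lag k+1).
p = 3
X = np.column_stack([y[p - k - 1:n - k - 1] for k in range(p)])
target = y[p:]

# Lower-level (base) nonlinear models; any nonlinear time series models could be used.
base_models = [DecisionTreeRegressor(max_depth=4, random_state=0),
               KNeighborsRegressor(n_neighbors=10)]

# Expanding-window split: the combining model is trained on genuine out-of-sample
# base forecasts rather than in-sample fitted values, the key ingredient of stacking.
split = len(target) // 2
meta_X = np.column_stack([m.fit(X[:split], target[:split]).predict(X[split:])
                          for m in base_models])
meta_y = target[split:]

# Higher-level generalizer: a small feed-forward network combining the base forecasts.
fann = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
fann.fit(meta_X, meta_y)

# One-step-ahead forecast: build the latest lag vector, stack the base predictions,
# then let the network combine them.
x_new = y[-1:-p - 1:-1].reshape(1, -1)
stacked_input = np.column_stack([m.predict(x_new) for m in base_models])
print("stacked one-step-ahead forecast:", fann.predict(stacked_input)[0])
```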
Acknowledgements
We would like to express our gratitude to Prof. Carla Rampichini, the Editor-in-Chief, the Associate Editor, and the three anonymous reviewers for their valuable and insightful comments, which significantly improved the quality of the article.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
De Alwis, T.P., Samadi, S.Y. Stacking-based neural network for nonlinear time series analysis. Stat Methods Appl (2024). https://doi.org/10.1007/s10260-024-00746-0
Keywords
- Stacked generalization
- Cross-validation
- Time series
- Feed-forward artificial neural network (FANN)
- Backpropagation algorithm