Abstract
Background
Hand, foot, and mouth disease (HFMD) is a common infectious disease that poses a serious threat to children all over the world. However, the current prediction models for HFMD still require improvement in accuracy. In this study, we proposed a hybrid model based on autoregressive integrated moving average (ARIMA), ensemble empirical mode decomposition (EEMD) and long short-term memory (LSTM) to predict the trend of HFMD.
Methods
The data used in this study were sourced from the National Clinical Research Center for Child Health and Disorders, Chongqing, China. The daily reported incidence of HFMD from 1 January 2015 to 27 July 2023 was collected to develop an ARIMA-EEMD-LSTM hybrid model. ARIMA, LSTM, ARIMA-LSTM and EEMD-LSTM models were developed for comparison with the proposed hybrid model. Root mean square error (RMSE), mean absolute error (MAE) and coefficient of determination (R2) were adopted to evaluate the performance of the prediction models.
Results
Overall, the ARIMA-EEMD-LSTM model achieved the most accurate prediction of HFMD, with RMSE, MAE and R2 of 4.37, 2.94 and 0.996, respectively. Performing EEMD on the residual sequence yielded 11 intrinsic mode functions. The EEMD-LSTM model was the second best, with RMSE, MAE and R2 of 6.20, 3.98 and 0.992, respectively.
Conclusion
The results showed the advantage of the ARIMA-EEMD-LSTM model over the ARIMA, LSTM, ARIMA-LSTM and EEMD-LSTM models. The proposed hybrid model may therefore be a more effective aid for the prevention and control of epidemics. Compared with the other three models, the two models that incorporated the EEMD method showed marked improvement in predictive capability, offering novel insights for the modeling of disease time series.
Background
Hand, foot, and mouth disease (HFMD) is a common infectious disease caused by a group of enteroviruses, particularly among children under the age of 5.
Results
The development of ARIMA-EEMD-LSTM
In this study, the original time series was divided into a training set, covering the period from 1 January 2015 to 7 January 2022 (80% of the data), and a testing set, covering the period from 8 January 2022 to 27 July 2023 (20% of the data). A rolling forecast approach was employed, in which 60 days of historical data were used to predict the next day. First, the 'auto.arima' function of the 'forecast' package in R was used to identify the optimal model parameters for the training data, which selected an ARIMA(5,1,2) model. The ARIMA model was fitted to the training set and used to make predictions on the testing set.
The EEMD method was then applied to decompose the residual series of the ARIMA model, and the results are shown in Fig. 3. The original residual series was decomposed into 11 IMF series and 1 trend series. The IMF series with lower indices represent high-frequency signals in the original sequence, while the IMF series with higher indices represent low-frequency signals. The decomposition results show that the original data contain significant high-frequency signals. When these signals remain mixed into the original time series, they are not easily learned by the LSTM model; separating them makes them easier for the LSTM to learn.
The decomposed series were used as inputs to train the LSTM models, and the performance of these models on the testing set is shown in Fig. 4. The predicted values of each component series closely match the true values in both magnitude and trend, without significant lag. The predicted values of the IMF series and the trend series were summed to obtain the predicted residual series, as shown in Fig. 5.
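The windowing and recombination steps described above can be illustrated with a minimal Python sketch. It assumes a plain chronological 80/20 split and a 60-day lookback. The function names (`chronological_split`, `make_windows`, `sum_components`) and the toy series are hypothetical and are not taken from the study's code, and the per-component LSTM forecasts are represented simply as lists of numbers.

```python
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float],
                        train_frac: float = 0.8) -> Tuple[List[float], List[float]]:
    """Split a series in time order (no shuffling), so the test set is strictly later."""
    cut = int(len(series) * train_frac)
    return list(series[:cut]), list(series[cut:])


def make_windows(series: Sequence[float],
                 lookback: int = 60) -> Tuple[List[List[float]], List[float]]:
    """Pair each run of `lookback` consecutive values with the single value that follows."""
    inputs, targets = [], []
    for end in range(lookback, len(series)):
        inputs.append(list(series[end - lookback:end]))
        targets.append(series[end])
    return inputs, targets


def sum_components(component_preds: Sequence[Sequence[float]]) -> List[float]:
    """Add per-component forecasts point by point to rebuild the residual forecast."""
    return [sum(values) for values in zip(*component_preds)]


# Toy data only; these are not the study's values.
toy_series = [float(i) for i in range(100)]
train, test = chronological_split(toy_series)
X, y = make_windows(train, lookback=60)

# Hypothetical forecasts for two components over two days.
residual_forecast = sum_components([[1.0, 2.0], [10.0, 20.0]])
```

In a pipeline of this kind, each decomposed series would be passed through `make_windows` separately, one model would be trained per component, and the component forecasts would be summed at the end. How the study handled the first 60 days of the testing set is not stated in the text, so the sketch leaves that choice open.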
Compared with the actual residual series in Fig. 5, the predicted residual series shows strong consistency in the frequency and amplitude of its fluctuations, indicating that the residual series was predicted well. Finally, the predicted values of the ARIMA model and of the residual series were added to obtain the final predicted values, which are compared with the true values in Fig. 6. The figure shows that the model accurately predicts the trend of the original time series and captures its significant fluctuations.
The development of other models
For comparison, we developed four more models: the ARIMA model, the LSTM model, the ARIMA-LSTM model and the EEMD-LSTM model. The results of these models are shown in Supplemental Figures 1–4.
Model evaluation and comparison
The evaluation results of the hybrid ARIMA-EEMD-LSTM model and of the ARIMA, LSTM, ARIMA-LSTM, and EEMD-LSTM models on the training set and the testing set are shown in Table 1. The proposed ARIMA-EEMD-LSTM model achieved an RMSE of 4.37, an MAE of 2.94, and an R2 of 0.996 on the testing set, demonstrating accurate prediction of the incidence of HFMD. In comparison, the ARIMA model had an RMSE of 6.95, an MAE of 3.68, and an R2 of 0.990, while the LSTM model had an RMSE of 13.93, an MAE of 8.07, and an R2 of 0.961. The hybrid model outperformed these single models in both accuracy and goodness of fit.
Two other hybrid models, ARIMA-LSTM and EEMD-LSTM, were also developed. On the testing set, the ARIMA-LSTM model had an RMSE of 9.85, an MAE of 8.11, and an R2 of 0.980, while the EEMD-LSTM model had an RMSE of 6.20, an MAE of 3.98, and an R2 of 0.992. Compared with the LSTM model, EEMD-LSTM improved RMSE from 13.93 to 6.20, MAE from 8.07 to 3.98, and R2 from 0.961 to 0.992. Compared with the ARIMA-LSTM model, ARIMA-EEMD-LSTM improved RMSE from 9.85 to 4.37, MAE from 8.11 to 2.94, and R2 from 0.980 to 0.996.
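The three evaluation metrics reported above can be computed as in the following minimal Python sketch. It assumes the standard definitions (R2 taken as one minus the ratio of the residual sum of squares to the total sum of squares); the function names and the toy values are hypothetical and do not reproduce the study's data or code.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error: square root of the mean squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values only; these are not the study's data.
observed = [10.0, 12.0, 15.0, 11.0]
predicted = [9.0, 13.0, 14.0, 12.0]

scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "R2": r2(observed, predicted),
}
```

Lower RMSE and MAE and a higher R2 indicate a closer match between predicted and observed values, which is the sense in which the models are ranked in this section.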
The comparisons on the testing set indicate that including the EEMD method markedly improves the predictive performance of the models. Overall, the hybrid ARIMA-EEMD-LSTM model shows better predictive accuracy and goodness of fit than the ARIMA, LSTM, ARIMA-LSTM, and EEMD-LSTM models.
Discussion
In this study, we proposed a novel hybrid prediction model that combines the strengths of a linear statistical model, a deep learning model and the EEMD technique to predict HFMD incidence accurately. According to the evaluation results, the proposed hybrid ARIMA-EEMD-LSTM model outperformed the other four prediction models developed in this study (ARIMA, LSTM, ARIMA-LSTM and EEMD-LSTM), which means that it provides more accurate predictions.
ARIMA, a classical time series prediction model, has been widely applied in disease prediction [18,19,20]. However, because it is a linear model, ARIMA can only capture linear characteristics. Many real-world time series contain a mixture of linear and non-linear features, which poses a challenge for ARIMA predictions. Deep learning algorithms can compensate for this limitation. Combining the ARIMA model with the LSTM model, a widely used deep learning model for time series, retains ARIMA's advantage in capturing linear trends and dependencies within the time series while adding the ability of LSTM to capture complex, nonlinear patterns and long-term dependencies.
EEMD is a novel technique for processing non-linear and non-stationary data and has been successfully applied in various fields [21,22,23]. However, few studies have used EEMD for epidemic prediction. With the EEMD method, complex data can be decomposed into relatively simple components that are more suitable for model training.
This decomposition compensates for the limitation of the LSTM model in dealing with nonstationary time series. In this study, we compared the hybrid ARIMA-EEMD-LSTM model with two single models (ARIMA and LSTM) and two hybrid models (ARIMA-LSTM and EEMD-LSTM). The evaluation results showed that the ARIMA-EEMD-LSTM model had the best predictive performance, with RMSE, MAE and R2 of 4.37, 2.94 and 0.996, respectively. The prediction performance of the proposed model suggests its potential utility in epidemic prevention and control. The two models that incorporated the EEMD method also showed marked improvement in predictive capability compared with the other three models. The inclusion of EEMD can have a great impact on model performance, offering novel insights for the modeling of disease time series.
This study has several limitations. First, the data used in this study were from the National Children's Regional Medical Center (Southwest Region), and more cross-center studies are needed to verify the validity and generalizability of the results. Second, the models developed in this study used only daily cases of HFMD, and related factors such as temperature and humidity should be considered to further enhance prediction performance.
Conclusion
In conclusion, this study proposed an innovative hybrid ARIMA-EEMD-LSTM model for predicting the incidence of HFMD. By integrating the strengths of the ARIMA model, the LSTM model, and the EEMD method, the hybrid model achieved enhanced prediction accuracy and fit, and can serve as a valuable tool for healthcare professionals and policymakers in understanding and managing the spread of HFMD and other epidemics.
Availability of data and materials
The datasets used in the study are available from the corresponding author on reasonable request.
Abbreviations
- HFMD: Hand, foot and mouth disease
- ARIMA: Autoregressive integrated moving average
- EMD: Empirical mode decomposition
- EEMD: Ensemble empirical mode decomposition
- LSTM: Long short-term memory
References
Xing W, Liao Q, Viboud C, Zhang J, Sun J, Wu JT, et al. Hand, foot, and mouth disease in China, 2008–12: an epidemiological study. Lancet Infect Dis. 2014;14(4):308–18.
Park K, Lee B, Baek K, Cheon D, Yeo S, Park J, et al. Enteroviruses isolated from herpangina and hand-foot-and-mouth disease in Korean children. Virol J. 2012;9:205.
Alsop J, Flewett TH, Foster JR. “Hand-foot-and-mouth disease” in Birmingham in 1959. Br Med J. 1960;2(5214):1708–11.
Huang CC, Liu CC, Chang YC, Chen CY, Wang ST, Yeh TF. Neurologic complications in children with enterovirus 71 infection. N Engl J Med. 1999;341(13):936–42.
Koh WM, Bogich T, Siegel K, Jin J, Chong EY, Tan CY, et al. The epidemiology of hand, foot and mouth disease in Asia: a systematic review and analysis. Pediatr Infect Dis J. 2016;35(10):e285-300.
Gonzalez G, Carr MJ, Kobayashi M, Hanaoka N, Fujimoto T. Enterovirus-associated hand-foot and mouth disease and neurological complications in Japan and the rest of the world. Int J Mol Sci. 2019;20(20):5201.
Jayaraj VJ, Hoe VCW. Forecasting HFMD cases using weather variables and google search queries in Sabah, Malaysia. Int J Environ Res Public Health. 2022;19(24):16880.
Kua JA, Pang J. The epidemiological risk factors of hand, foot, mouth disease among children in Singapore: a retrospective case-control study. PLoS ONE. 2020;15(8):e0236711.
Nhan LNT, Turner HC, Khanh TH, Hung NT, Lien LB, Hong NTT, et al. Economic burden attributed to children presenting to hospitals with hand, foot, and mouth disease in Vietnam. Open Forum Infect Dis. 2019;6(7):284.
Y H, H J, W S, C D, T C, L C, et al. Disease burden in patients with severe hand, foot, and mouth disease in Jiangsu Province: a cross-sectional study. Human vaccines & immunotherapeutics. 2022;18(5). Available from: https://pubmed.ncbi.nlm.nih.gov/35476031/. Cited 13 Aug 2023
Zhang R, Guo Z, Meng Y, Wang S, Li S, Niu R, et al. Comparison of ARIMA and LSTM in Forecasting the Incidence of HFMD Combined and Uncombined with Exogenous Meteorological Variables in Ningbo, China. Int J Environ Res Public Health. 2021;18(11):6174.
Liu L, Luan RS, Yin F, Zhu XP, Lü Q. Predicting the incidence of hand, foot and mouth disease in Sichuan province, China using the ARIMA model. Epidemiol Infect. 2016;144(1):144–51.
Borges D, Nascimento MCV. COVID-19 ICU demand forecasting: a two-stage Prophet-LSTM approach. Appl Soft Comput. 2022;125:109181.
Xu D, Zhang Q, Ding Y, Zhang D. Application of a hybrid ARIMA-LSTM model based on the SPEI for drought forecasting. Environ Sci Pollut Res Int. 2022;29(3):4128–44.
Yang E, Zhang H, Guo X, Zang Z, Liu Z, Liu Y. A multivariate multi-step LSTM forecasting model for tuberculosis incidence with model explanation in Liaoning Province, China. BMC Infect Dis. 2022;22(1):490.
Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Yen N-C, Tung CC, Liu HH. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc R Soc London Ser A Math Phys Eng Sci. 1998;454:903–95.
Wu Z, Huang NE. Ensemble empirical mode decomposition: a noise-assisted data analysis method. Adv Adapt Data Anal. 2009;01(01):1–41.
Wang M, Pan J, Li X, Li M, Liu Z, Zhao Q, et al. ARIMA and ARIMA-ERNN models for prediction of pertussis incidence in mainland China from 2004 to 2021. BMC Public Health. 2022;22(1):1447.
Alabdulrazzaq H, Alenezi MN, Rawajfih Y, Alghannam BA, Al-Hassan AA, Al-Anzi FS. On the accuracy of ARIMA based prediction of COVID-19 spread. Results Phys. 2021;27:104509.
Zhang R, Song H, Chen Q, Wang Y, Wang S, Li Y. Comparison of ARIMA and LSTM for prediction of hemorrhagic fever at different time scales in China. PLoS ONE. 2022;17(1):e0262009.
Shao L, Guo Q, Li C, Li J, Yan H. Short-term load forecasting based on EEMD-WOA-LSTM combination model. Appl Bionics Biomech. 2022;2022:2166082.
Xie Z, Li Z, Mo C, Wang J. A PCA-EEMD-CNN-Attention-GRU-Encoder-Decoder Accurate Prediction Model for Key Parameters of Seawater Quality in Zhanjiang Bay. Materials (Basel). 2022;15(15):5200.
Zhao J, Nie G, Wen Y. Monthly precipitation prediction in Luoyang city based on EEMD-LSTM-ARIMA model. Water Sci Technol. 2023;87(1):318–35.
Acknowledgements
We would like to thank Children’s Hospital of Chongqing Medical University for providing the data of confirmed hand, foot and mouth disease cases in Chongqing.
Funding
This work was supported by the National Key Research and Development Program of China (No. 2022YFC2704900), the National Natural Science Foundation of China (No. 72174033) and the Program for Youth Innovation in Future Medicine, Chongqing Medical University (No. W0013).
Author information
Contributions
XL and XMX designed the study. YRW collected and analyzed the data and wrote the manuscript. PS and JCL conducted the literature review and managed the project. All authors contributed to performing the research and to drafting and revising the article, gave final approval of the version to be published, and agree to be accountable for all aspects of the work. The authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
The ethics committee of Children’s Hospital of Chongqing Medical University approved this study protocol and waived the need for informed consent of the patients.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Wan, Y., Song, P., Liu, J. et al. A hybrid model for hand-foot-mouth disease prediction based on ARIMA-EEMD-LSTM. BMC Infect Dis 23, 879 (2023). https://doi.org/10.1186/s12879-023-08864-y