Stock market index prediction using transformer neural network models and frequency decomposition

Original Article · Neural Computing and Applications

Abstract

In an increasingly complex and volatile environment, government officials, researchers, and investors alike need models that forecast markets accurately enough to support sound decision-making. This research investigates the efficacy of Transformer-based deep neural networks in predicting financial market returns, relative to traditional models, across ten market indexes. The methodology combines frequency decomposition of each index series with iterative dropout tests and batch size optimization to enhance model performance. By leveraging deep learning, the study aims to improve prediction accuracy and capture complex patterns in stock market data. Twelve neural network architectures are compared on the ten indexes, and the proposed Transformer variants produce significantly better results than the benchmark models in all cases. Ablative experiments further reveal the superiority of Transformer models in capturing long-term dependencies and extracting meaningful features from time series data, indicating that Transformer networks outperform LSTM networks and other traditional models in forecasting financial market trends. This research contributes to the growing literature on deep learning applications in finance and offers practical guidance for government officials, researchers, and investors seeking to improve prediction accuracy and optimize investment strategies in a dynamic and volatile financial landscape.
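The full text is paywalled, but the title and abstract outline the general pipeline: decompose each index series into frequency components, fit a Transformer-style network to each component, and recombine the component forecasts. The sketch below is a minimal illustration of that pattern, not the authors' architecture or tuned configuration; it assumes the PyEMD (`EMD-signal`) and TensorFlow/Keras packages, and the window length, model width, head count, dropout rate, batch size, and epoch budget are all illustrative placeholders.

```python
# Minimal sketch of a decomposition-plus-Transformer forecasting pipeline.
# NOT the paper's implementation: hyperparameters and layer choices are guesses.
import numpy as np
import tensorflow as tf
from PyEMD import CEEMDAN  # pip install EMD-signal


def sliding_windows(series, window=30):
    """Slice a 1-D series into (samples, window, 1) inputs and next-step targets."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    return X[..., None], series[window:]


def transformer_regressor(window=30, d_model=32, heads=4, dropout=0.1):
    """A single Transformer encoder block with a pooled regression head."""
    inp = tf.keras.Input(shape=(window, 1))
    x = tf.keras.layers.Dense(d_model)(inp)  # project scalars to model width
    att = tf.keras.layers.MultiHeadAttention(num_heads=heads, key_dim=d_model)(x, x)
    x = tf.keras.layers.LayerNormalization()(x + tf.keras.layers.Dropout(dropout)(att))
    ff = tf.keras.layers.Dense(4 * d_model, activation="relu")(x)
    ff = tf.keras.layers.Dense(d_model)(ff)
    x = tf.keras.layers.LayerNormalization()(x + tf.keras.layers.Dropout(dropout)(ff))
    out = tf.keras.layers.Dense(1)(tf.keras.layers.GlobalAveragePooling1D()(x))
    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model


# Stand-in data: a random walk in place of a real index series.
prices = 100.0 + np.cumsum(np.random.randn(1000))

# CEEMDAN yields intrinsic mode functions; the residual is what remains.
imfs = CEEMDAN()(prices)
components = np.vstack([imfs, prices - imfs.sum(axis=0)])

# Fit one small model per component and sum the next-step forecasts.
forecast = 0.0
for comp in components:
    X, y = sliding_windows(comp)
    model = transformer_regressor()
    model.fit(X, y, epochs=10, batch_size=32, verbose=0)
    forecast += float(model.predict(X[-1:], verbose=0)[0, 0])

print(f"Reconstructed next-step forecast: {forecast:.2f}")
```

Modeling each frequency band separately is the usual motivation for EMD-family hybrids: the low-order IMFs carry fast, noisy oscillations while the residual carries the trend, so each network faces a simpler sub-problem than the raw series.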



Data availability

The data will be made available on reasonable request.


Author information


Corresponding author

Correspondence to Werner Kristjanpoller.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Consent to publish

All authors approved the content and gave explicit consent to submit the manuscript.

Human and animal rights

The study does not involve human participants or animals.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Model results

See Table 6.

Table 6 Undecomposed model results

CAC 40 loss curves

See Figs. 12 and 13.

Fig. 12: IMF 1 loss curve during training

Fig. 13: Residual loss curve during training

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yañez, C., Kristjanpoller, W. & Minutolo, M.C. Stock market index prediction using transformer neural network models and frequency decomposition. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09931-4

