Abstract
In an increasingly complex and volatile environment, government officials, researchers, and investors alike need models that accurately forecast markets in order to make appropriate decisions. This research investigates the efficacy of Transformer-based deep neural networks in predicting financial market returns compared with traditional models, focusing on ten market indexes. The study employs a comprehensive methodology that involves iterative dropout tests and batch size optimization to enhance model performance. By leveraging the power of deep learning, the research aims to improve prediction accuracy and capture complex patterns in stock market data. Twelve neural network architectures are compared across the ten indexes, and the proposed Transformer variants produce significantly better results than the benchmark models in all cases. Ablative experiments reveal the superiority of Transformer models in capturing long-term dependencies and extracting meaningful features from time series data. The findings suggest that Transformer neural networks outperform LSTM networks and other traditional models in forecasting financial market trends. This research contributes to the growing body of literature on deep learning applications in finance and provides valuable insights for government officials, researchers, and investors seeking to make informed decisions in the stock market. The implications of this study extend beyond academia, offering practical guidance for enhancing prediction accuracy and optimizing investment strategies in a dynamic and volatile financial market landscape.
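For illustration only, the sketch below shows the general kind of Transformer encoder regressor the abstract describes: sliding windows of past index returns as input, a one-step-ahead return as output, with the dropout rate and training batch size exposed as the hyperparameters the study tunes iteratively. This is not the authors' code; the Keras framework, layer widths, window length, and the random toy data are assumptions made purely to keep the example self-contained and runnable.

```python
# Hypothetical minimal sketch of a Transformer-style regressor for index returns.
# Input: sliding windows of past returns; output: next-period return.
import numpy as np
from tensorflow.keras import layers, Model

def build_transformer_regressor(window=30, n_features=1,
                                d_model=64, n_heads=4, ff_dim=128,
                                dropout=0.1):
    inputs = layers.Input(shape=(window, n_features))
    x = layers.Dense(d_model)(inputs)                      # project returns to model width
    attn = layers.MultiHeadAttention(num_heads=n_heads,
                                     key_dim=d_model // n_heads,
                                     dropout=dropout)(x, x)
    x = layers.LayerNormalization()(x + attn)              # residual connection + norm
    ff = layers.Dense(ff_dim, activation="relu")(x)        # position-wise feed-forward block
    ff = layers.Dropout(dropout)(ff)
    ff = layers.Dense(d_model)(ff)
    x = layers.LayerNormalization()(x + ff)
    x = layers.GlobalAveragePooling1D()(x)                 # pool over the time dimension
    outputs = layers.Dense(1)(x)                           # one-step-ahead return
    model = Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model

# Toy usage: random data stands in for scaled index returns; in the study these
# would come from sliding windows over each of the ten market indexes, and the
# dropout rate and batch_size would be swept over a grid of candidate values.
X = np.random.randn(1024, 30, 1).astype("float32")
y = np.random.randn(1024, 1).astype("float32")
model = build_transformer_regressor(dropout=0.2)
model.fit(X, y, batch_size=32, epochs=2, validation_split=0.2, verbose=0)
```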
Data availability
The data will be made available on reasonable request.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Consent to publish
The authors agreed with the content and gave explicit consent to submit the manuscript.
Human and animal rights
The study does not involve human participants or animals.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yañez, C., Kristjanpoller, W. & Minutolo, M.C. Stock market index prediction using transformer neural network models and frequency decomposition. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09931-4