A technique to forecast Pakistan’s news using deep hybrid learning model

  • Original Research
International Journal of Information Technology

Abstract

Forecasting future events is a challenging task that can significantly affect decision-making and policy-making. In this research, we focus on forecasting news related to Pakistan. Despite the importance of accurate predictions in this field, no dataset currently exists for forecasting Pakistani news, specifically with regard to politics. Unlike numerical time series data, textual data carries information about an event's potential causes in addition to its impact, and this richer information is expected to yield better forecasts. To address this gap, we create the first Pakistani news dataset for forecasting, consisting mostly of news about Pakistani politics. The dataset was collected from various sources, including Pakistani news websites and social media platforms, as well as frequently asked questions about Pakistani politics. We develop a forecasting model using this dataset and evaluate the effectiveness of cutting-edge deep hybrid learning techniques incorporating neural networks, random forest, Word2vec, natural language processing (NLP), and Naive Bayes. To the best of our knowledge, no prior research has applied a deep hybrid learning model (a blend of deep learning and machine learning) to news forecasting. The forecasting model achieves an accuracy of 97%. According to our findings, the model's performance is adequate when compared with that of other forecasting models. Our research not only fills a gap in the current literature but also presents a new challenge for large language models and has the potential to bring significant practical advantages to the field of forecasting. The unique contribution of this study lies in the intelligent modeling of the prediction challenge, which allows content-rich text to be used for forecasting.
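
As a rough illustration of what a blend of machine learning and neural components can look like, the sketch below stacks two of the component types named in the abstract: a Naive Bayes text scorer whose output feeds a single sigmoid unit (a stand-in for a neural network). The abstract does not say how the components are combined, so the stacking arrangement, the names `NaiveBayesScorer`, `SigmoidUnit`, and `forecast`, and the four toy headlines are all assumptions made for illustration. This is not the authors' model or data.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class NaiveBayesScorer:
    """Multinomial Naive Bayes with add-one smoothing; binary labels 0/1."""

    def fit(self, docs, labels):
        self.counts = {0: Counter(), 1: Counter()}
        for doc, y in zip(docs, labels):
            self.counts[y].update(tokenize(doc))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        self.totals = {c: sum(self.counts[c].values()) for c in (0, 1)}
        n1 = sum(labels)
        self.prior = {1: math.log(n1 / len(labels)),
                      0: math.log((len(labels) - n1) / len(labels))}
        return self

    def log_odds(self, doc):
        """Posterior log-odds of class 1 vs class 0; unseen words are skipped."""
        score = {}
        for c in (0, 1):
            s = self.prior[c]
            for w in tokenize(doc):
                if w in self.vocab:
                    s += math.log((self.counts[c][w] + 1) /
                                  (self.totals[c] + len(self.vocab)))
            score[c] = s
        return score[1] - score[0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class SigmoidUnit:
    """One sigmoid neuron trained by stochastic gradient ascent on log-likelihood."""

    def __init__(self):
        self.w, self.b = 0.0, 0.0

    def fit(self, xs, ys, lr=0.1, epochs=200):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                err = y - sigmoid(self.w * x + self.b)
                self.w += lr * err * x
                self.b += lr * err
        return self

    def predict_proba(self, x):
        return sigmoid(self.w * x + self.b)


# Hypothetical toy headlines (1 = event forecast to occur, 0 = not).
train_docs = [
    "parliament passes budget bill",
    "assembly approves budget vote",
    "court rejects petition appeal",
    "opposition rejects election result",
]
train_labels = [1, 1, 0, 0]

# Stage 1: classical text model. Stage 2: neural unit on its output.
nb = NaiveBayesScorer().fit(train_docs, train_labels)
unit = SigmoidUnit().fit([nb.log_odds(d) for d in train_docs], train_labels)


def forecast(text):
    """Probability of class 1 from the two-stage (stacked) pipeline."""
    return unit.predict_proba(nb.log_odds(text))
```

Calling `forecast("assembly passes budget")` returns a probability above 0.5 on this toy data, and `forecast("court rejects appeal")` returns one below 0.5. A realistic system would train the second stage on held-out first-stage outputs and would combine several base models (for example random forest and Word2vec-based features) rather than one.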

Author information

Corresponding author

Correspondence to Rukhshanda Ihsan.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ihsan, R., Khurshid, S.K., Shoaib, M. et al. A technique to forecast Pakistan’s news using deep hybrid learning model. Int. j. inf. tecnol. 16, 2505–2516 (2024). https://doi.org/10.1007/s41870-024-01781-6

  • DOI: https://doi.org/10.1007/s41870-024-01781-6
