Performance of Machine Learning Methods Using Tweets

  • Conference paper
Mathematical Methods for Engineering Applications (ICMASE 2022)

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 414)

Abstract

Public opinions shared on popular platforms such as Twitter, Facebook and Instagram serve as a source of information for experts. Transferring and analyzing such data is important but difficult because of data regulations and the structure of the data. Pre-processing approaches and word-based dictionaries are used to interpret the unprocessed data and make the opinions/tweets amenable to analysis. Machine learning algorithms learn from past experience and use a variety of statistical, probabilistic and optimization techniques to detect useful patterns in unstructured data sets. Our study aims to compare the performance of classification algorithms in predicting whether individuals are COVID-19(+) or COVID-19(−), using the emotions expressed in tweets and extracted by text mining procedures. Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Decision Trees (DT), Random Forest (RF), Artificial Neural Networks (ANN), Gradient Boost (GBM) and XGradient algorithms were used, and the accuracy of each model was obtained for the detection and identification of the disease caused by the COVID-19 virus, which has recently been on the agenda.
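The workflow summarized in the abstract (pre-process tweets, derive features from a word-based dictionary, train classifiers, compare their accuracy) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' pipeline: the tweets, labels and the small lexicon are hypothetical, and only two simple classifiers are compared here (a lexicon-score baseline and a multinomial Naive Bayes written from scratch), not the eight algorithms named in the study. It uses the Python standard library only.

```python
import math
import re
from collections import Counter

# Hypothetical emotion lexicon (assumption; not the dictionaries used in the study).
POSITIVE = {"good", "well", "healthy", "fine", "great"}
NEGATIVE = {"fever", "cough", "sick", "tired", "pain"}


def preprocess(tweet):
    """Lowercase, drop URLs and @mentions, keep alphabetic tokens only."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z]+", text)


def predict_lexicon(tokens):
    """Baseline: label 1 if negative words outnumber positive ones, else 0."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return 1 if score < 0 else 0


def train_nb(docs, labels):
    """Collect per-class word counts, class counts and the vocabulary."""
    counts = {c: Counter() for c in set(labels)}
    priors = Counter(labels)
    for tokens, y in zip(docs, labels):
        counts[y].update(tokens)
    vocab = {w for c in counts.values() for w in c}
    return counts, priors, vocab


def predict_nb(model, tokens):
    """Multinomial Naive Bayes with add-one smoothing; unseen words are skipped."""
    counts, priors, vocab = model
    n_docs = sum(priors.values())
    best, best_lp = None, -math.inf
    for c in sorted(counts):
        total = sum(counts[c].values())
        lp = math.log(priors[c] / n_docs)
        for w in tokens:
            if w in vocab:
                lp += math.log((counts[c][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 1 = COVID-19(+), 0 = COVID-19(-).
TRAIN = [
    ("I have a fever and a bad cough", 1),
    ("Feeling sick and tired, lost my taste", 1),
    ("Chest pain and fever again today", 1),
    ("Feeling good after my morning run", 0),
    ("Healthy and well, enjoying the sun", 0),
    ("All fine here, great day with friends", 0),
]
TEST = [
    ("Cough and fever since Monday @doc https://t.co/x", 1),
    ("Such a good day, feeling well", 0),
    ("Lost my taste today", 1),
]

train_docs = [preprocess(t) for t, _ in TRAIN]
train_labels = [y for _, y in TRAIN]
test_docs = [preprocess(t) for t, _ in TEST]
test_labels = [y for _, y in TEST]

model = train_nb(train_docs, train_labels)
results = {
    "lexicon": accuracy(test_labels, [predict_lexicon(d) for d in test_docs]),
    "naive_bayes": accuracy(test_labels, [predict_nb(model, d) for d in test_docs]),
}
for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

On data this small the numbers mean nothing by themselves; the point is the shape of the comparison, where every classifier is scored on the same held-out tweets with the same metric. A real study would use a much larger labeled corpus and typically cross-validation.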

Acknowledgement

This work was supported by Eskişehir Technical University Scientific Projects Commissions under grant no. 22ADP026.

Author information

Corresponding author

Correspondence to Betül Kan-Kilinç.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Tuğ, İ., Kan-Kilinç, B. (2023). Performance of Machine Learning Methods Using Tweets. In: Yilmaz, F., Queiruga-Dios, A., Martín Vaquero, J., Mierluş-Mazilu, I., Rasteiro, D., Gayoso Martínez, V. (eds) Mathematical Methods for Engineering Applications. ICMASE 2022. Springer Proceedings in Mathematics & Statistics, vol 414. Springer, Cham. https://doi.org/10.1007/978-3-031-21700-5_13
