Performance of Machine Learning Methods Using Tweets

Tuğ, İlkay; Kan-Kilinç, Betül

doi:10.1007/978-3-031-21700-5_13

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 414)

Included in the following conference series:

International Conference on Mathematics and its Applications in Science and Engineering

Abstract

Public opinions shared on common platforms such as Twitter, Facebook and Instagram serve as sources of information for experts. Transferring and analyzing such data is important but difficult because of data regulations and the structure of the data. Pre-processing approaches and word-based dictionaries are used to make sense of the unprocessed data and to make the opinions/tweets analyzable. Machine learning algorithms learn from past experience and use a variety of statistical, probabilistic and optimization techniques to detect useful patterns in unstructured data sets. Our study aims to compare the performance of classification algorithms in predicting whether individuals are COVID-19(+) or COVID-19(−), using the emotions expressed in tweets and text mining procedures. Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Decision Trees (DT), Random Forest (RF), Artificial Neural Networks (ANN), Gradient Boosting (GBM) and XGradient algorithms were used, and the accuracy of each model was obtained for the detection and identification of the disease caused by the COVID-19 virus, which has recently been on the agenda.
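The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the tweets, labels, emotion word lists and function names below are all hypothetical, and only two simple classifiers are compared (a multinomial Naive Bayes model and a dictionary-based rule) instead of the eight algorithms named above. It only shows the general pattern of pre-processing text, fitting classifiers, and comparing them by accuracy on held-out tweets.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: (tweet, label), 1 = COVID-19(+), 0 = COVID-19(-).
TRAIN = [
    ("I feel awful, fever and cough all night", 1),
    ("Lost my sense of taste, so scared and tired", 1),
    ("Terrible headache and fever, I am worried", 1),
    ("Feeling great after my morning run", 0),
    ("Happy and healthy, enjoying the sunshine", 0),
    ("Negative result today, so relieved and happy", 0),
]
TEST = [
    ("Scared about this fever and cough", 1),
    ("So happy, feeling great today", 0),
    ("Fever and cough since yesterday", 1),
]

# Hypothetical word-based emotion dictionaries (not a published lexicon).
NEGATIVE = {"awful", "scared", "tired", "terrible", "worried"}
POSITIVE = {"great", "happy", "healthy", "relieved"}


def tokenize(text):
    """Minimal pre-processing: lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(samples):
    """Collect class and per-class word counts for multinomial Naive Bayes."""
    class_counts = Counter()
    word_counts = {0: Counter(), 1: Counter()}
    for text, label in samples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return class_counts, word_counts, vocab


def predict_naive_bayes(model, text):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(class_counts[label] / total)
        for tok in tokenize(text):
            if tok in vocab:  # tokens unseen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


def predict_lexicon(text):
    """Dictionary rule: predict 1 if negative words outnumber positive ones."""
    tokens = tokenize(text)
    neg = sum(tok in NEGATIVE for tok in tokens)
    pos = sum(tok in POSITIVE for tok in tokens)
    return 1 if neg > pos else 0


def accuracy(predict, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(text) == label for text, label in samples) / len(samples)


model = train_naive_bayes(TRAIN)
nb_acc = accuracy(lambda text: predict_naive_bayes(model, text), TEST)
lex_acc = accuracy(predict_lexicon, TEST)
print(f"Naive Bayes accuracy:  {nb_acc:.2f}")
print(f"Lexicon rule accuracy: {lex_acc:.2f}")
```

The accuracies from such a tiny invented sample carry no meaning in themselves; the point is that every model is scored on the same held-out tweets with the same metric, so that the figures are comparable across models.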

Acknowledgement

This work was supported by Eskişehir Technical University Scientific Projects Commissions under grant no. 22ADP026.

Author information

Authors and Affiliations

Department of Statistics, Faculty of Science, Eskisehir Technical University, 26470, Eskisehir, Turkey
İlkay Tuğ & Betül Kan-Kilinç

Corresponding author

Correspondence to Betül Kan-Kilinç.

Editor information

Editors and Affiliations

Faculty of Arts and Sciences, Ankara Hacı Bayram Veli University, Polatli, Ankara, Türkiye
Fatih Yilmaz
Department of Applied Mathematics, University of Salamanca, Salamanca, Spain
Araceli Queiruga-Dios
Department of Applied Mathematics, University of Salamanca, Salamanca, Spain
Jesús Martín Vaquero
Department of Mathematics and Computer Science, Technical University of Civil Engineering, Bucharest, Romania
Ion Mierluş-Mazilu
Department of Physics and Mathematics, Instituto Superior de Engenharia de Coimbra, Coimbra, Portugal
Deolinda Rasteiro
Centro Universitario de Tecnología y Arte Digital (U-tad), Las Rozas de Madrid, Spain
Víctor Gayoso Martínez

Cite this paper

Tuğ, İ., Kan-Kilinç, B. (2023). Performance of Machine Learning Methods Using Tweets. In: Yilmaz, F., Queiruga-Dios, A., Martín Vaquero, J., Mierluş-Mazilu, I., Rasteiro, D., Gayoso Martínez, V. (eds) Mathematical Methods for Engineering Applications. ICMASE 2022. Springer Proceedings in Mathematics & Statistics, vol 414. Springer, Cham. https://doi.org/10.1007/978-3-031-21700-5_13

DOI: https://doi.org/10.1007/978-3-031-21700-5_13
Published: 09 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21699-2
Online ISBN: 978-3-031-21700-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)