Abstract
Public opinions shared on popular platforms such as Twitter, Facebook and Instagram serve as sources of information for experts. Transporting and analysing such data is both important and difficult because of data regulations and the structure of the data. Pre-processing approaches and word-based dictionaries are used to make sense of the raw data and to make the opinions/tweets analysable. Machine learning algorithms learn from past experience and use a variety of statistical, probabilistic and optimization techniques to detect useful patterns in unstructured data sets. Our study compares the performance of classification algorithms in predicting whether individuals are COVID-19(+) or COVID-19(−), using the emotions expressed in tweets and extracted by text mining procedures. Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Decision Trees (DT), Random Forest (RF), Artificial Neural Networks (ANN), Gradient Boosting (GBM) and XGBoost algorithms were used, and the accuracy of each model was evaluated for the detection and identification of the disease caused by the COVID-19 virus, which has recently been high on the agenda.
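The pre-processing and word-based dictionary step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the lexicon `TOY_LEXICON`, the cleaning rules, and the helper names `preprocess`, `emotion_counts` and `accuracy` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for this sketch); a real study
# would load a published word-based sentiment or emotion dictionary.
TOY_LEXICON = {
    "good": "positive", "relieved": "positive", "healthy": "positive",
    "sick": "negative", "fever": "negative", "afraid": "negative",
}

def preprocess(tweet):
    """Lowercase a tweet, drop URLs, mentions and non-letters, and tokenise."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # remove digits, punctuation, '#'
    return text.split()

def emotion_counts(tokens):
    """Count how many tokens fall into each lexicon category."""
    return Counter(TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON)

def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

tokens = preprocess("Feeling SICK with fever :( @doc https://t.co/x #COVID19")
features = emotion_counts(tokens)
```

Counts such as `features` could then serve as inputs to any of the classifiers listed above, and `accuracy` is the kind of score on which the models would be compared.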
Acknowledgement
This work was supported by the Eskişehir Technical University Scientific Projects Commission under grant no. 22ADP026.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tuğ, İ., Kan-Kilinç, B. (2023). Performance of Machine Learning Methods Using Tweets. In: Yilmaz, F., Queiruga-Dios, A., Martín Vaquero, J., Mierluş-Mazilu, I., Rasteiro, D., Gayoso Martínez, V. (eds) Mathematical Methods for Engineering Applications. ICMASE 2022. Springer Proceedings in Mathematics & Statistics, vol 414. Springer, Cham. https://doi.org/10.1007/978-3-031-21700-5_13
DOI: https://doi.org/10.1007/978-3-031-21700-5_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21699-2
Online ISBN: 978-3-031-21700-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)