Topic Modeling Analysis of Tweets on the Twitter Hashtags with LDA and Creating a New Dataset

  • Conference paper
Smart Applications with Advanced Machine Learning and Human-Centred Problem Design (ICAIAME 2021)

Abstract

With the growing use of social media platforms, many sociological problems that did not exist before have emerged. The resulting volumes of data are difficult to analyze with traditional methods, which makes it hard to understand and investigate these problems and to produce the necessary solutions. For this reason, natural language processing studies based on artificial intelligence algorithms have regained popularity in the literature. In this study, tweets (messages) sent on the social media platform Twitter about the hashtags on the agenda during a certain time interval were compiled. After n-gram evaluations such as bi-gram and tri-gram analysis were performed on the resulting data set, topic modeling was carried out using the Latent Dirichlet Allocation (LDA) method, because the data was unlabeled. Finally, the results were evaluated, and the situations that can be uncovered with natural language processing and the problems for which solutions can be proposed are presented.
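The pipeline outlined in the abstract (n-gram evaluation followed by LDA on unlabeled text) can be illustrated with a minimal, standard-library-only sketch. The abstract does not state which tools or inference method the authors used, so everything here is an assumption made for illustration: the toy `docs` corpus, the helper names `count_ngrams`, `lda_gibbs`, and `top_words`, the hyperparameters `alpha` and `beta`, and the choice of collapsed Gibbs sampling as the inference procedure.

```python
import random
from collections import Counter

# Hypothetical, already-tokenized toy "tweets" (not the paper's data set).
docs = [
    ["topic", "model", "tweets", "hashtag"],
    ["hashtag", "tweets", "topic", "model"],
    ["football", "match", "goal", "team"],
    ["team", "goal", "football", "match"],
]

def ngrams(tokens, n):
    """Return the contiguous n-token tuples found in one document."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def count_ngrams(docs, n):
    """Count n-grams over all documents without crossing document boundaries."""
    counts = Counter()
    for tokens in docs:
        counts.update(ngrams(tokens, n))
    return counts

def lda_gibbs(docs, num_topics, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; returns raw count tables and the vocabulary."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]            # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                          # tokens per topic
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    return doc_topic, topic_word, vocab

def top_words(topic_word, vocab, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]

print(count_ngrams(docs, 2).most_common(2))
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)
print(top_words(topic_word, vocab))
```

On real tweets, a preprocessing step (lowercasing, removing URLs and stop words, and for Turkish text possibly stemming) would normally precede both stages, and a maintained library implementation of LDA would be preferable to this hand-written sampler.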



Author information

Corresponding author

Correspondence to Çilem Koçak.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Koçak, Ç., Yiğit, T., Anitha, J., Mustafayeva, A. (2023). Topic Modeling Analysis of Tweets on the Twitter Hashtags with LDA and Creating a New Dataset. In: Smart Applications with Advanced Machine Learning and Human-Centred Problem Design. ICAIAME 2021. Engineering Cyber-Physical Systems and Critical Infrastructures, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-031-09753-9_41
