Sarcasm Detection for Marathi and the role of emoticons

  • Conference paper
Data Intelligence and Cognitive Informatics (ICDICI 2023)

Abstract

Sarcasm detection has gained considerable attention in the natural language processing (NLP) community, largely because of the difficulty of identifying and modeling the sarcastic intent behind a sentence or text. The problem has been actively studied for posts on social media platforms such as Twitter, and several works are available for English and a number of other languages. Among Indian languages, some research has been conducted on Hindi and Tamil. However, although Marathi is the third most widely spoken language in India, sarcasm recognition in Marathi remains unexplored. Most existing sarcasm detection algorithms focus on textual information and ignore the rich semantic information that users express through emoticons (emojis). In real-world scenarios, emojis are often perceived as emotion signals that strongly indicate the intent behind the text, and they can therefore potentially be used to improve sarcasm detection. This paper explores the problem of sarcasm detection for the Marathi language and demonstrates the significance and effectiveness of emojis as a strong feature for sarcasm detection with machine learning algorithms.
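As a rough illustration of the idea that emojis can serve as features alongside the text, the sketch below separates emojis from the words of a post and counts both. It is not the authors' pipeline: the function names, the feature naming scheme, the example sentence, and the deliberately narrow emoji character ranges are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a narrow matcher covering common emoji blocks only,
# not the full Unicode emoji set (no variation selectors or ZWJ sequences).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def split_text_and_emojis(post):
    """Return (text_without_emojis, list_of_emojis) for one post."""
    emojis = EMOJI_RE.findall(post)
    text = EMOJI_RE.sub(" ", post)
    # Collapse the whitespace left behind by the removed emojis.
    text = " ".join(text.split())
    return text, emojis


def extract_features(post):
    """Build a bag of word and emoji counts for one post (hypothetical scheme)."""
    text, emojis = split_text_and_emojis(post)
    features = Counter()
    for token in text.split():
        features["word=" + token] += 1
    for emoji in emojis:
        features["emoji=" + emoji] += 1
    features["emoji_count"] = len(emojis)
    return features


# Made-up example post: a positive-sounding sentence paired with an unamused emoji.
example_post = "वा, काय छान हवामान आहे 😒"
example_features = extract_features(example_post)
```

A feature dictionary of this kind could then be passed to any standard classifier; the mismatch between positive wording and a negative emoji is the sort of signal such a model might pick up.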

Supported by SOCS, KBC NMU Jalgaon, Maharashtra, India.

Author information

Corresponding author

Correspondence to Pravin K. Patil.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Patil, P.K., Kolhe, S.R. (2024). Sarcasm Detection for Marathi and the role of emoticons. In: Jacob, I.J., Piramuthu, S., Falkowski-Gilski, P. (eds) Data Intelligence and Cognitive Informatics. ICDICI 2023. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-99-7962-2_15
