Abstract
Sarcasm detection has attracted considerable attention in natural language processing (NLP), largely because of the difficulty of identifying and modeling the sarcastic intent behind a sentence or text. The problem has been actively studied for posts on social media platforms such as Twitter. Several works address English and a number of other languages. Among Indian languages, some research has been conducted on Hindi and Tamil. However, although Marathi is the third most widely spoken language in India, sarcasm recognition in Marathi remains unexplored. Most existing sarcasm detection algorithms focus on textual information and ignore the rich semantic information that users express through emoticons (emojis). In real-world scenarios, emojis are often perceived as emotion signals that strongly indicate the intent behind the text, and they can therefore potentially improve sarcasm detection. This paper explores the problem of sarcasm detection for the Marathi language and demonstrates the significance and effectiveness of emojis as a strong feature for sarcasm detection with machine learning algorithms.
Supported by SOCS, KBC NMU Jalgaon, Maharashtra, India.
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Patil, P.K., Kolhe, S.R. (2024). Sarcasm Detection for Marathi and the role of emoticons. In: Jacob, I.J., Piramuthu, S., Falkowski-Gilski, P. (eds) Data Intelligence and Cognitive Informatics. ICDICI 2023. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-99-7962-2_15
DOI: https://doi.org/10.1007/978-981-99-7962-2_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7999-8
Online ISBN: 978-981-99-7962-2
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)