A Novel Hybrid Model of Word Embedding and Deep Learning to Identify Hate and Abusive Content on Social Media Platform

  • Conference paper
Frontiers of Artificial Intelligence, Ethics, and Multidisciplinary Applications (FAIEMA 2023)

Abstract

Digital platforms such as Facebook, Twitter, Reddit, Instagram, and other social media services have become the dominant medium for exchanging information, connecting people, expressing emotions and feelings, and influencing audiences and communities. Some users, however, use these platforms to behave inappropriately, posting offensive, hateful, and harassing content to express their views and dissatisfaction. Negative, hateful, and abusive content can hurt or harm other social media users and communities and can lead to law and order problems in society. There is a need to stop and mitigate the effects of hate speech by developing intelligent systems and models to detect it. Artificial intelligence and machine learning can help detect social media posts, comments, and replies that are hateful, abusive, or offensive. Such content is also correlated with race, gender, sexuality, religion, and age. The objective of this work is to evaluate the performance of the proposed model in comparison with other machine learning and deep learning algorithms on newly collected data. The experimental results show that the proposed model, a CNN with word embeddings, outperforms the compared models in hate speech detection, achieving an accuracy of 83%. The word embeddings' contextual understanding of language and the model's ability to capture complex semantic relationships contribute significantly to its superior performance compared with traditional machine learning models and other deep learning models.
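The abstract names the proposed model only as a CNN with word embeddings and gives no architectural details. As a rough illustration of what such a model computes, the sketch below runs the forward pass of a generic text CNN: embedding lookup, one-dimensional convolution with ReLU, max-over-time pooling, and a sigmoid output. It is not the authors' architecture. Every name and hyperparameter (`init_params`, `predict`, the embedding size, the filter count and width) is a hypothetical choice, and the weights are random and untrained, so the score carries no meaning.

```python
import math
import random


def init_params(vocab_words, dim=8, n_filters=4, width=3, seed=0):
    """Build randomly initialised (untrained) parameters; all sizes are assumptions."""
    rng = random.Random(seed)

    def rand():
        return rng.uniform(-0.5, 0.5)

    # Index 0 is reserved for out-of-vocabulary tokens.
    vocab = {word: i + 1 for i, word in enumerate(vocab_words)}
    return {
        "vocab": vocab,
        "table": [[rand() for _ in range(dim)] for _ in range(len(vocab) + 1)],
        "filters": [[[rand() for _ in range(dim)] for _ in range(width)]
                    for _ in range(n_filters)],
        "bias": [0.0] * n_filters,
        "out_w": [rand() for _ in range(n_filters)],
        "out_b": 0.0,
    }


def embed(tokens, vocab, table):
    """Map each token to its embedding vector; unknown tokens use row 0."""
    return [table[vocab.get(tok, 0)] for tok in tokens]


def conv1d_relu(seq, filters, bias):
    """Slide each filter (width x dim) along the sequence and apply ReLU."""
    width = len(filters[0])
    maps = []
    for filt, b in zip(filters, bias):
        fmap = []
        for start in range(len(seq) - width + 1):
            total = b
            for k in range(width):
                total += sum(w * x for w, x in zip(filt[k], seq[start + k]))
            fmap.append(max(0.0, total))
        maps.append(fmap)
    return maps


def max_over_time(maps):
    """Keep the strongest activation of each filter across all positions."""
    return [max(fmap) for fmap in maps]


def predict(tokens, params):
    """Return a sigmoid score in (0, 1) for a tokenised post."""
    seq = embed(tokens, params["vocab"], params["table"])
    width = len(params["filters"][0])
    dim = len(params["table"][0])
    # Zero-pad inputs shorter than the filter width.
    while len(seq) < width:
        seq.append([0.0] * dim)
    pooled = max_over_time(conv1d_relu(seq, params["filters"], params["bias"]))
    logit = params["out_b"] + sum(w * p for w, p in zip(params["out_w"], pooled))
    return 1.0 / (1.0 + math.exp(-logit))


# Usage with a toy vocabulary (hypothetical data, untrained weights).
params = init_params(["you", "are", "great", "awful"])
score = predict("you are awful".split(), params)
print(f"score from untrained weights: {score:.3f}")
```

In practice the embedding table and filters would be learned from labelled posts, typically with a deep learning framework rather than hand-written loops; the loops here only make each stage of the computation explicit.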



Author information

Corresponding author

Correspondence to Sachin Kumar.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Kumar, S., Bhagat, A.K., Erugurala, A., Mirza, A., Jha, A.N., Verma, A.K. (2024). A Novel Hybrid Model of Word Embedding and Deep Learning to Identify Hate and Abusive Content on Social Media Platform. In: Farmanbar, M., Tzamtzi, M., Verma, A.K., Chakravorty, A. (eds) Frontiers of Artificial Intelligence, Ethics, and Multidisciplinary Applications. FAIEMA 2023. Frontiers of Artificial Intelligence, Ethics and Multidisciplinary Applications. Springer, Singapore. https://doi.org/10.1007/978-981-99-9836-4_4
