Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis

  • S.I: INDIA INTL. CONGRESS ON COMPUTATIONAL INTELLIGENCE 2017
Neural Computing and Applications

Abstract

Cyberbullying and hate speech are common problems in online communication. To tackle this pressing problem, we propose a text classification model based on convolutional neural networks for the de facto verbal aggression dataset built in our previous work, and we observe significant improvement from using the proposed 2D TF-IDF features instead of pre-trained methods. Experiments demonstrate that the proposed system outperforms both our previous methods and other existing methods. A case study of word vectors examines the difficulty of using pre-trained word vectors for our short-text classification task and demonstrates the necessity of introducing 2D TF-IDF features. Furthermore, we conduct a visual analysis of the convolutional and pooling layers of the trained convolutional neural networks.

Acknowledgements

The work described in this paper was substantially supported by two grants from the Research Grants Council of the Hong Kong Special Administrative Region (CityU 21200816 and CityU 11203217).

Author information

Corresponding author

Correspondence to Ka-Chun Wong.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest.

About this article

Cite this article

Chen, J., Yan, S. & Wong, KC. Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis. Neural Comput & Applic 32, 10809–10818 (2020). https://doi.org/10.1007/s00521-018-3442-0

  • DOI: https://doi.org/10.1007/s00521-018-3442-0
