Transformer-Encoder-GRU (T-E-GRU) for Chinese Sentiment Analysis on Chinese Comment Text


Abstract

Chinese sentiment analysis (CSA) has long been one of the challenges in natural language processing because of its complexity and uncertainty. The Transformer has been used successfully for semantic understanding, but it captures sequence features in text only through positional encoding, which is inherently weaker than the sequence modeling of recurrent networks. To address this problem, we propose T-E-GRU, which combines the powerful global feature extraction of the Transformer encoder with the natural sequence feature extraction of the GRU for CSA. Experiments are conducted on three real Chinese comment datasets; the results show that T-E-GRU has advantages over recurrent models, recurrent models with attention, and BERT-based models.
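
As a concrete illustration of the idea in the abstract, the following is a minimal, hypothetical PyTorch sketch of a Transformer-encoder-plus-GRU classifier: the encoder stack supplies global self-attention features, the GRU supplies sequence features, and a linear layer produces sentiment logits. All layer sizes are illustrative assumptions, and positional encoding is omitted on the assumption that the GRU captures word order; this is a sketch of the general pattern, not the authors' implementation or reported configuration.

```python
# Minimal sketch of a Transformer-encoder + GRU sentiment classifier.
# Hyperparameters and the pooling choice are illustrative assumptions.
import torch
import torch.nn as nn


class TEGRU(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, nhead=4, ff_dim=256,
                 num_enc_layers=1, gru_hidden=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=emb_dim, nhead=nhead, dim_feedforward=ff_dim,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_enc_layers)
        self.gru = nn.GRU(emb_dim, gru_hidden, batch_first=True)
        self.classifier = nn.Linear(gru_hidden, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids)   # (batch, seq_len, emb_dim)
        x = self.encoder(x)             # global self-attention features
        _, h = self.gru(x)              # h: (1, batch, gru_hidden)
        return self.classifier(h[-1])   # sentiment logits


# Example: classify a batch of two padded token-id sequences of length 50.
model = TEGRU(vocab_size=10000)
logits = model(torch.randint(1, 10000, (2, 50)))
print(logits.shape)  # torch.Size([2, 2])
```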


Notes

  1. https://www.kaggle.com/utmhikari/doubanmovieshortcomments.

  2. https://huggingface.co/bert-base-chinese (see the loading sketch after these notes).

  3. https://huggingface.co/ckiplab/albert-base-chinese.

  4. https://huggingface.co.

  5. https://www.nltk.org/.

  6. https://huggingface.co/bert-base-cased.
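
As a hedged illustration of how the pretrained Chinese BERT checkpoint cited in Note 2 can be loaded, the snippet below uses the Hugging Face `transformers` library with a two-class sentiment head; the head size and the example sentence are assumptions for illustration, not the authors' training setup. A similar pattern would typically apply to the ALBERT checkpoint in Note 3, although its model card may recommend a specific tokenizer class.

```python
# Hypothetical loading of the bert-base-chinese checkpoint cited in Note 2,
# with a 2-class sentiment head (an assumption; the paper's setup may differ).
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=2)

# Tokenize one Chinese comment and get (1, 2) sentiment logits;
# the values are meaningless until the head is fine-tuned.
inputs = tokenizer("这部电影真的很好看", return_tensors="pt")
logits = model(**inputs).logits
```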


Acknowledgements

This work is supported by the Key Research and Development Program of Shaanxi Province (No. 2020KW-068), the National Natural Science Foundation of China (Nos. 62106199, 62002290, and 62001385), the China Postdoctoral Science Foundation (No. 2021MD703883), and the General Project of the Education Department of Shaanxi Provincial Government (No. 21JK0926).

Author information


Corresponding author

Correspondence to Wei Zhou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, B., Zhou, W. Transformer-Encoder-GRU (T-E-GRU) for Chinese Sentiment Analysis on Chinese Comment Text. Neural Process Lett 55, 1847–1867 (2023). https://doi.org/10.1007/s11063-022-10966-8

