System to Correct Toxic Expression with BERT and to Determine the Effect of the Attention Value

  • Conference paper
  • Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2022)

Abstract

This paper describes an extended experiment on a system that converts sentences containing toxic expressions into safe sentences, together with an evaluation of the system and its effects. In recent years, toxicity on social media has caused many problems. The proposed system identifies toxic sentences with a prediction model based on Bidirectional Encoder Representations from Transformers (BERT) and converts them into safe sentences, guided by attention values and by a score that indicates whether a sentence is appropriate after each predicted conversion. Six method patterns were tested, and Pattern 6 proved the most effective at mitigating toxicity. In addition to converting words whose attention value exceeds a threshold, the words and phrases adjacent to them, and words registered in the toxicity dictionary, this pattern varies how the top beam-search candidates are selected at each treatment step. We used multiple indicators to judge the effectiveness of the method and evaluated its ability to make text safe while preserving its meaning.
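The pipeline sketched in the abstract (classify with BERT, mask high-attention tokens, propose replacements) can be illustrated with a short, hypothetical example. This is not the authors' implementation: the model name, the 0.1 attention threshold, and the greedy single-pass fill-in (the paper instead ranks beam-search candidates with an appropriateness score) are all assumptions for illustration only.

```python
# Hypothetical sketch, NOT the authors' code: mask tokens whose attention
# from [CLS] exceeds a threshold, then let a masked LM propose replacements.
import torch
from transformers import (AutoTokenizer,
                          AutoModelForSequenceClassification,
                          AutoModelForMaskedLM)

MODEL = "bert-base-uncased"  # stand-in; the paper fine-tunes a Japanese BERT
tok = AutoTokenizer.from_pretrained(MODEL)
clf = AutoModelForSequenceClassification.from_pretrained(
    MODEL, output_attentions=True)   # stand-in for the toxicity classifier
mlm = AutoModelForMaskedLM.from_pretrained(MODEL)

def detoxify_once(sentence: str, threshold: float = 0.1) -> str:
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        out = clf(**enc)
    # Mean attention over heads in the last layer; row 0 = attention from [CLS]
    att = out.attentions[-1][0].mean(dim=0)[0]
    ids = enc["input_ids"][0].clone()
    for i, a in enumerate(att):
        if a.item() > threshold and ids[i].item() not in tok.all_special_ids:
            ids[i] = tok.mask_token_id           # mask high-attention tokens
    with torch.no_grad():
        logits = mlm(input_ids=ids.unsqueeze(0)).logits[0]
    for i in (ids == tok.mask_token_id).nonzero(as_tuple=True)[0]:
        ids[i] = logits[i].argmax()              # greedy fill; the paper uses beam search
    return tok.decode(ids, skip_special_tokens=True)

print(detoxify_once("you are such an idiot"))
```

In the paper's Pattern 6, the greedy fill-in above would be replaced by beam search over candidate sentences, with the top candidates selected differently at each treatment step, and with adjacent words and dictionary-registered toxic words also converted.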


Notes

  1. Twitter API, https://developer.twitter.com/en/products/twitter-api
  2. TweetL, https://github.com/deepblue-ts/Tweetl
  3. WikipediaBERT, https://github.com/cl-tohoku/bert-japanese
  4. https://huggingface.co/sonoisa/sentence-bert-base-ja-mean-tokens-v2
  5. https://github.com/hatonobu/my_research


Acknowledgements

This work was supported by the 2022 SCAT Research Grant and JSPS KAKENHI Grant Numbers JP20K12027 and JP21K12141.

Author information


Corresponding author

Correspondence to Motonobu Yoshida.


Appendix

This appendix shows the distributions of the scores of each indicator after the third treatment (Figs. 6–23).

Fig. 6. BERTScore of pattern 1.
Fig. 7. Sentence-BERT of pattern 1.
Fig. 8. BleuScore of pattern 1.
Fig. 9. BERTScore of pattern 2.
Fig. 10. Sentence-BERT of pattern 2.
Fig. 11. BleuScore of pattern 2.
Fig. 12. BERTScore of pattern 3.
Fig. 13. Sentence-BERT of pattern 3.
Fig. 14. BleuScore of pattern 3.
Fig. 15. BERTScore of pattern 4.
Fig. 16. Sentence-BERT of pattern 4.
Fig. 17. BleuScore of pattern 4.
Fig. 18. BERTScore of pattern 5.
Fig. 19. Sentence-BERT of pattern 5.
Fig. 20. BleuScore of pattern 5.
Fig. 21. BERTScore of pattern 6.
Fig. 22. Sentence-BERT of pattern 6.
Fig. 23. BleuScore of pattern 6.

The Sentence-BERT distributions tend to score higher than the BERTScore distributions, and the BERTScore values are more widely dispersed. BERTScore is therefore considered the more effective indicator for judging whether the meaning of a sentence is preserved.
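For readers who want to reproduce this comparison, all three indicators are available in standard open-source packages. The snippet below is a sketch under assumed model names and an invented sentence pair, not the authors' evaluation script (the paper evaluates Japanese text with the Japanese Sentence-BERT model listed in the notes).

```python
# Hypothetical sketch: compute the three indicators for one sentence pair.
from bert_score import score as bert_score                    # pip install bert-score
from sentence_transformers import SentenceTransformer, util   # pip install sentence-transformers
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction  # pip install nltk

original  = "you are such an idiot"    # invented example pair
converted = "you are such a person"

# BERTScore: greedy matching of contextual token embeddings
_, _, f1 = bert_score([converted], [original], lang="en")
print("BERTScore F1 :", f1.item())

# Sentence-BERT: cosine similarity of whole-sentence embeddings
sbert = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for the Japanese model
emb = sbert.encode([original, converted], convert_to_tensor=True)
print("Sentence-BERT:", util.cos_sim(emb[0], emb[1]).item())

# BLEU: surface n-gram overlap, smoothed for short sentences
bleu = sentence_bleu([original.split()], converted.split(),
                     smoothing_function=SmoothingFunction().method1)
print("BleuScore    :", bleu)
```

Because BERTScore matches individual token embeddings rather than a single pooled sentence vector, it tends to react more strongly to local word substitutions, which may explain the wider dispersion noted above.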


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Yoshida, M., Matsumoto, K., Yoshida, M., Kita, K. (2023). System to Correct Toxic Expression with BERT and to Determine the Effect of the Attention Value. In: Coenen, F., et al. Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2022. Communications in Computer and Information Science, vol 1842. Springer, Cham. https://doi.org/10.1007/978-3-031-43471-6_11


  • DOI: https://doi.org/10.1007/978-3-031-43471-6_11


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43470-9

  • Online ISBN: 978-3-031-43471-6

  • eBook Packages: Computer Science (R0)
