Abstract
This paper describes an extended experiment on a system that converts sentences containing toxic expressions into safe sentences, together with an evaluation of the system and its effects. In recent years, toxicity on social media has caused many problems. The proposed system identifies toxic sentences with a prediction model based on Bidirectional Encoder Representations from Transformers (BERT) and then converts them into safe sentences, using attention values and a score that indicates whether a sentence is appropriate after the predicted conversion. Six method patterns were tested, and Pattern 6 was the most effective at mitigating toxicity. In addition to converting words whose attention value exceeds a threshold, their adjacent words and phrases, and words registered in the toxic dictionary, this pattern varies how the top sentences of the beam search are selected at each number of treatments. We used multiple indicators to judge the effectiveness of this method and evaluated its ability to make text safe while preserving its meaning.
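The core selection step the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the token list, attention values, threshold, and toxic dictionary are hypothetical inputs, and in the real system the masked positions would be re-predicted by BERT rather than left as mask tokens.

```python
MASK = "[MASK]"

def mask_toxic_tokens(tokens, attentions, toxic_dict, threshold=0.5):
    """Return a copy of `tokens` with conversion candidates masked.

    A token is a candidate if its attention value exceeds `threshold`
    (in which case its adjacent tokens are also masked) or if it is
    registered in the toxic dictionary.
    """
    to_mask = set()
    for i, (tok, att) in enumerate(zip(tokens, attentions)):
        if att > threshold:
            # mask the high-attention token and its adjacent words
            to_mask.update({max(i - 1, 0), i, min(i + 1, len(tokens) - 1)})
        if tok in toxic_dict:
            to_mask.add(i)
    return [MASK if i in to_mask else tok for i, tok in enumerate(tokens)]

# Toy example: "idiot" has a high attention value and is in the dictionary,
# so it and its neighbours are masked.
tokens = ["you", "are", "an", "idiot", "today"]
attentions = [0.05, 0.10, 0.20, 0.80, 0.15]
print(mask_toxic_tokens(tokens, attentions, {"idiot"}, threshold=0.5))
# ['you', 'are', '[MASK]', '[MASK]', '[MASK]']
```

In the full Pattern 6 pipeline, each masked position would then be filled by BERT's masked-language-model predictions, with candidate sentences ranked by beam search.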
Acknowledgements
This work was supported by a 2022 SCAT Research Grant and by JSPS KAKENHI Grant Numbers JP20K12027 and JP21K12141.
Appendix
This appendix shows the distribution of scores for each indicator at the third treatment (Figs. 6–23).
The Sentence-BERT scores tend to be higher than the BERTScore values, and the BERTScore values are more widely dispersed. BERTScore is therefore considered more effective for assessing whether the meaning of a sentence is preserved.
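The difference between the two indicators comes from how they compare sentences: Sentence-BERT produces one embedding per sentence and takes a single cosine similarity, while BERTScore matches token-level embeddings greedily and combines them into precision, recall, and F1. The sketch below illustrates this with hand-made two-dimensional "embeddings"; the real indicators use contextual embeddings from pretrained models, which this toy code does not attempt to reproduce.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def sbert_similarity(sent_emb_a, sent_emb_b):
    # Sentence-BERT style: one embedding per sentence, one cosine similarity.
    return cosine(sent_emb_a, sent_emb_b)

def bertscore_f1(tok_embs_ref, tok_embs_cand):
    # BERTScore style: greedy token matching, then precision/recall/F1.
    recall = sum(max(cosine(r, c) for c in tok_embs_cand)
                 for r in tok_embs_ref) / len(tok_embs_ref)
    precision = sum(max(cosine(c, r) for r in tok_embs_ref)
                    for c in tok_embs_cand) / len(tok_embs_cand)
    return 2 * precision * recall / (precision + recall)

# Identical toy embeddings give a perfect score under both indicators.
print(sbert_similarity([1.0, 0.0], [1.0, 0.0]))                      # 1.0
print(bertscore_f1([[1.0, 0.0], [0.0, 1.0]],
                   [[1.0, 0.0], [0.0, 1.0]]))                        # 1.0
```

Because BERTScore aggregates many token-level similarities, small local changes shift its value more than a single sentence-level cosine, which is consistent with its wider score dispersion reported above.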
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yoshida, M., Matsumoto, K., Yoshida, M., Kita, K. (2023). System to Correct Toxic Expression with BERT and to Determine the Effect of the Attention Value. In: Coenen, F., et al. Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2022. Communications in Computer and Information Science, vol 1842. Springer, Cham. https://doi.org/10.1007/978-3-031-43471-6_11
DOI: https://doi.org/10.1007/978-3-031-43471-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43470-9
Online ISBN: 978-3-031-43471-6
eBook Packages: Computer Science, Computer Science (R0)