Research on Proofreading Method of Semantic Collocation Error in Chinese

  • Conference paper
Advances in Artificial Intelligence and Security (ICAIS 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1422)

Abstract

With the rapid development of network technology and the spread of electronic documents, automatic proofreading of Chinese text has attracted increasing attention. Automatic proofreading of semantic errors is a key and difficult problem in Chinese information processing. To address it, we propose a semantic error proofreading method that combines dependency parsing with statistical theory, and we construct a two-layer semantic knowledge base to assist error detection and error correction. The two-layer semantic knowledge base includes (1) a knowledge base of word collocations, containing structured sentence information extracted from a large-scale corpus; and (2) a knowledge base of sememe collocations, obtained by sememe mapping through HowNet. On this basis, the cubic association ratio and the degree of polymerization are introduced to evaluate the proofreading results, in order to reduce false positives and improve the accuracy of the correction suggestions. Experimental results show that the method is useful both for constructing semantic proofreading knowledge bases and for automatic proofreading of semantic errors.
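As a rough illustration of the statistical scoring mentioned in the abstract, the sketch below computes a cubic association ratio for word collocation pairs. It assumes the commonly cited form MI3 = log2(P(x,y)^3 / (P(x) P(y))), with probabilities estimated from pair counts; the abstract does not state the exact formula the authors use, and the sample pairs, the function names, and the choice to return negative infinity for unseen pairs are hypothetical.

```python
import math
from collections import Counter


def cubic_association_ratio(pair_count, left_count, right_count, total_pairs):
    """Assumed form: MI3 = log2(P(x,y)^3 / (P(x) * P(y))), estimated from counts."""
    if pair_count == 0:
        # Unseen collocation: treated here as the lowest possible score (an assumption).
        return float("-inf")
    p_xy = pair_count / total_pairs
    p_x = left_count / total_pairs
    p_y = right_count / total_pairs
    return math.log2(p_xy ** 3 / (p_x * p_y))


# Hypothetical (head, dependent) pairs, standing in for collocations
# that a dependency parser would extract from a large corpus.
pairs = [
    ("eat", "rice"),
    ("eat", "rice"),
    ("eat", "noodles"),
    ("drink", "water"),
    ("drink", "rice"),
]
pair_counts = Counter(pairs)
head_counts = Counter(head for head, _ in pairs)
dep_counts = Counter(dep for _, dep in pairs)
total = len(pairs)


def score(head, dep):
    """Score one candidate collocation against the toy knowledge base."""
    return cubic_association_ratio(
        pair_counts[(head, dep)], head_counts[head], dep_counts[dep], total
    )
```

In this toy data, `score("eat", "rice")` comes out higher than `score("drink", "rice")`, so a proofreader built along these lines could flag the lower-scoring pair as a candidate collocation error and rank replacement words by the same score.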

Supported by the National Natural Science Foundation of China (NSFC No. 61772081).

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, R., Zhang, Y., Huang, G., Chen, R. (2021). Research on Proofreading Method of Semantic Collocation Error in Chinese. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Advances in Artificial Intelligence and Security. ICAIS 2021. Communications in Computer and Information Science, vol 1422. Springer, Cham. https://doi.org/10.1007/978-3-030-78615-1_62

  • DOI: https://doi.org/10.1007/978-3-030-78615-1_62

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78614-4

  • Online ISBN: 978-3-030-78615-1

  • eBook Packages: Computer Science, Computer Science (R0)
