What Events Do Pre-trained Language Models Learn from Text? Probing Event-Based Commonsense Knowledge by Confidence Sorting

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2023)

Abstract

Recently, many works have attempted to probe the knowledge stored in pre-trained language models (PLMs). Most probing works use data from knowledge bases to build a “fill-in-the-blank” task and probe entity knowledge in auto-encoding PLMs (e.g., BERT). Although these works have been successful, their methods cannot be applied to more complicated knowledge, such as event-based commonsense knowledge, or to other types of PLMs, such as auto-regressive models (e.g., GPT). In this paper, we develop a new knowledge probe based on confidence sorting and use it to detect event-based commonsense knowledge. To make the probe suitable for different types of PLMs, we integrate different knowledge scoring methods, among them a new method called the probability difference log-likelihood (PDL) score. Finally, we conduct extensive experiments on several representative PLMs, explore their commonsense abilities, and analyze the factors that influence their performance.
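
To make the idea concrete, the sketch below shows what a confidence-sorting probe can look like in code. It is an illustration under simplifying assumptions, not the paper's method: candidates are ranked by a plain length-normalized log-likelihood from an auto-regressive PLM (GPT-2 loaded through Hugging Face transformers) rather than by the proposed PDL score, and the two event statements are hypothetical.

# Minimal sketch of a confidence-sorting probe (illustrative only; this uses a
# plain average log-likelihood from GPT-2, not the paper's PDL score, and the
# candidate event statements are made up for the example).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def confidence(statement: str) -> float:
    """Return the average per-token log-likelihood of the statement under the PLM."""
    enc = tokenizer(statement, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    # out.loss is the mean cross-entropy over predicted tokens; negate it so
    # that a higher value means the model finds the statement more plausible.
    return -out.loss.item()

# Hypothetical candidates: one commonsense-consistent event chain, one distractor.
candidates = [
    "After PersonX forgot the umbrella, PersonX got wet in the rain.",
    "After PersonX forgot the umbrella, PersonX stayed completely dry in the rain.",
]

# Confidence sorting: score every candidate once, then rank by confidence.
scores = {s: confidence(s) for s in candidates}
for statement in sorted(scores, key=scores.get, reverse=True):
    print(f"{scores[statement]:8.3f}  {statement}")

The probe counts a success when the commonsense-consistent statement receives the higher confidence; for auto-encoding PLMs such as BERT, a masked-LM pseudo-log-likelihood would have to replace the confidence() function above.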



Acknowledgement

This work is supported by the National Key Research and Development Program of China (No. 2020AAA0106400), the National Natural Science Foundation of China (No. 61976211, 62176257). This work is also supported by the Strategic Priority Research Program of Chinese Academy of Sciences (Grant No. XDA27020100), the Youth Innovation Promotion Association CAS, and Yunnan Provincial Major Science and Technology Special Plan Projects (No. 202202AD080004).

Author information


Corresponding author

Correspondence to Yubo Chen.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Li, J., Wang, C., Chen, Y., Liu, K., Zhao, J. (2023). What Events Do Pre-trained Language Models Learn from Text? Probing Event-Based Commonsense Knowledge by Confidence Sorting. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science(), vol 14302. Springer, Cham. https://doi.org/10.1007/978-3-031-44693-1_52


  • DOI: https://doi.org/10.1007/978-3-031-44693-1_52

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44692-4

  • Online ISBN: 978-3-031-44693-1

  • eBook Packages: Computer Science, Computer Science (R0)
