Abstract
Recently, many works have attempted to probe the knowledge encoded in pre-trained language models (PLMs). Most of them use data from knowledge bases to create "fill-in-the-blank" tasks and probe entity knowledge in auto-encoding PLMs (e.g., BERT). Although these works have been successful, their methods cannot be applied to more complex knowledge, such as event-based commonsense knowledge, or to other types of PLMs, such as auto-regressive models (e.g., GPT). In this paper, we develop a new knowledge probe based on confidence sorting and use it to detect event-based commonsense knowledge. To make the probe suitable for different types of PLMs, we integrate several knowledge scoring methods, among them a new method called the probability difference log-likelihood score (PDL). Finally, we conduct extensive experiments on several representative PLMs, explore their commonsense abilities, and analyze the factors that influence their performance.
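To illustrate the confidence-sorting idea, here is a minimal sketch. It does not reproduce the paper's PDL score, which is defined only in the full text; instead it uses plain length-normalized log-likelihood from an auto-regressive PLM (GPT-2) as the confidence measure, and the candidate statements below are illustrative, not taken from the paper's benchmark.

```python
# Minimal sketch of confidence sorting with an auto-regressive PLM.
# Assumption: length-normalized log-likelihood stands in for the paper's
# PDL score; candidate statements are made up for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def confidence(sentence: str) -> float:
    """Average token log-likelihood under the model (higher = more plausible)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels == input_ids, the returned loss is the mean
        # negative log-likelihood per token.
        loss = model(ids, labels=ids).loss
    return -loss.item()

# A probe instance is "passed" when the model ranks the true event-based
# statement above its corrupted counterpart.
candidates = [
    "PersonX ate dinner, so PersonX felt full.",
    "PersonX ate dinner, so PersonX felt hungry.",
]
for score, sentence in sorted(((confidence(s), s) for s in candidates), reverse=True):
    print(f"{score:7.3f}  {sentence}")
```

For an auto-encoding PLM such as BERT, the same sorting scheme would apply, but `confidence` would instead compute a pseudo-log-likelihood by masking each token in turn, in the style of masked language model scoring (Salazar et al., 2020).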
Acknowledgement
This work was supported by the National Key Research and Development Program of China (No. 2020AAA0106400), the National Natural Science Foundation of China (Nos. 61976211 and 62176257), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA27020100), the Youth Innovation Promotion Association CAS, and the Yunnan Provincial Major Science and Technology Special Plan Projects (No. 202202AD080004).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, J., Wang, C., Chen, Y., Liu, K., Zhao, J. (2023). What Events Do Pre-trained Language Models Learn from Text? Probing Event-Based Commonsense Knowledge by Confidence Sorting. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science, vol. 14302. Springer, Cham. https://doi.org/10.1007/978-3-031-44693-1_52
DOI: https://doi.org/10.1007/978-3-031-44693-1_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44692-4
Online ISBN: 978-3-031-44693-1
eBook Packages: Computer Science, Computer Science (R0)