Abstract
Extracting causality from scientific literature is a crucial task that underpins many downstream knowledge-driven applications. To this end, this paper presents a novel causality extraction framework for scientific literature, called 2-Stage Causality Extraction for Scientific Literature (2SCE-4SL). The framework consists of two stages. In stage 1, terms and causal trigger words are identified in the causal sentences of the literature, and noisy causal triplets are then collocated from them. In stage 2, we propose a Transformer-based denoising autoencoder to represent the causal sentences. It learns the causal dependency and contextual information of sentences by incorporating causal trigger word tagging and noise elimination and by injecting domain-specific knowledge. By combining the causality structure from stage 1 with the causality representation from stage 2, the true causal triplets are identified among the noisy ones. We conducted experiments on an open-access scientific literature dataset, comparing performance across disciplines, training data volumes, and document lengths, and with and without the causality representation. The average precision of 2SCE-4SL was 0.8146 and the average F1 was 0.8308, with the best performance achieved on full-text data. We also verified the effectiveness of the causality representation in stage 2, demonstrating that the architecture can capture the causal dependency of sentences and achieve good performance on two related tasks. Overall, detailed comparative and ablation experiments revealed that 2SCE-4SL requires only a small amount of annotated data to achieve better performance and domain adaptability in scientific literature.
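The collocation step of stage 1 can be pictured with a minimal sketch. It is not the authors' implementation: the function name `collocate_triplets`, the span format, and the pairing rule (every term before a trigger paired with every term after it) are assumptions made for illustration, and the example sentence is hypothetical. The point is only that exhaustive pairing yields many spurious candidates, which is why a second stage is needed to pick out the true triplets.

```python
from itertools import product
from typing import List, Tuple

# Assumed span format: (token index where the span starts, surface text).
Span = Tuple[int, str]
# Assumed triplet format: (left term, trigger word, right term).
Triplet = Tuple[str, str, str]


def collocate_triplets(terms: List[Span], triggers: List[Span]) -> List[Triplet]:
    """Pair every term before a trigger with every term after it.

    This pairing rule is an illustrative assumption; it deliberately
    over-generates, so the result is a noisy candidate set.
    """
    candidates: List[Triplet] = []
    for trig_pos, trig_text in triggers:
        left = [text for pos, text in terms if pos < trig_pos]
        right = [text for pos, text in terms if pos > trig_pos]
        for a, b in product(left, right):
            candidates.append((a, trig_text, b))
    return candidates


# Hypothetical sentence: "Smoking and obesity cause hypertension in adults."
terms = [(0, "smoking"), (2, "obesity"), (4, "hypertension"), (6, "adults")]
triggers = [(3, "cause")]

noisy = collocate_triplets(terms, triggers)
for triplet in noisy:
    print(triplet)
```

In this toy case the candidate set contains both plausible triplets (for example, smoking / cause / hypertension) and spurious ones (smoking / cause / adults); filtering the latter is the role the abstract assigns to the stage 2 causality representation.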
Acknowledgements
This research is funded by the National Social Science Foundation of China (No. 21BTQ071) and the Postgraduate Research and Practice Innovation Program of Jiangsu Province (No. KYCX210560). The initial version of this article was accepted by EEKE2022, and this is an expanded and refined version.
Ethics declarations
Competing interests
All authors disclosed no relevant relationships.
About this article
Cite this article
Zhang, Y., Bai, R., Kong, L. et al. 2SCE-4SL: a 2-stage causality extraction framework for scientific literature. Scientometrics (2023). https://doi.org/10.1007/s11192-023-04817-z
DOI: https://doi.org/10.1007/s11192-023-04817-z