Abstract
Because sequence labeling-based methods take into account the dependencies between neighbouring labels, they have been widely used for keyphrase prediction. Existing methods mainly perform word-level sequence labeling over word-level features and fail to capture phrase-level information (i.e., the inner properties of multi-word keyphrases). In this paper, we concentrate on how to effectively capture phrase-level features and integrate them with word-level features to improve sequence labeling-based keyphrase extraction. Specifically, we propose a phrase-level attention enhanced conditional random field (PAE-CRF) model for keyphrase extraction, which consists of two major modules: a phrase-level attention module that captures phrase-level features, and a phrase-level attention enhanced CRF module that integrates the phrase-level attention information with the word-level features in a CRF to extract keyphrases. The two modules are jointly trained so that they learn complementary information from each other. Experiments on four benchmark datasets show that our model achieves better results than recent state-of-the-art methods. The code and keyphrase prediction results of our model are publicly available at https://github.com/pae-crf/PAE-CRF.
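As a rough illustration of what sequence labeling with dependencies between neighbouring labels means in practice, the following minimal sketch decodes a label sequence with the Viterbi algorithm for a generic linear-chain model and then reads keyphrases off the labels. It is not the PAE-CRF model: the BIO tagging scheme, the function names `viterbi` and `extract_phrases`, and all scores are assumptions made for this example, and no phrase-level attention is involved.

```python
from typing import Dict, List, Tuple

# Assumed BIO scheme (not stated in the abstract):
# B = first word of a keyphrase, I = inside a keyphrase, O = outside.
LABELS = ["B", "I", "O"]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring label sequence of a linear-chain model.

    emissions[t][y] is the word-level score of label y at position t;
    transitions[(p, y)] is the score of moving from label p to label y
    (missing pairs count as 0.0).
    """
    if not emissions:
        return []
    # best[t][y]: best score of any label path that ends in y at position t.
    best = [{y: emissions[0][y] for y in LABELS}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in LABELS:
            prev = max(LABELS,
                       key=lambda p: best[-1][p] + transitions.get((p, y), 0.0))
            scores[y] = (best[-1][prev] + transitions.get((prev, y), 0.0)
                         + emissions[t][y])
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def extract_phrases(tokens: List[str], labels: List[str]) -> List[str]:
    """Collect the token spans marked B I ... I as keyphrases."""
    phrases: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            current.append(token)
        else:  # "O", or an "I" with no open phrase
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Toy scores, invented for illustration only.
tokens = ["neural", "keyphrase", "extraction", "works"]
emissions = [
    {"B": 2.0, "I": 0.0, "O": 1.0},
    {"B": 0.0, "I": 1.5, "O": 1.0},
    {"B": 0.0, "I": 1.0, "O": 1.2},  # word-level score alone prefers O
    {"B": 0.0, "I": 0.0, "O": 2.0},
]
transitions = {("B", "I"): 0.5, ("I", "I"): 0.5, ("O", "I"): -5.0}

labels = viterbi(emissions, transitions)
print(labels, extract_phrases(tokens, labels))
```

With these toy scores, the word-level score of "extraction" alone favours O, but the transition bonus for continuing a phrase keeps it inside the keyphrase. This is the kind of label dependency that a CRF layer models on top of per-word scores.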
Acknowledgements
This work was partially supported by grants from the Scientific Research Project of Tianjin Educational Committee (Grant No. 2021ZD002).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, S., Jiang, T., Zhang, Y. (2024). A Phrase-Level Attention Enhanced CRF for Keyphrase Extraction. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14608. Springer, Cham. https://doi.org/10.1007/978-3-031-56027-9_28
DOI: https://doi.org/10.1007/978-3-031-56027-9_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56026-2
Online ISBN: 978-3-031-56027-9
eBook Packages: Computer Science (R0)