Abstract
Attention-based encoder-decoder neural network models have recently shown promising results in machine translation and speech recognition. In this work, we propose an attention-based neural network model for joint named entity recognition and relation extraction. We explore different strategies for incorporating alignment information into the encoder-decoder framework, and propose introducing an attention mechanism into alignment-based recurrent neural network (RNN) models. This attention provides additional information for both relation extraction and named entity recognition. Our independent models achieve state-of-the-art named entity recognition performance on the benchmark CoNLL04 dataset. Our joint training model further obtains a 0.5% absolute F1 gain on named entity recognition and a 0.9% absolute F1 improvement on relation extraction over the best previous models.
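The paper itself does not include code, so the following is a minimal PyTorch sketch of the kind of attention-augmented, alignment-based encoder-decoder the abstract describes: a bidirectional LSTM encoder, an additive attention context fed alongside the aligned encoder states into an LSTM decoder, and two per-token output heads for entity tags and relation tags. All layer names, dimensions, and tag-set sizes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of an attention-based
# encoder-decoder for joint entity and relation tagging.
import torch
import torch.nn as nn

class JointAttnEncoderDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128,
                 n_entity_tags=9, n_relation_tags=6):  # assumed sizes
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional LSTM encoder over the input sentence.
        self.encoder = nn.LSTM(emb_dim, hidden, bidirectional=True,
                               batch_first=True)
        # Additive (Bahdanau-style) attention over the encoder states.
        self.attn_W = nn.Linear(2 * hidden, 2 * hidden)
        self.attn_v = nn.Linear(2 * hidden, 1, bias=False)
        # Decoder input is [aligned encoder state; attention context].
        self.decoder = nn.LSTM(4 * hidden, 2 * hidden, batch_first=True)
        # Two output heads: per-token entity tags and relation tags.
        self.entity_head = nn.Linear(2 * hidden, n_entity_tags)
        self.relation_head = nn.Linear(2 * hidden, n_relation_tags)

    def forward(self, tokens):                       # tokens: (B, T)
        enc, _ = self.encoder(self.embed(tokens))    # (B, T, 2H)
        # Additive attention: score every encoder state, softmax over time,
        # and pool into a sentence-level context vector.
        scores = self.attn_v(torch.tanh(self.attn_W(enc)))       # (B, T, 1)
        weights = torch.softmax(scores.transpose(1, 2), dim=-1)  # (B, 1, T)
        context = weights @ enc                                  # (B, 1, 2H)
        # Align the context with every encoder output position.
        context = context.expand(-1, enc.size(1), -1)
        dec, _ = self.decoder(torch.cat([enc, context], dim=-1))  # (B, T, 2H)
        return self.entity_head(dec), self.relation_head(dec)

model = JointAttnEncoderDecoder(vocab_size=10000)
ent_logits, rel_logits = model(torch.randint(0, 10000, (2, 12)))
print(ent_logits.shape, rel_logits.shape)  # (2, 12, 9) and (2, 12, 6)
```

In this sketch the two heads share the encoder, decoder, and attention, which is one simple way to realize the joint training the abstract reports; the paper's exact joint architecture may differ.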
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Mai, Y., Shen, Y., Qi, G., Shen, X. (2020). Joint Extraction of Entity and Semantic Relation Using Encoder-Decoder Model Based on Attention Mechanism. In: Sun, X., Wang, J., Bertino, E. (eds.) Artificial Intelligence and Security. ICAIS 2020. Lecture Notes in Computer Science, vol. 12239. Springer, Cham. https://doi.org/10.1007/978-3-030-57884-8_54
DOI: https://doi.org/10.1007/978-3-030-57884-8_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57883-1
Online ISBN: 978-3-030-57884-8
eBook Packages: Computer Science (R0)