Two-Phase Semantic Retrieval for Explainable Multi-Hop Question Answering

Wang, Qin; Feng, Jianzhou; Xu, Ganlin; Huang, Lei

doi:10.1007/978-981-99-8082-6_35

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14448)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Explainable Multi-Hop Question Answering (MHQA) requires the ability to reason explicitly across facts to arrive at an answer. Most multi-hop reasoning methods either rely on semantic similarity to select the next hop or perform entity-centric inference. However, approaches that ignore the rationale a question requires can easily end up reasoning blindly. In this paper, we propose a two-Phase text Retrieval method with an entity Mask mechanism (PRM), which focuses on the rationale from global semantics while also taking entities into account. Specifically, it consists of two components: 1) The rationale-aware retriever is pre-trained in a dual-encoder framework with an entity mask mechanism. The learned representations of hypotheses and facts are used to obtain the top K candidate core facts by sentence-level dense retrieval. 2) The entity-aware validator determines the reachability between hypotheses and core facts using an entity-granularity sparse matrix. Our experiments on three public datasets in the scientific domain (i.e., OpenBookQA, Worldtree, and ARC-Challenge) demonstrate that the proposed model outperforms existing methods.
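
The two phases described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the encoders, the masking procedure, or how reachability is computed. The sketch assumes hand-written 3-dimensional vectors in place of learned dual-encoder outputs, cosine similarity as the retrieval score, and "shares at least one entity with the hypothesis" as a stand-in for reachability. The function names `cosine`, `top_k_facts`, and `reachable` are hypothetical, and the fact-by-entity sparse matrix is approximated by a dictionary that stores only its non-zero entries.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_facts(hyp_vec, fact_vecs, k):
    """Phase 1 (sketch): rank facts by similarity to the hypothesis, keep the top k."""
    ranked = sorted(
        range(len(fact_vecs)),
        key=lambda i: cosine(hyp_vec, fact_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

def reachable(hyp_entities, candidates, fact_entities):
    """Phase 2 (sketch): keep candidates sharing at least one entity with the hypothesis.

    fact_entities maps a fact index to its set of entities, i.e. only the
    non-zero entries of a fact-by-entity incidence matrix are stored.
    Entity overlap is an assumed proxy for reachability.
    """
    return [i for i in candidates if hyp_entities & fact_entities.get(i, set())]

# Toy data (assumed): vectors stand in for learned sentence representations.
facts = [
    "metal is an electrical conductor",
    "a suit of armor is made of metal",
    "plants need sunlight to grow",
]
fact_vecs = [[0.9, 0.1, 0.0], [0.7, 0.3, 0.1], [0.0, 0.2, 0.9]]
fact_entities = {
    0: {"metal", "conductor"},
    1: {"armor", "metal"},
    2: {"plants", "sunlight"},
}

# Hypothesis: "a suit of armor conducts electricity"
hyp_vec = [0.8, 0.2, 0.0]
hyp_entities = {"armor", "electricity"}

candidates = top_k_facts(hyp_vec, fact_vecs, k=2)      # dense retrieval step
core = reachable(hyp_entities, candidates, fact_entities)  # entity validation step
print([facts[i] for i in core])
```

In this toy run the dense step keeps the two metal-related facts, and the entity check then retains only the fact that mentions an entity from the hypothesis. A fuller version would replace the toy vectors with encoder outputs and follow entity links across more than one hop.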

Notes

1. OPT-30B has 30B parameters; its accuracy on ARC-Challenge is reported under the zero-shot setting.
2. GPT-3 has 175B parameters; its accuracy on OpenBookQA under the zero-shot setting is taken from [25].

References

Asai, A., Hashimoto, K., Hajishirzi, H., Socher, R., Xiong, C.: Learning to retrieve reasoning paths over wikipedia graph for question answering. arXiv preprint arXiv:1911.10470 (2019)
Brown, T.B., et al.: Language models are few-shot learners (2020)
Chen, D., Fisch, A., Weston, J., Bordes, A.: Reading wikipedia to answer open-domain questions. arXiv preprint arXiv:1704.00051 (2017)
Clark, P., et al.: Think you have solved question answering? Try arc, the AI2 reasoning challenge. arXiv preprint arXiv:1803.05457 (2018)
Demszky, D., Guu, K., Liang, P.: Transforming question answering datasets into natural language inference datasets. arXiv preprint arXiv:1809.02922 (2018)
Dhingra, B., Jin, Q., Yang, Z., Cohen, W.W., Salakhutdinov, R.: Neural models for reasoning over multiple mentions using coreference. arXiv preprint arXiv:1804.05922 (2018)
Feldman, Y., El-Yaniv, R.: Multi-hop paragraph retrieval for open-domain question answering. arXiv preprint arXiv:1906.06606 (2019)
Izacard, G., Grave, E.: Distilling knowledge from reader to retriever for question answering. arXiv preprint arXiv:2012.04584 (2020)
Izacard, G., Grave, E.: Leveraging passage retrieval with generative models for open domain question answering. arXiv preprint arXiv:2007.01282 (2020)
Jansen, P.A., Wainwright, E., Marmorstein, S., Morrison, C.T.: Worldtree: a corpus of explanation graphs for elementary science questions supporting multi-hop inference. arXiv preprint arXiv:1802.03052 (2018)
Wei, J., et al.: Chain of thought prompting elicits reasoning in large language models. arXiv, abs/2201.11903 (2022)
Jiang, Y., Joshi, N., Chen, Y.C., Bansal, M.: Explore, propose, and assemble: an interpretable model for multi-hop reading comprehension. arXiv preprint arXiv:1906.05210 (2019)
Karpukhin, V., et al.: Dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2004.04906 (2020)
Kundu, S., Khot, T., Sabharwal, A., Clark, P.: Exploiting explicit paths for multi-hop reading comprehension. arXiv preprint arXiv:1811.01127 (2018)
Lan, Y., Jiang, J.: Query graph generation for answering multi-hop complex questions from knowledge bases. Association for Computational Linguistics (2020)
Lin, B.Y., Sun, H., Dhingra, B., Zaheer, M., Ren, X., Cohen, W.W.: Differentiable open-ended commonsense reasoning. arXiv preprint arXiv:2010.14439 (2020)
Mihaylov, T., Clark, P., Khot, T., Sabharwal, A.: Can a suit of armor conduct electricity? A new dataset for open book question answering. arXiv preprint arXiv:1809.02789 (2018)
Pan, X., Yao, W., Zhang, H., Yu, D., Yu, D., Chen, J.: Knowledge-in-context: towards knowledgeable semi-parametric language models. In: The Eleventh International Conference on Learning Representations (2023)
Qi, P., Lin, X., Mehr, L., Wang, Z., Manning, C.D.: Answering complex open-domain questions through iterative query generation. arXiv preprint arXiv:1910.07000 (2019)
Robertson, S., Zaragoza, H., et al.: The probabilistic relevance framework: BM25 and beyond. Found. Trends® Inf. Retrieval 3(4), 333–389 (2009)
Sun, H., Bedrax-Weiss, T., Cohen, W.W.: Pullnet: open domain question answering with iterative retrieval on knowledge bases and text. arXiv preprint arXiv:1904.09537 (2019)
Sun, K., Yu, D., Yu, D., Cardie, C.: Improving machine reading comprehension with general reading strategies. arXiv preprint arXiv:1810.13441 (2018)
Kojima, T., Gu, S.S., Reid, M., Matsuo, Y., Iwasawa, Y.: Large language models are zero-shot reasoners. arXiv, abs/2205.11916 (2022)
Thayaparan, M., Valentino, M., Freitas, A.: Explanationlp: abductive reasoning for explainable science question answering. arXiv preprint arXiv:2010.13128 (2020)
Touvron, H., et al.: Llama: open and efficient foundation language models (2023)
Valentino, M., Thayaparan, M., Freitas, A.: Case-based abductive natural language inference. arXiv e-prints pp. arXiv-2009 (2020)
Xie, Z., Thiem, S., Martin, J., Wainwright, E., Marmorstein, S., Jansen, P.: Worldtree V2: a corpus of science-domain structured explanations and inference patterns supporting multi-hop inference. In: Proceedings of the 12th Language Resources and Evaluation Conference, pp. 5456–5473 (2020)
Yadav, V., Bethard, S., Surdeanu, M.: Quick and (not so) dirty: unsupervised selection of justification sentences for multi-hop question answering. arXiv preprint arXiv:1911.07176 (2019)
Yang, Z., et al.: Hotpotqa: a dataset for diverse, explainable multi-hop question answering. arXiv preprint arXiv:1809.09600 (2018)
Zhang, S., et al.: OPT: open pre-trained transformer language models. arXiv preprint arXiv:2205.01068 (2022)
Zhou, Z., Valentino, M., Landers, D., Freitas, A.: Encoding explanatory knowledge for zero-shot science question answering. arXiv preprint arXiv:2105.05737 (2021)
Jansen, P., Balasubramanian, N., Surdeanu, M., Clark, P.: What’s in an explanation? characterizing knowledge and inference requirements for elementary science exams. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 2956–2965 (2016)

Acknowledgments

This work is supported by the National Natural Science Foundation of China (62172352), the Central leading local science and Technology Development Fund Project (No. 226Z0305G), Project of Hebei Key Laboratory of Software Engineering (22567637H), the Natural Science Foundation of Hebei Province (F2022203028) and Program for Top 100 Innovative Talents in Colleges and Universities of Hebei Province (CXZZSS2023038).

Author information

Authors and Affiliations

School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
Qin Wang, Jianzhou Feng, Ganlin Xu & Lei Huang

Corresponding author

Correspondence to Jianzhou Feng.

Editor information

Editors and Affiliations

Central South University, Changsha, China
Biao Luo
Chinese Academy of Sciences, Beijing, China
Long Cheng
Zhejiang University, Hangzhou, China
Zheng-Guang Wu
Guangdong University of Technology, Guangzhou, China
Hongyi Li
UNSW Sydney, Sydney, NSW, Australia
Chaojie Li

About this paper

Cite this paper

Wang, Q., Feng, J., Xu, G., Huang, L. (2024). Two-Phase Semantic Retrieval for Explainable Multi-Hop Question Answering. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14448. Springer, Singapore. https://doi.org/10.1007/978-981-99-8082-6_35

DOI: https://doi.org/10.1007/978-981-99-8082-6_35
Published: 15 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8081-9
Online ISBN: 978-981-99-8082-6
eBook Packages: Computer Science, Computer Science (R0)
