-
Chapter and Conference Paper
Towards Automated End-to-End Health Misinformation Free Search with a Large Language Model
In the information age, health misinformation remains a notable challenge to public welfare. Integral to addressing this issue is the development of search systems adept at identifying and filtering out mislea...
-
Chapter and Conference Paper
PyGaggle: A Gaggle of Resources for Open-Domain Question Answering
Text retrieval using dense–sparse hybrids has been gaining popularity because of their effectiveness. Improvements to both sparse and dense models have also been noted, in the context of open-domain question a...
-
Chapter and Conference Paper
Answer Retrieval for Math Questions Using Structural and Dense Retrieval
Answer retrieval for math questions is a challenging task due to the complex and structured nature of mathematical expressions. In this paper, we combine a structure retriever and a domain-adapted ColBERT retr...
-
Chapter and Conference Paper
Pre-processing Matters! Improved Wikipedia Corpora for Open-Domain Question Answering
One of the contributions of the landmark Dense Passage Retriever (DPR) work is the curation of a corpus of passages generated from Wikipedia articles that have been segmented into non-overlapping passages of 1...
-
Chapter and Conference Paper
Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking
While much recent work has demonstrated that hard negative mining can be used to train better bi-encoder models, few have considered it in the context of cross-encoders, which are key ingredients in modern re...
-
Chapter
Setting the Stage
This section begins by more formally characterizing the text ranking problem, explicitly enumerating our assumptions about characteristics of the input and output, and more precisely circumscribing the scope o...
-
Chapter
Refining Query and Document Representations
The vocabulary mismatch problem [Furnas et al., 1987]—where searchers and the authors of the texts to be searched use different words to describe the same concepts—was introduced in Section 1.2.2 as a core pro...
-
Chapter
Future Directions and Conclusions
It is quite remarkable that BERT debuted in October 2018, only around three years ago. Taking a step back and reflecting, the field has seen an incredible amount of progress in a short amount of time. As we ha...
-
Chapter and Conference Paper
Another Look at DPR: Reproduction of Training and Replication of Retrieval
Text retrieval using learned dense representations has recently emerged as a promising alternative to “traditional” text retrieval using sparse bag-of-words representations. One foundational work that has garn...
-
Chapter
Introduction
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query for a particular task. The most common formulation of text ranking is search, where the search en...
-
Chapter
Multi-Stage Architectures for Reranking
The simplest and most straightforward formulation of text ranking is to convert the task into a text classification problem, and then sort the texts to be ranked based on the probability that each item belongs...
-
Chapter
Learned Dense Representations for Ranking
Arguably, the single biggest benefit brought about by modern deep learning techniques to text ranking is the move away from sparse signals, mostly limited to exact matches, to continuous dense representations ...
-
Chapter and Conference Paper
Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study
Pseudo-Relevance Feedback (PRF) utilises the relevance signals from the top-k passages from the first round of retrieval to perform a second round of retrieval aiming to improve search effectiveness. A recent res...
-
Chapter and Conference Paper
Comparing Score Aggregation Approaches for Document Retrieval with Pretrained Transformers
While BERT has been shown to be effective for passage retrieval, its maximum input length limitation poses a challenge when applying the model to document retrieval. In this work, we reproduce three passage sc...
-
Chapter and Conference Paper
From MAXSCORE to Block-Max Wand: The Story of How Lucene Significantly Improved Query Evaluation Performance
The latest major release of Lucene (version 8) in March 2019 incorporates block-max indexes and exploits the block-max variant of Wand for query evaluation, which are innovations that originated from academia. Th...
-
Chapter and Conference Paper
Which BM25 Do You Mean? A Large-Scale Reproducibility Study of Scoring Variants
When researchers speak of BM25, it is not entirely clear which variant they mean, since many tweaks to Robertson et al.’s original formulation have been proposed. When practitioners speak of BM25, they most li...
-
Chapter and Conference Paper
Reproducibility is a Process, Not an Achievement: The Replicability of IR Reproducibility Experiments
This paper espouses a view of reproducibility in the computational sciences as a process and not just a point-in-time “achievement”. As a concrete case study, we revisit the Open-Source IR Reproducibility Challen...
-
Chapter and Conference Paper
Simple Techniques for Cross-Collection Relevance Feedback
We tackle the problem of transferring relevance judgments across document collections for specific information needs by reproducing and generalizing the work of Grossman and Cormack from the TREC 2017 Common C...
-
Chapter and Conference Paper
Reproducing and Generalizing Semantic Term Matching in Axiomatic Information Retrieval
In the framework of axiomatic information retrieval, the semantic term matching technique proposed by Fang and Zhai in SIGIR 2006 has been shown to be effective in addressing the vocabulary mismatch problem, w...
-
Chapter
Comparative Assessment of Alignment Algorithms for NGS Data: Features, Considerations, Implementations, and Future
Due to the nature of massively parallel sequencing use of shorter reads, the algorithms developed for alignment have been crucial to the widespread adoption of Next-Generation Sequencing (NGS). There has been gre...