-
Chapter and Conference Paper
COILcr: Efficient Semantic Matching in Contextualized Exact Match Retrieval
Lexical exact match systems that use inverted lists are a fundamental text retrieval architecture. A recent advance in neural IR, COIL, extends this approach with contextualized inverted lists from a deep languag...
-
Chapter and Conference Paper
Rethink Training of BERT Rerankers in Multi-stage Retrieval Pipeline
Pre-trained deep language models (LM) have advanced the state-of-the-art of text retrieval. Rerankers fine-tuned from deep LM estimate candidate relevance based on rich contextualized matching signals. Meanwh...
-
Chapter and Conference Paper
PGT: Pseudo Relevance Feedback Using a Graph-Based Transformer
Most research on pseudo relevance feedback (PRF) has been done in vector space and probabilistic retrieval models. This paper shows that Transformer-based rerankers can also benefit from the extra context that...
-
Chapter and Conference Paper
Assessing the Benefits of Model Ensembles in Neural Re-ranking for Passage Retrieval
Our work aimed at experimentally assessing the benefits of model ensembling within the context of neural methods for passage re-ranking. Starting from relatively standard neural models, we use a previous techn...
-
Chapter and Conference Paper
Complement Lexical Retrieval Model with Semantic Residual Embeddings
This paper presents CLEAR, a retrieval model that seeks to complement classical lexical exact-match models such as BM25 with semantic matching signals from a neural embedding matching model. CLEAR explicitly train...
-
Chapter and Conference Paper
Rethinking Query Expansion for BERT Reranking
Recent studies have shown promising results from using BERT for Information Retrieval, owing to its advantages in understanding the text content of documents and queries. Compared to short keyword queries, higher a...
-
Chapter and Conference Paper
Inverted List Caching for Topical Index Shards
Selective search is a distributed retrieval architecture that intentionally creates skewed postings and access patterns. This work shows that the well-known QtfDf inverted list caching algorithm is as effective w...
-
Article
Efficient distributed selective search
Simulation and analysis have shown that selective search can reduce the cost of large-scale distributed information retrieval. By partitioning the collection into small topical shards, and then using a resource ...
-
Chapter and Conference Paper
Jitter Search: A News-Based Real-Time Twitter Search Interface
In this demo we show how we can enhance real-time microblog search by monitoring news sources on Twitter. We improve retrieval through query expansion using pseudo-relevance feedback. However, instead of doing...
-
Chapter and Conference Paper
Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge
The Open-Source IR Reproducibility Challenge brought together developers of open-source search engines to provide reproducible baselines of their systems in a common environment on Amazon EC2. The product is a...
-
Chapter and Conference Paper
Does Selective Search Benefit from WAND Optimization?
Selective search is a distributed retrieval technique that reduces the computational cost of large-scale information retrieval. By partitioning the collection into topical shards, and using a resource selectio...
-
Chapter and Conference Paper
Exploratory Learning
In multiclass semi-supervised learning (SSL), it is sometimes the case that the number of classes present in the data is not known, and hence no labeled examples are provided for some classes. In this paper we...
-
Chapter and Conference Paper
A Methodology for Evaluating Aggregated Search Results
Aggregated search is the task of incorporating results from different specialized search services, or verticals, into Web search results. While most prior work focuses on deciding which verticals to present, the ...
-
Article
On the number of terms used in automatic query expansion
This paper investigates the number of expansion terms to use in automatic query expansion by examining the behavior of eight retrieval systems participating in the NRRC Reliable Information Access Workshop. Th...
-
Article
Measuring incremental changes in word knowledge: Experimental validation and implications for learning and assessment
The goal of this study was to test a new technique for assessing vocabulary development. This technique is based on an algorithm for scoring the accuracy of word definitions using a continuous scale (Collins-T...
-
Article
An effective and efficient results merging strategy for multilingual information retrieval in federated search environments
Multilingual information retrieval is generally understood to mean the retrieval of relevant information in multiple target languages in response to a user query in a single source language. In a multilingual ...
-
Chapter and Conference Paper
Word Sense Disambiguation for Vocabulary Learning
Words with multiple meanings are a phenomenon inherent to any natural language. In this work, we study the effects of such lexical ambiguities on second language vocabulary learning. We demonstrate that machin...
-
Article
Full-text federated search of text-based digital libraries in peer-to-peer networks
Peer-to-peer (P2P) networks integrate autonomous computing resources without requiring a central coordinating authority, which makes them a potentially robust and scalable model for providing federated search ...
-
Chapter and Conference Paper
CLEF 2005: Multilingual Retrieval by Combining Multiple Multilingual Ranked Lists
We participated in two tasks: Multi-8 two-years-on retrieval and Multi-8 results merging. For the Multi-8 two-years-on retrieval work, algorithms are proposed to combine simple multilingual ranked lists into a...
-
Chapter and Conference Paper
Parameter Estimation for a Simple Hierarchical Generative Model for XML Retrieval
This paper explores the possibility of using a modified Expectation-Maximization algorithm to estimate parameters for a simple hierarchical generative model for XML retrieval. The generative model for an XML e...