  1. Chapter and Conference Paper

    Towards Automated End-to-End Health Misinformation Free Search with a Large Language Model

    In the information age, health misinformation remains a notable challenge to public welfare. Integral to addressing this issue is the development of search systems adept at identifying and filtering out mislea...

    Ronak Pradeep, Jimmy Lin in Advances in Information Retrieval (2024)

  2. Chapter and Conference Paper

    PyGaggle: A Gaggle of Resources for Open-Domain Question Answering

    Text retrieval using dense–sparse hybrids has been gaining popularity because of their effectiveness. Improvements to both sparse and dense models have also been noted, in the context of open-domain question a...

    Ronak Pradeep, Haonan Chen, Lingwei Gu in Advances in Information Retrieval (2023)

  3. Chapter and Conference Paper

    Answer Retrieval for Math Questions Using Structural and Dense Retrieval

    Answer retrieval for math questions is a challenging task due to the complex and structured nature of mathematical expressions. In this paper, we combine a structure retriever and a domain-adapted ColBERT retr...

    Wei Zhong, Yuqing Xie, Jimmy Lin in Experimental IR Meets Multilinguality, Mul… (2023)

  4. Chapter and Conference Paper

    Pre-processing Matters! Improved Wikipedia Corpora for Open-Domain Question Answering

    One of the contributions of the landmark Dense Passage Retriever (DPR) work is the curation of a corpus of passages generated from Wikipedia articles that have been segmented into non-overlapping passages of 1...

    Manveer Singh Tamber, Ronak Pradeep, Jimmy Lin in Advances in Information Retrieval (2023)

  5. Chapter and Conference Paper

    Squeezing Water from a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking

    While much recent work has demonstrated that hard negative mining can be used to train better bi-encoder models, few have considered it in the context of cross-encoders, which are key ingredients in modern re...

    Ronak Pradeep, Yuqi Liu, Xinyu Zhang, Yilin Li in Advances in Information Retrieval (2022)

  6. Chapter and Conference Paper

    Another Look at DPR: Reproduction of Training and Replication of Retrieval

    Text retrieval using learned dense representations has recently emerged as a promising alternative to “traditional” text retrieval using sparse bag-of-words representations. One foundational work that has garn...

    Xueguang Ma, Kai Sun, Ronak Pradeep, Minghan Li in Advances in Information Retrieval (2022)

  7. Chapter and Conference Paper

    Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study

    Pseudo-Relevance Feedback (PRF) utilises the relevance signals from the top-k passages from the first round of retrieval to perform a second round of retrieval aiming to improve search effectiveness. A recent res...

    Hang Li, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma in Advances in Information Retrieval (2022)

  8. Chapter and Conference Paper

    Comparing Score Aggregation Approaches for Document Retrieval with Pretrained Transformers

    While BERT has been shown to be effective for passage retrieval, its maximum input length limitation poses a challenge when applying the model to document retrieval. In this work, we reproduce three passage sc...

    Xinyu Zhang, Andrew Yates, Jimmy Lin in Advances in Information Retrieval (2021)

  9. Chapter and Conference Paper

    From MAXSCORE to Block-Max Wand: The Story of How Lucene Significantly Improved Query Evaluation Performance

    The latest major release of Lucene (version 8) in March 2019 incorporates block-max indexes and exploits the block-max variant of Wand for query evaluation, which are innovations that originated from academia. Th...

    Adrien Grand, Robert Muir, Jim Ferenczi, Jimmy Lin in Advances in Information Retrieval (2020)

  10. Chapter and Conference Paper

    Which BM25 Do You Mean? A Large-Scale Reproducibility Study of Scoring Variants

    When researchers speak of BM25, it is not entirely clear which variant they mean, since many tweaks to Robertson et al.’s original formulation have been proposed. When practitioners speak of BM25, they most li...

    Chris Kamphuis, Arjen P. de Vries, Leonid Boytsov in Advances in Information Retrieval (2020)

  11. Chapter and Conference Paper

    Reproducibility is a Process, Not an Achievement: The Replicability of IR Reproducibility Experiments

    This paper espouses a view of reproducibility in the computational sciences as a process and not just a point-in-time “achievement”. As a concrete case study, we revisit the Open-Source IR Reproducibility Challen...

    Jimmy Lin, Qian Zhang in Advances in Information Retrieval (2020)

  12. Chapter and Conference Paper

    Simple Techniques for Cross-Collection Relevance Feedback

    We tackle the problem of transferring relevance judgments across document collections for specific information needs by reproducing and generalizing the work of Grossman and Cormack from the TREC 2017 Common C...

    Ruifan Yu, Yuhao Xie, Jimmy Lin in Advances in Information Retrieval (2019)

  13. Chapter and Conference Paper

    Reproducing and Generalizing Semantic Term Matching in Axiomatic Information Retrieval

    In the framework of axiomatic information retrieval, the semantic term matching technique proposed by Fang and Zhai in SIGIR 2006 has been shown to be effective in addressing the vocabulary mismatch problem, w...

    Peilin Yang, Jimmy Lin in Advances in Information Retrieval (2019)

  14. Chapter and Conference Paper

    Compressing and Decoding Term Statistics Time Series

    There is growing recognition that temporality plays an important role in information retrieval, particularly for timestamped document collections such as tweets. This paper examines the problem of compressing ...

    Jinfeng Rao, Xing Niu, Jimmy Lin in Advances in Information Retrieval (2016)

  15. Chapter and Conference Paper

    Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge

    The Open-Source IR Reproducibility Challenge brought together developers of open-source search engines to provide reproducible baselines of their systems in a common environment on Amazon EC2. The product is a...

    Jimmy Lin, Matt Crane, Andrew Trotman, Jamie Callan in Advances in Information Retrieval (2016)

  16. Chapter and Conference Paper

    Reproducible Experiments on Lexical and Temporal Feedback for Tweet Search

    “Evaluation as a service” (EaaS) is a new methodology for community-wide evaluations where an API provides the only point of access to the collection for completing the evaluation task. Two important advantage...

    Jinfeng Rao, Jimmy Lin, Miles Efron in Advances in Information Retrieval (2015)

  17. Chapter and Conference Paper

    The Impact of Future Term Statistics in Real-Time Tweet Search

    In the real-time tweet search task operationalized in the TREC Microblog evaluations, a topic consists of a query Q and a time t, modeling the task where the user wishes to see the most recent but relevant tweets...

    Yulu Wang, Jimmy Lin in Advances in Information Retrieval (2014)

  18. Chapter and Conference Paper

    Column Stores as an IR Prototyping Tool

    We make the suggestion that instead of implementing custom index structures and query evaluation algorithms, IR researchers should simply store document representations in a column-oriented relational database...

    Hannes Mühleisen, Thaer Samar, Jimmy Lin in Advances in Information Retrieval (2014)

  19. Chapter and Conference Paper

    10 Bit 1.5b/Stage Pipeline ADC Design for Video Application

    This paper proposes a design of a 10-bit fully differential pipeline analog-to-digital converter (ADC). The main component of this ADC is the sample and hold (S/H) circuit and eight stages of 1.5 bit sub-ADC a...

    Chin-Fa Hsieh, Chun-Sheng Chen, Jimmy Lin in Proceedings of the 2nd International Confe… (2014)

  20. Chapter and Conference Paper

    Training Efficient Tree-Based Models for Document Ranking

    Gradient-boosted regression trees (GBRTs) have proven to be an effective solution to the learning-to-rank problem. This work proposes and evaluates techniques for training GBRTs that have efficient runtime charac...

    Nima Asadi, Jimmy Lin in Advances in Information Retrieval (2013)
