Search Results - Springer

Chapter and Conference Paper

BLUEX: A Benchmark Based on Brazilian Leading Universities Entrance eXams

One common trend in recent studies of language models (LMs) is the use of standardized tests for evaluation. However, despite Portuguese being the fifth most spoken language worldwide, few such evaluations have been con...

Thales Sales Almeida, Thiago Laitz, Giovana K. Bonás… in Intelligent Systems (2023)

Chapter and Conference Paper

Visconde: Multi-document QA with GPT-3 and Neural Reranking

This paper proposes a question-answering system that can answer questions whose supporting evidence is spread over multiple (potentially long) documents. The system, called Visconde, uses a three-step pipeline...

Jayr Pereira, Robson Fidalgo, Roberto Lotufo… in Advances in Information Retrieval (2023)

Chapter and Conference Paper

Exploring Text Decoding Methods for Portuguese Legal Text Generation

In recent years, there has been considerable growth in the volume of legal proceedings in Brazil. In this context, there is a lot of potential in using recent advances in Natural Language Processing to automat...

Kenzo Sakiyama, Raphael Montanari, Roseval Malaquias Junior… in Intelligent Systems (2023)

Chapter and Conference Paper

Sabiá: Portuguese Large Language Models

As the capabilities of language models continue to advance, it is conceivable that the “one-size-fits-all” model will remain the main paradigm. For instance, given the vast number of languages worldwide, many o...

Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira in Intelligent Systems (2023)

Chapter and Conference Paper

Sequence-to-Sequence Models for Extracting Information from Registration and Legal Documents

A typical information extraction pipeline consists of token- or span-level classification models coupled with a series of pre- and post-processing scripts. In a production pipeline, requirements often change, ...

Ramon Pires, Fábio C. de Souza, Guilherme Rosa… in Document Analysis Systems (2022)

Chapter

Setting the Stage

This section begins by more formally characterizing the text ranking problem, explicitly enumerating our assumptions about characteristics of the input and output, and more precisely circumscribing the scope o...

Jimmy Lin, Rodrigo Nogueira, Andrew Yates in Pretrained Transformers for Text Ranking (2022)

Chapter

Refining Query and Document Representations

The vocabulary mismatch problem [Furnas et al., 1987]—where searchers and the authors of the texts to be searched use different words to describe the same concepts—was introduced in Section 1.2.2 as a core pro...

Jimmy Lin, Rodrigo Nogueira, Andrew Yates in Pretrained Transformers for Text Ranking (2022)

Chapter

Future Directions and Conclusions

It is quite remarkable that BERT debuted in October 2018, only around three years ago. Taking a step back and reflecting, the field has seen an incredible amount of progress in a short amount of time. As we ha...

Jimmy Lin, Rodrigo Nogueira, Andrew Yates in Pretrained Transformers for Text Ranking (2022)

Chapter

Introduction

The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query for a particular task. The most common formulation of text ranking is search, where the search en...

Jimmy Lin, Rodrigo Nogueira, Andrew Yates in Pretrained Transformers for Text Ranking (2022)

Chapter

Multi-Stage Architectures for Reranking

The simplest and most straightforward formulation of text ranking is to convert the task into a text classification problem, and then sort the texts to be ranked based on the probability that each item belongs...

Jimmy Lin, Rodrigo Nogueira, Andrew Yates in Pretrained Transformers for Text Ranking (2022)

Chapter

Learned Dense Representations for Ranking

Arguably, the single biggest benefit brought about by modern deep learning techniques to text ranking is the move away from sparse signals, mostly limited to exact matches, to continuous dense representations ...

Jimmy Lin, Rodrigo Nogueira, Andrew Yates in Pretrained Transformers for Text Ranking (2022)

Book

Pretrained Transformers for Text Ranking: BERT and Beyond

Jimmy Lin, Rodrigo Nogueira… in Synthesis Lectures on Human Language Technologies (2022)

Article

Navigation-based candidate expansion and pretrained language models for citation recommendation

Citation recommendation systems for the scientific literature, to help authors find papers that should be cited, have the potential to speed up discoveries and uncover new routes for scientific exploration. We...

Rodrigo Nogueira, Zhiying Jiang, Kyunghyun Cho, Jimmy Lin in Scientometrics (2020)

Chapter

EpiRL: A Reinforcement Learning Agent to Facilitate Epistasis Detection

Epistasis (gene-gene interaction) is crucial to predicting genetic disease. Our work tackles the computational challenges faced by previous works in epistasis detection by modeling it as a one-step Markov Deci...

Kexin Huang, Rodrigo Nogueira in Precision Health and Medicine (2020)

Chapter and Conference Paper

BERTimbau: Pretrained BERT Models for Brazilian Portuguese

Recent advances in language representation using neural networks have made it viable to transfer the learned internal states of large pretrained language models (LMs) to downstream natural language processing ...

Fábio Souza, Rodrigo Nogueira, Roberto Lotufo in Intelligent Systems (2020)

15 Result(s)

