Search Results - Springer

Article

Open Access

Deep learning with sentence embeddings pre-trained on biomedical corpora improves the performance of finding similar sentences in electronic medical records

Capturing sentence semantics plays a vital role in a range of text mining applications. Despite continuous efforts on the development of related datasets and models in the general domain, both datasets and mod...

Qingyu Chen, Jingcheng Du, Sun Kim… in BMC Medical Informatics and Decision Making (2020)

Article

Open Access

Discovering themes in biomedical literature using a projection-based algorithm

The need to organize any large document collection in a manner that facilitates human comprehension has become crucial with the increasing volume of information available. Two common approaches to provide a br...

Lana Yeganova, Sun Kim, Grigory Balasanov, W. John Wilbur in BMC Bioinformatics (2018)

Article

Open Access

PubMed Phrases, an open set of coherent phrases for searching biomedical literature

In biomedicine, key concepts are often expressed by multiple words (e.g., ‘zinc finger protein’). Previous work has shown treating a sequence of words as a meaningful unit, where applicable, is not only import...

Sun Kim, Lana Yeganova, Donald C. Comeau, W. John Wilbur, Zhiyong Lu in Scientific Data (2018)

Article

Open Access

Optimizing graph-based patterns to extract biomedical events from the literature

We participated in the BioNLP 2013 shared tasks on event extraction. Our extraction method is based on the search for an approximate subgraph isomorphism between key context dependencies of events and graphs o...

Haibin Liu, Karin Verspoor, Donald C Comeau, Andrew D MacKinlay… in BMC Bioinformatics (2015)

Article

Open Access

Identifying named entities from PubMed® for enriching semantic categories

Controlled vocabularies such as the Unified Medical Language System (UMLS®) and Medical Subject Headings (MeSH®) are widely used for biomedical natural language processing (NLP) tasks. However, the standard te...

Sun Kim, Zhiyong Lu, W John Wilbur in BMC Bioinformatics (2015)

Article

Open Access

Finding biomedical categories in Medline®

There are several humanly defined ontologies relevant to Medline. However, Medline is a fast growing collection of biomedical documents which creates difficulties in updating and expanding these humanly define...

Lana Yeganova, Won Kim, Donald C Comeau, W John Wilbur in Journal of Biomedical Semantics (2012)

Article

Open Access

Thematic clustering of text documents using an EM-based approach

Clustering textual contents is an important step in mining useful information on the web or other text-based resources. The common task in text clustering is to handle text in a multi-dimensional space, and to...

Sun Kim, W John Wilbur in Journal of Biomedical Semantics (2012)

Article

Open Access

The gene normalization task in BioCreative III

We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 full...

Zhiyong Lu, Hung-Yu Kao, Chih-Hsuan Wei, Minlie Huang, Jingchen Liu… in BMC Bioinformatics (2011)

Article

Open Access

Overview of the BioCreative III Workshop

The overall goal of the BioCreative Workshops is to promote the development of text mining and text processing tools which are useful to the communities of researchers and database curators in the biological s...

Cecilia N Arighi, Zhiyong Lu, Martin Krallinger, Kevin B Cohen… in BMC Bioinformatics (2011)

Article

Open Access

The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text

Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional ...

Martin Krallinger, Miguel Vazquez, Florian Leitner, David Salgado… in BMC Bioinformatics (2011)

Article

Open Access

Classifying protein-protein interaction articles using word and syntactic features

Identifying protein-protein interactions (PPIs) from literature is an important step in mining the function of individual proteins as well as their biological network. Since it is known that PPIs have distinct...

Sun Kim, W John Wilbur in BMC Bioinformatics (2011)

Article

Open Access

Machine learning with naturally labeled data for identifying abbreviation definitions

The rapid growth of biomedical literature requires accurate text analysis and text processing tools. Detecting abbreviations and identifying their definitions is an important component of such tools. Most exis...

Lana Yeganova, Donald C Comeau, W John Wilbur in BMC Bioinformatics (2011)

Article

Open Access

Improving a gold standard: treating human relevance judgments of MEDLINE document pairs

Given prior human judgments of the condition of an object it is possible to use these judgments to make a maximal likelihood estimate of what future human judgments of the condition of that object will be. How...

W John Wilbur, Won Kim in BMC Bioinformatics (2011)

Article

Open Access

Finding related sentence pairs in MEDLINE

We explore the feasibility of automatically identifying sentences in different MEDLINE abstracts that are related in meaning. We compared traditional vector space models with machine learning methods for detec...

Larry H. Smith, W. John Wilbur in Information Retrieval (2010)

Article

Open Access

The ineffectiveness of within-document term frequency in text classification

For the purposes of classification it is common to represent a document as a bag of words. Such a representation consists of the individual terms making up the document together with the number of times each t...

W. John Wilbur, Won Kim in Information Retrieval (2009)

Article

Open Access

Modeling actions of PubMed users with n-gram language models

Transaction logs from online search engines are valuable for two reasons: First, they provide insight into human information-seeking behavior. Second, log data can be used to train user models, which can then ...

Jimmy Lin, W. John Wilbur in Information Retrieval (2009)

Article

Open Access

Evaluation of query expansion using MeSH in PubMed

This paper investigates the effectiveness of using MeSH® in PubMed through its automatic query expansion process: Automatic Term Mapping (ATM). We run Boolean searches based on a collection of 55 topics and about...

Zhiyong Lu, Won Kim, W. John Wilbur in Information Retrieval (2009)

Article

Open Access

Abbreviation definition identification based on automatic precision estimates

The rapid growth of biomedical literature presents challenges for automatic text processing, and one of the challenges is abbreviation identification. The presence of unrecognized abbreviations in text hinders...

Sunghwan Sohn, Donald C Comeau, Won Kim, W John Wilbur in BMC Bioinformatics (2008)

Article

Open Access

Overview of BioCreative II gene mention recognition

Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A ...

Larry Smith, Lorraine K Tanabe, Rie Johnson nee Ando, Cheng-Ju Kuo… in Genome Biology (2008)

Article

Open Access

PubMed related articles: a probabilistic topic-based model for content similarity

We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document is about a particular topic is computed from ...

Jimmy Lin, W John Wilbur in BMC Bioinformatics (2007)

33 Result(s)
