Search Results
-
Parallel and private generalized suffix tree construction and query on genomic data
Background: Several technological advancements and digitization of healthcare data have provided the scientific community with a large quantity of...
-
SamQL: a structured query language and filtering tool for the SAM/BAM file format
Background: The Sequence Alignment/Map Format Specification (SAM) is one of the most widely adopted file formats in bioinformatics and many researchers...
-
Fast, parallel, and cache-friendly suffix array construction
Purpose: String indexes such as the suffix array (sa) and the closely related longest common prefix (lcp) array are fundamental objects in...
-
gsufsort: constructing suffix arrays, LCP arrays and BWTs for string collections
Background: The construction of a suffix array for a collection of strings is a fundamental task in Bioinformatics and in many other applications that...
-
Prediction of plant secondary metabolic pathways using deep transfer learning
Background: Plant secondary metabolites are highly valued for their applications in pharmaceuticals, nutrition, flavors, and aesthetics. It is of great...
-
Finding maximal exact matches in graphs
Background: We study the problem of finding maximal exact matches (MEMs) between a query string Q and a labeled graph G. MEMs are an important class...
-
MCProj: metacell projection for interpretable and quantitative use of transcriptional atlases
We describe MCProj—an algorithm for analyzing query scRNA-seq data by projections over reference single-cell atlases. We represent the reference as a...
-
Indexing and searching petabase-scale nucleotide resources
Searching vast and rapidly growing nucleotide content in resources, such as runs in the Sequence Read Archive and assemblies for whole-genome shotgun...
-
Pfp-fm: an accelerated FM-index
FM-indexes are crucial data structures in DNA alignment, but searching with them usually takes at least one random access per character in the query...
-
Genome-wide screening reveals the genetic basis of mammalian embryonic eye development
Background: Microphthalmia, anophthalmia, and coloboma (MAC) spectrum disease encompasses a group of eye malformations which play a role in childhood...
-
The flax genome reveals orbitide diversity
Background: Ribosomally-synthesized cyclic peptides are widely found in plants and exhibit useful bioactivities for humans. The identification of...
-
Finding identical sequence repeats in multiple protein sequences: An algorithm
In recent years, several lines of experimental evidence have suggested that amino acid repeats are closely linked to many disease conditions, as they have a...
-
Fast and robust metagenomic sequence comparison through sparse chaining with skani
Sequence comparison tools for metagenome-assembled genomes (MAGs) struggle with high-volume or low-quality data. We present skani (https://github.com/bluenote-1577/skani...
-
Suffix sorting via matching statistics
We introduce a new algorithm for constructing the generalized suffix array of a collection of highly similar strings. As a first step, we construct a...
-
Fulgor: a fast and compact k-mer index for large-scale matching and color queries
The problem of sequence identification or matching—determining the subset of reference sequences from a given collection that are likely to contain a...
-
Efficient privacy-preserving variable-length substring match for genome sequence
The development of a privacy-preserving technology is important for accelerating genome data sharing. This study proposes an algorithm that securely...
-
Intrinsic disorder in PRAME and its role in uveal melanoma
Introduction: The PReferentially expressed Antigen in MElanoma (PRAME) protein has been shown to be an independent biomarker for increased risk of...
-
An optimized FM-index library for nucleotide and amino acid search
Background: Pattern matching is a key step in a variety of biological sequence analysis pipelines. The FM-index is a compressed data structure for...
-
EZCancerTarget: an open-access drug repurposing and data-collection tool to enhance target validation and optimize international research efforts against highly progressive cancers
The expanding body of potential therapeutic targets requires easily accessible, structured, and transparent real-time interpretation of molecular...
-
Integration of fingerprint-based similarity searching and kernel-based partial least squares analysis to predict inhibitory activity against CSK, HER2, JAK1, JAK2, and JAK3
Fingerprint-based similarity searching is an important strategy for virtual screening in drug discovery. In the present study, we carried out a...