Search Results
-
Pan-genome de Bruijn graph using the bidirectional FM-index
Background: Pan-genome graphs are gaining importance in the field of bioinformatics as data structures to represent and jointly analyze multiple...
-
Compression algorithm for colored de Bruijn graphs
A colored de Bruijn graph (also called a set of k-mer sets) is a set of k-mers with every k-mer assigned a set of colors. Colored de Bruijn graphs...
-
Multiplex de Bruijn graphs enable genome assembly from long, high-fidelity reads
Although most existing genome assemblers are based on de Bruijn graphs, the construction of these graphs for large genomes and large k-mer sizes has...
-
Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2
The de Bruijn graph is a key data structure in modern computational genomics, and construction of its compacted variant resides upstream of many...
-
Detecting gene breakpoints in noisy genome sequences using position-annotated colored de-Bruijn graphs
Background: Identifying the locations of gene breakpoints between species of different taxonomic groups can provide useful insights into the underlying...
-
USTAR: Improved Compression of k-mer Sets with Counters Using de Bruijn Graphs
A fundamental operation in computational genomics is to reduce the input sequences to their constituent k-mers. Finding a space-efficient way to...
-
Fast and efficient Rmap assembly using the Bi-labelled de Bruijn graph
Genome wide optical maps are high resolution restriction maps that give a unique numeric representation to a genome. They are produced by assembling...
-
Revisiting the complexity of and algorithms for the graph traversal edit distance and its variants
The graph traversal edit distance (GTED), introduced by Ebrahimpour Boroojeny et al. (2018), is an elegant distance measure defined as the minimum...
-
A tri-tuple coordinate system derived for fast and accurate analysis of the colored de Bruijn graph-based pangenomes
Background: With the rapid development of accurate sequencing and assembly technologies, an increasing number of high-quality chromosome-level and...
-
Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs
Memory consumption of de Bruijn graphs is often prohibitive. Most de Bruijn graph-based assemblers reduce the complexity by compacting paths into...
-
Simplitigs as an efficient and scalable representation of de Bruijn graphs
de Bruijn graphs play an essential role in bioinformatics, yet they lack a universal scalable representation. Here, we introduce simplitigs as a...
-
An External Memory Approach for Large Genome De Novo Assembly
De novo genome assembly of sequenced reads is a fundamental problem in bioinformatics. When there is no reference genome sequence to guide the...
-
Detecting circular RNA from high-throughput sequence data with de Bruijn graph
Background: Circular RNA is a type of non-coding RNA, which has a circular structure. Many circular RNAs are stable and contain exons, but are not...
-
Graph-Based Machine Learning Approaches for Pangenomics
Deciphering the relationship between genotype and phenotype is a crucial yet challenging step in genetic research. Genome-wide association studies...
-
Accurate determination of node and arc multiplicities in de bruijn graphs using conditional random fields
Background: De Bruijn graphs are key data structures for the analysis of next-generation sequencing data. They efficiently represent the overlap...
-
A Classification of de Bruijn Graph Approaches for De Novo Fragment Assembly
Research in bioinformatics has changed rapidly since the advent of next-generation sequencing (NGS). Despite the positive impact on cost reduction,...
-
Comparing methods for constructing and representing human pangenome graphs
Background: As a single reference genome cannot possibly represent all the variation present across human individuals, pangenome graphs have been...
-
Eulertigs: minimum plain text representation of k-mer sets without repetitions in linear time
A fundamental operation in computational genomics is to reduce the input sequences to their constituent k-mers. For maximum performance of downstream...
-
Telomere-to-telomere assembly of diploid chromosomes with Verkko
The Telomere-to-Telomere consortium recently assembled the first truly complete sequence of a human genome. To resolve the most complex repeats, this...
-
Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph
Despite advances in long-read sequencing technologies, constructing a near telomere-to-telomere assembly is still computationally demanding. Here we...