-
Article
Open Access
False gene and chromosome losses in genome assemblies caused by GC content variation and repeats
Many short-read genome assemblies have been found to be incomplete and contain mis-assemblies. The Vertebrate Genomes Project has been producing new reference genome assemblies with an emphasis on being as com...
-
Article
Open Access
Author Correction: Improved reference genome of the arboviral vector Aedes albopictus
-
Article
Open Access
Complete vertebrate mitogenomes reveal widespread repeats and gene duplications
Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly.
-
Article
Open Access
Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies
Recent long-read assemblies often exceed the quality and completeness of available reference genomes, making validation challenging. Here we present Merqury, a novel tool for reference-free assembly evaluation...
-
Article
Open Access
Improved reference genome of the arboviral vector Aedes albopictus
The Asian tiger mosquito Aedes albopictus is globally expanding and has become the main vector for human arboviruses in Europe. With limited antiviral drugs and vaccines available, vector control is the primary a...
-
Article
Open Access
Mash Screen: high-throughput sequence containment estimation for genome discovery
The MinHash algorithm has proven effective for rapidly estimating the resemblance of two genomes or metagenomes. However, this method cannot reliably estimate the containment of a genome within a metagenome. H...
-
Article
Open Access
Assignment of virus and antimicrobial resistance genes to microbial hosts in a complex microbial community by combined long-read assembly and proximity ligation
We describe a method that adds long-read sequencing to a mix of technologies used to assemble a highly complex cattle rumen microbial community, and provide a comparison to short read-based methods. Long-read ...
-
Article
Open Access
RefSeq database growth influences the accuracy of k-mer-based lowest common ancestor species identification
In order to determine the role of the database in taxonomic sequence classification, we examine the influence of the database over time on k-mer-based lowest common ancestor taxonomic classification. We present t...
-
Article
Open Access
Mash: fast genome and metagenome distance estimation using MinHash
Mash extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test, enabling the efficient clustering and search of massive sequence collections. Mas...
-
Article
Open Access
The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes
Whole-genome sequences are now available for many microbial species and clades; however, existing whole-genome alignment methods are limited in their ability to perform sequence comparisons of multiple sequence...
-
Article
Open Access
Reducing assembly complexity of microbial genomes with single-molecule sequencing
The short reads output by first- and second-generation DNA sequencing instruments cannot completely reconstruct microbial chromosomes. Therefore, most genomes have been left unfinished due to the significant r...
-
Article
Open Access
MetAMOS: a modular and open source metagenomic assembly and analysis pipeline
We describe MetAMOS, an open source and modular metagenomic assembly and analysis pipeline. MetAMOS represents an important step towards fully automated metagenomic analysis, starting with next-generation sequ...
-
Article
Irreconcilable differences: divorcing geographic mutation and recombination rates within a global MRSA clone
A growing resource of methicillin-resistant Staphylococcus aureus (MRSA) genomes uncovers intriguing phylogeographic and recombination patterns and highlights challenges in identifying the source of these phenome...
-
Article
Open Access
Genome assembly forensics: finding the elusive mis-assembly
We present the first collection of tools aimed at automated genome assembly validation. This work formalizes several mechanisms for detecting mis-assemblies, and describes their implementation in our automated...
-
Article
Open Access
Hawkeye: an interactive visual analytics tool for genome assemblies
Genome sequencing remains an inexact science, and genome sequences can contain significant errors if they are not carefully examined. Hawkeye is our new visual analytics tool for genome assemblies, designed to...