-
Article
Open Access
Large scale sequence alignment via efficient inference in generative models
Finding alignments between millions of reads and genome sequences is crucial in computational biology. Since the standard alignment algorithm has a large computational cost, heuristics have been developed to s...
-
Article
Open Access
Harvestman: a framework for hierarchical feature learning and selection from whole genome sequencing data
Supervised learning from high-throughput sequencing data presents many challenges. For one, the curse of dimensionality often leads to overfitting as well as issues with scalability. This can bring about inacc...
-
Chapter and Conference Paper
Lower Density Selection Schemes via Small Universal Hitting Sets with Short Remaining Path Length
Universal hitting sets are sets of words that are unavoidable: every long enough sequence is hit by the set (i.e., it contains a word from the set). There is a tight relationship between universal hitting sets...
-
Chapter and Conference Paper
Compact Universal k-mer Hitting Sets
We address the problem of finding a minimum-size set of k-mers that hits L-long sequences. The problem arises in the design of compact hash functions and other data structures for efficient handling of large sequ...
-
Article
Open Access
A new rhesus macaque assembly and annotation for next-generation sequencing analyses
The rhesus macaque (Macaca mulatta) is a key species for advancing biomedical research. Like all draft mammalian genomes, the draft rhesus assembly (rheMac2) has gaps, sequencing errors and misassemblies that hav...
-
Article
Open Access
Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies
The size and complexity of conifer genomes has, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early ca...
-
Article
Open Access
Parsimonious reconstruction of network evolution
Understanding the evolution of biological networks can provide insight into how their modular structure arises and how they are affected by environmental changes. One approach to studying the evolution of thes...
-
Chapter and Conference Paper
Parsimonious Reconstruction of Network Evolution
We consider the problem of reconstructing a maximally parsimonious history of network evolution under models that support gene duplication and loss and independent interaction gain and loss. We introduce a com...
-
Article
Open Access
A whole-genome assembly of the domestic cow, Bos taurus
The genome of the domestic cow, Bos taurus, was sequenced using a mixture of hierarchical and whole-genome shotgun sequencing methods.
-
Chapter and Conference Paper
An Automated Benchmarking Toolset
The drive for performance in parallel computing and the need to evaluate platform upgrades or replacements are major reasons frequent running of benchmark codes has become commonplace for application and platf...