-
Chapter and Conference Paper
Suffix trees in the functional programming paradigm
We explore the design space of implementing suffix tree algorithms in the functional paradigm. We review the linear time and space algorithms of McCreight and Ukkonen. Based on a new terminology of nested suff...
-
Chapter and Conference Paper
Estimating the probability of approximate matches
-
Chapter and Conference Paper
Efficient Implementation of Lazy Suffix Trees
We present an efficient implementation of a write-only top-down construction for suffix trees. Our implementation is based on a new, space-efficient representation of suffix trees which requires only 12 bytes p...
-
Chapter
Space Efficient Linear Time Computation of the Burrows and Wheeler-Transformation
In [4] a universal data compression algorithm (BW-algorithm, for short) is described which achieves compression rates that are close to the best known rates achieved in practice. Due to its simplicity, the alg...
-
Chapter and Conference Paper
Optimal Exact String Matching Based on Suffix Arrays
Using the suffix tree of a string S, decision queries of the type “Is P a substring of S?” can be answered in O(|P|) time and enumeration queries of the type “Where are all z occurrences of P in S?” can be answer...
-
Chapter and Conference Paper
The Enhanced Suffix Array and Its Applications to Genome Analysis
In large-scale applications such as computational genome analysis, the space requirement of the suffix tree is a severe drawback. In this paper, we present a uniform framework that enables us to systematically repl...
-
Article
Comparative genomics of Arabidopsis and maize: prospects and limitations
The completed Arabidopsis genome seems to be of limited value as a model for maize genomics. In addition to the expansion of repetitive sequences in maize and the lack of genomic micro-colinearity, maize-specific...
-
Article (Open Access)
Versatile and open software for comparing large genomes
The newest version of MUMmer easily handles comparisons of large eukaryotic genomes at varying evolutionary distances, as demonstrated by applications to multiple genomes. Two new graphical viewing tools provi...
-
Chapter
A Computational Approach to Search for Non-Coding RNAs in Large Genomic Data
Over the last few years several specialized software tools have been developed, each allowing a certain class of RNAs in sequence data to be found. Here we describe a general tool that allows us to specify many differe...
-
Article (Open Access)
Fast index based algorithms and software for matching position specific scoring matrices
In biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs in nucleotide as well as amino acid sequences. Searching with PSSMs in complete genomes ...
-
Protocol
Visualization of Syntenic Relationships With SynBrowse
Synteny is the preserved order of genes between related species. To detect syntenic regions one usually first applies sequence comparison methods to the genomic sequences of the considered species. Sequence si...
-
Article (Open Access)
Optimising oligonucleotide array design for ChIP-on-chip
-
Article (Open Access)
LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons
Transposable elements are abundant in eukaryotic genomes and it is believed that they have a significant impact on the evolution of gene and chromosome structure. While there are several completed eukaryotic g...
-
Article (Open Access)
Efficient computation of absent words in genomic sequences
Analysis of sequence composition is a routine task in genome research. Organisms are characterized by their base composition, dinucleotide relative abundance, codon usage, and so on. Unique subsequences are ma...
-
Article (Open Access)
A new method to compute K-mer frequencies and its application to annotate large repetitive plant genomes
The challenges of accurate gene prediction and enumeration are further aggravated in large genomes that contain highly repetitive transposable elements (TEs). Yet TEs play a substantial role in genome evolutio...
-
Article (Open Access)
CoCoNUT: an efficient system for the comparison and analysis of genomes
Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and so...
-
Protocol
MetaGenomeThreader: A Software Tool for Predicting Genes in DNA-Sequences of Metagenome Projects
We consider a gene finding method that is specifically designed to work on metagenome sequences. The method can handle short metagenome sequences with in-frame stop codons as well as frame shifts. It delivers ...
-
Article (Open Access)
Selective regain of egfr gene copies in CD44+/CD24-/low breast cancer cellular model MDA-MB-468
Increased transcription of oncogenes like the epidermal growth factor receptor (EGFR) is frequently caused by amplification of the whole gene or at least of regulatory sequences. Aim of this study was to pinpo...
-
Article (Open Access)
Sequencing, annotation, and comparative genome analysis of the gerbil-adapted Helicobacter pylori strain B8
The Mongolian gerbils are a good model to mimic the Helicobacter pylori-associated pathogenesis of the human stomach. In the current study the gerbil-adapted strain B8 was completely sequenced, annotated and comp...
-
Article (Open Access)
Structator: fast index-based search for RNA sequence-structure patterns
The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires to match sequence...