Skip to main content

previous disabled Page of 3
and
  1. Chapter

    Scalable Text Index Construction

    We survey recent advances in scalable text index construction with a focus on practical algorithms in distributed, shared, and external memory.

    Timo Bingmann, Patrick Dinklage, Johannes Fischer in Algorithms for Big Data (2022)

  2. No Access

    Chapter and Conference Paper

    On the Optimisation of the GSACA Suffix Array Construction Algorithm

    The suffix array is arguably one of the most important data structures in sequence analysis and consequently there is a multitude of suffix sorting algorithms. However, to this date the GSACA algorithm introduced...

    Jannik Olbrich, Enno Ohlebusch in String Processing and Information Retrieval (2022)

  3. No Access

    Chapter and Conference Paper

    On the Computation of Longest Previous Non-overlap** Factors

    The f-factorization of a string is similar to the well-known Lempel-Ziv (LZ) factorization, but differs from it in that the factors must be non-overlap**. There are two linear time algorithms that compute the

    Enno Ohlebusch, Pascal Weber in String Processing and Information Retrieval (2019)

  4. No Access

    Chapter and Conference Paper

    Trickier XBWT Tricks

    A trie [11] is one of the best data structures for implementing and searching a dictionary. However, to build the trie structure for larger collections of strings takes up a lot of memory. Since the eXtended Burr...

    Enno Ohlebusch, Stefan Stauß, Uwe Baier in String Processing and Information Retrieval (2018)

  5. Article

    Open Access

    Erratum to: A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

    Timo Beller, Enno Ohlebusch in Algorithms for Molecular Biology (2016)

  6. Article

    Open Access

    A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

    Recently, Marcus et al. (Bioinformatics 30:3476–83, 2014) proposed to use a compressed de Bruijn graph to describe the relationship between the genomes of many individuals/strains of the same or closely related s...

    Timo Beller, Enno Ohlebusch in Algorithms for Molecular Biology (2016)

  7. No Access

    Chapter and Conference Paper

    Parallel Construction of Succinct Representations of Suffix Tree Topologies

    A compressed suffix tree usually consists of three components: a compressed suffix array, a compressed \(\mathsf {LCP}\) -array,...

    Uwe Baier, Timo Beller, Enno Ohlebusch in String Processing and Information Retrieval (2015)

  8. No Access

    Chapter and Conference Paper

    Efficient Construction of a Compressed de Bruijn Graph for Pan-Genome Analysis

    Recently, Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph of maximal exact matches to describe the relationship between the genomes of many individuals/strains of the same or c...

    Timo Beller, Enno Ohlebusch in Combinatorial Pattern Matching (2015)

  9. No Access

    Chapter and Conference Paper

    Alphabet-Independent Algorithms for Finding Context-Sensitive Repeats in Linear Time

    The identification of repetitive sequences (repeats) is an essential component of genome sequence analysis, and there are dozens of algorithms that search for exact or approximate repeats. The notions of maxim...

    Enno Ohlebusch, Timo Beller in String Processing and Information Retrieval (2014)

  10. No Access

    Chapter and Conference Paper

    Space-Efficient Construction of the Burrows-Wheeler Transform

    The Burrows-Wheeler transform (BWT), originally invented for data compression, is nowadays also the core of many self-indexes, which can be used to solve many problems in bioinformatics. However, the memory requi...

    Timo Beller, Maike Zwerger, Simon Gog in String Processing and Information Retrieval (2013)

  11. No Access

    Chapter and Conference Paper

    Space-Efficient Computation of Maximal and Supermaximal Repeats in Genome Sequences

    The identification of repetitive sequences (repeats) is an essential component of genome sequence analysis, and the notions of maximal and supermaximal repeats capture all exact repeats in a genome in a compac...

    Timo Beller, Katharina Berger in String Processing and Information Retrieval (2012)

  12. No Access

    Chapter and Conference Paper

    Computing the Burrows-Wheeler Transform of a String and Its Reverse

    The contribution of this paper is twofold. First, we provide new theoretical insights into the relationship between a string and its reverse: If the Burrows-Wheeler transform (BWT) of a string has been compute...

    Enno Ohlebusch, Timo Beller, Mohamed I. Abouelhoda in Combinatorial Pattern Matching (2012)

  13. No Access

    Article

    Linear Time Algorithms for Generalizations of the Longest Common Substring Problem

    In its simplest form, the longest common substring problem is to find a longest substring common to two or multiple strings. Using (generalized) suffix trees, this problem can be solved in linear time and space. ...

    Michael Arnold, Enno Ohlebusch in Algorithmica (2011)

  14. No Access

    Chapter and Conference Paper

    Lempel-Ziv Factorization Revisited

    For 30 years the Lempel-Ziv factorization of a string has played an important role in data compression, and more recently it was used as the basis of linear time algorithms for the detection of all maximal rep...

    Enno Ohlebusch, Simon Gog in Combinatorial Pattern Matching (2011)

  15. No Access

    Chapter and Conference Paper

    Computing the Longest Common Prefix Array Based on the Burrows-Wheeler Transform

    Many sequence analysis tasks can be accomplished with a suffix array, and several of them additionally need the longest common prefix array. In large scale applications, suffix arrays are being replaced with f...

    Timo Beller, Simon Gog, Enno Ohlebusch in String Processing and Information Retrieval (2011)

  16. No Access

    Chapter and Conference Paper

    Bidirectional Search in a String with Wavelet Trees

    Searching for genes encoding microRNAs (miRNAs) is an important task in genome analysis. Because the secondary structure of miRNA (but not the sequence) is highly conserved, the genes encoding it can be determ...

    Thomas Schnattinger, Enno Ohlebusch, Simon Gog in Combinatorial Pattern Matching (2010)

  17. No Access

    Chapter and Conference Paper

    CST++

    Let A be an array of n elements taken from a totally ordered set. We present a data structure of size 3n + o(n) bits that allows us to answer the following queries on A in constant time, without accessing A: (1) ...

    Enno Ohlebusch, Johannes Fischer, Simon Gog in String Processing and Information Retrieval (2010)

  18. No Access

    Chapter and Conference Paper

    Computing Matching Statistics and Maximal Exact Matches on Compressed Full-Text Indexes

    Exact string matching is a problem that computer programmers face on a regular basis, and full-text indexes like the suffix tree or the suffix array provide fast string search over large texts. In the last dec...

    Enno Ohlebusch, Simon Gog, Adrian Kügel in String Processing and Information Retrieval (2010)

  19. No Access

    Chapter and Conference Paper

    A Compressed Enhanced Suffix Array Supporting Fast String Matching

    Index structures like the suffix tree or the suffix array are of utmost importance in stringology, most notably in exact string matching. In the last decade, research on compressed index structures has flouris...

    Enno Ohlebusch, Simon Gog in String Processing and Information Retrieval (2009)

  20. Article

    Open Access

    A fast algorithm for the multiple genome rearrangement problem with weighted reversals and transpositions

    Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available. However, this phylogenetic reconstruction is a...

    Martin Bader, Mohamed I Abouelhoda, Enno Ohlebusch in BMC Bioinformatics (2008)

previous disabled Page of 3