Compact DFA Representation for Fast Regular Expression Search

  • Conference paper
Algorithm Engineering (WAE 2001)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2141))

Abstract

We present a new technique to encode a deterministic finite automaton (DFA). Based on the specific properties of Glushkov’s nondeterministic finite automaton (NFA) construction algorithm, we are able to encode the DFA using $(m+1)(2^{m+1} + |\Sigma|)$ bits, where $m$ is the number of characters (excluding operator symbols) in the regular expression and $\Sigma$ is the alphabet. This compares favorably against the worst case of $(m+1)\,2^{m+1}\,|\Sigma|$ bits needed by a classical DFA representation and $m(2^{2m+1} + |\Sigma|)$ bits needed by the Wu and Manber approach implemented in Agrep.

Our approach is practical and simple to implement, and it permits searching regular expressions of moderate size (which include most cases of interest) faster than with any previously existing algorithm, as we show experimentally.
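
To give a feel for the three space bounds quoted in the abstract, the following sketch evaluates them for one pattern size. It only does the arithmetic of the formulas and is not the encoding itself. The function names and the sample values (m = 10 pattern characters, an alphabet of 256 symbols) are illustrative assumptions, not taken from the paper.

```python
def compact_glushkov_bits(m: int, sigma: int) -> int:
    """Bound for the compact encoding: (m + 1) * (2^(m+1) + |Sigma|)."""
    return (m + 1) * (2 ** (m + 1) + sigma)


def classical_dfa_bits(m: int, sigma: int) -> int:
    """Worst case for a classical DFA table: (m + 1) * 2^(m+1) * |Sigma|."""
    return (m + 1) * 2 ** (m + 1) * sigma


def wu_manber_bits(m: int, sigma: int) -> int:
    """Bound for the Wu and Manber (Agrep) approach: m * (2^(2m+1) + |Sigma|)."""
    return m * (2 ** (2 * m + 1) + sigma)


if __name__ == "__main__":
    m, sigma = 10, 256  # hypothetical pattern size and alphabet size
    for name, fn in [
        ("compact Glushkov", compact_glushkov_bits),
        ("classical DFA", classical_dfa_bits),
        ("Wu-Manber", wu_manber_bits),
    ]:
        bits = fn(m, sigma)
        print(f"{name:>17}: {bits:>10} bits ({bits / 8 / 1024:.1f} KiB)")
```

Under these assumed values the compact bound is 25,344 bits (about 3 KiB), against 5,767,168 bits for the classical worst case and 20,974,080 bits for the Wu and Manber bound. The difference comes from the alphabet size being added to the exponential term rather than multiplied by it.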

Partially supported by ECOS-Sud project C99E04 and, for the first author, Fondecyt grant 1-990627.

References

  1. A. Aho, R. Sethi, and J. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley, 1985.

  2. R. Baeza-Yates and G. Gonnet. A new approach to text searching. CACM, 35(10):74–82, October 1992.

  3. G. Berry and R. Sethi. From regular expression to deterministic automata. Theoretical Computer Science, 48(1):117–126, 1986.

  4. A. Brüggemann-Klein. Regular expressions into finite automata. Theoretical Computer Science, 120(2):197–213, November 1993.

  5. C.-H. Chang and R. Paige. From regular expression to DFA’s using NFA’s. In Proceedings of the 3rd Annual Symposium on Combinatorial Pattern Matching, LNCS v. 664, pages 90–110, 1992.

  6. V.-M. Glushkov. The abstract theory of automata. Russian Mathematical Surveys, 16:1–53, 1961.

  7. E. Myers. A four-Russian algorithm for regular expression pattern matching. J. of the ACM, 39(2):430–448, 1992.

  8. G. Navarro and M. Raffinot. Fast regular expression search. In Proceedings of the 3rd Workshop on Algorithm Engineering, LNCS v. 1668, pages 199–213, 1999.

  9. G. Navarro and M. Raffinot. Fast and flexible string matching by combining bit-parallelism and suffix automata. ACM Journal of Experimental Algorithmics (JEA), 5(4), 2000. http://www.jea.acm.org/2000/NavarroString.

  10. K. Thompson. Regular expression search algorithm. CACM, 11(6):419–422, 1968.

  11. B. Watson. Taxonomies and Toolkits of Regular Language Algorithms. PhD dissertation, Eindhoven University of Technology, The Netherlands, 1995.

  12. S. Wu and U. Manber. Agrep - a fast approximate pattern-matching tool. In Proc. of USENIX Technical Conference, pages 153–162, 1992.

  13. S. Wu and U. Manber. Fast text searching allowing errors. CACM, 35(10):83–91, October 1992.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Navarro, G., Raffinot, M. (2001). Compact DFA Representation for Fast Regular Expression Search. In: Brodal, G.S., Frigioni, D., Marchetti-Spaccamela, A. (eds) Algorithm Engineering. WAE 2001. Lecture Notes in Computer Science, vol 2141. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44688-5_1

  • DOI: https://doi.org/10.1007/3-540-44688-5_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42500-7

  • Online ISBN: 978-3-540-44688-0

  • eBook Packages: Springer Book Archive
