Abstract
To an RNA pseudoknot structure is naturally associated a topological surface, which has its associated genus, and structures can thus be classified by the genus. Based on earlier work of Harer–Zagier, we compute the generating function \(\mathbf{D}_{g,\sigma }(z)=\sum _{n}\mathbf{d}_{g,\sigma }(n)z^n\) for the number \(\mathbf{d}_{g,\sigma }(n)\) of those structures of fixed genus \(g\) and minimum stack size \(\sigma \) with \(n\) nucleotides so that no two consecutive nucleotides are basepaired and show that \(\mathbf{D}_{g,\sigma }(z)\) is algebraic. In particular, we prove that \(\mathbf{d}_{g,2}(n)\sim k_g\,n^{3(g-\frac{1}{2})} \gamma _2^n\), where \(\gamma _2\approx 1.9685\). Thus, for stack size at least two, the genus only enters through the sub-exponential factor, and the slow growth rate compared to the number of RNA molecules implies the existence of neutral networks of distinct molecules with the same structure of any genus. Certain RNA structures called shapes are shown to be in natural one-to-one correspondence with the cells in the Penner–Strebel decomposition of Riemann’s moduli space of a surface of genus \(g\) with one boundary component, thus providing a link between RNA enumerative problems and the geometry of Riemann’s moduli space.
Similar content being viewed by others
Notes
These combinatorial structures occur in a number of instances in pure mathematics including finite type invariants of knots and links (Bar-Natan 1995; Kontsevich 1993), the representation theory of Lie algebras (Campoamor-Stursberg and Manturov 2004), the geometry of moduli spaces of flat connections on surfaces (Andersen et al. 1996, 1998), map** class groups (Andersen et al. 2010) and the Four-Color Theorem (Bar-Natan 1997), and in applied mathematics including codifying the pairings among nucleotides in RNA molecules (Reidys 2011; Bon et al. 2008; Orland and Zee 2002), or more generally the contacts of any binary macromolecule (Penner et al. 2010; Penner and Waterman 1993; Waterman 1995), and in the analysis of data structures (Flajolet 1980; Flajolet et al. 1980).
The numbers \(\mathbf{c}_g(n)\) had been computed in another generating function over two decades ago by Harer and Zagier (1986) in the equivalent guise of the number of side pairings of a polygon with \(2n\) sides that produce a surface of genus \(g\), namely,
$$\begin{aligned} 1+2\sum _{n\ge 0} \sum _{2g\le n}{{\mathbf{c}_g(n)}\over {(2n-1)!!}} x^{n+1-2g} z^{n+1}= \biggl ({{1+z}\over {1-z}}\biggr )^x, \end{aligned}$$a striking and beautiful formula.
As a general notational point for any power series \(R(z)=\sum a_iz^i\), we shall write \([z^i]R(z)=a_i\) for the extraction of the coefficient \(a_i\) of \(z^i\).
References
Andersen JE, Bene AJ, Meilhan J-B, Penner RC (2010) Finite type invariants and fatgraphs. Adv Math 225:2117–2161
Andersen JE, Mattes J, Reshetikhin N (1996) The poisson structure on the moduli space of flat connections and chord diagrams. Topology 35:1069–1083
Andersen JE, Mattes J, Reshetikhin N (1998) Quantization of the algebra of chord diagrams. Math Proc Camb Phil Soc 124:451–467
Bar-Natan D (1995) On the Vassiliev knot invariants. Topology 34:423–475
Bar-Natan D (1997) Lie algebras and the four colour problem. Combinatorica 17:43–52
Bender EA, Rodney Canfield E (1988) The asymptotic number of tree-rooted maps on a surface. J Comb Theory Ser A 48(2):156–164
Bon M, Vernizzi G, Orland H, Zee A (2008) Topological classification of RNA structures. J Mol Biol 379:900–911
Campoamor-Stursberg R, Manturov VO (2004) Invariant tensor formulas via chord diagrams. J Math Sci 108:3018–3029
dell’Erba MG, Zemba GR (2009) Thermodynamics of a model for RNA folding. Phys Rev E 79:011913
Euler L (1752) Elementa doctrinae solidorum. Novi Comm Acad Sci Imp Petropol 4:109–140
Flajolet P (1980) Combinatorial aspects of continued fractions. Discret Math 32:125–161
Flajolet P, Francon J, Vuillemin J (1980) Sequence of operations analysis for dynamic data structures. J Algorithms 1:111–141
Flajolet P, Sedgewick R (2009) Analytical combinatorics. Cambridge University Press, Cambridge
Gao JZM, Li LYM, Reidys CM (2010) Inverse folding of RNA pseudoknot structures. Algorithms Mol Biol 5:R27
Garg I, Deo N (2009) RNA matrix models with external interactions and their asymptotic behavior. Phys Rev E 79:061903
Goulden P, Nica A (2005) A direct bijection for the Harer–Zagier formula. J Comb Theory (A) 111:224–238
Goupil A, Schaeffer G (1998) Factoring n-cycles and counting maps of given genus. Eur J Comb 19(7): 819–834
Grüner WG, Strothmann D, Reidys CM, Weber J, Hofacker IL, Stadler PF, Schuster P (1996) Analysis of RNA sequence structure maps by exhaustive enumeration II. Neutral Netw Chem Mon 127:375–389
Grüner WG, Strothmann D, Reidys CM, Weber J, Hofacker IL, Stadler PF, Schuster P (1996) Analysis of RNA sequence structure maps by exhaustive enumeration I. Neutral Netw Chem Mon 127:355–374
Harer J, Zagier D (1986) The Euler characteristic of the moduli space of curves. Invent Math 85:457–485
Haslinger C, Stadler PF (1999) RNA structures with pseudo-knots. Bull Math Biol 61:437–467
** EY, Reidys CM (2011) Random induced subgraphs of Cayley graphs induced by transpositions. Discret Math 21(311):2496–2511
Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, Cambridge
Konings DAM, Gutell RR (1995) A comparison of thermodynamic foldings with comparatively derived structures of 16s and 16s-like r RNAs. RNA 1:559–574
Kontsevich M (1993) Vassiliev’s knot invariants. Adv Sov Math 16:137–150
Lando SK, Zvonkin AK (2004) Graphs on surfaces and their applications: with an appendix by Don B. Zagier. Encyclopaedia of Mathematical Sciences, 141. Low-Dimensional Topology, II. Springer-Verlag, Berlin
Li TJX, Reidys CM (2012) The genus filtration of \(\gamma \)-structures. Math Biosci (submitted)
Loria A, Pan T (1996) Domain structure of the ribozyme from eubacterial ribonuclease. RNA 2:551–563
Milgram RJ, Penner RC (1993) Riemann’s moduli space and the symmetric groups. In: Bödigheimer C-F, Hain RM (eds) Map** class groups and moduli spaces of Riemann surfaces. AMS contemporary math, vol 150. pp 247–290
Orland H, Zee A (2002) RNA folding and large N matrix theory. Nucl Phys B 620:456–476
Penner RC (1987) The Teichmuller space of a punctured surface. Commun Math Phys
Penner RC (1988) Perturbative series and the moduli space of Riemann surfaces. J Diff Geom 27:35–53
Penner RC (1992) Weil–Petersson volumes. J Diff Geom 35:559–608
Penner RC (2004) Cell decomposition and compactification of Riemann’s moduli space in decorated Teichmüller theory. In: Tongring N, Penner RC (eds) Woods hole mathematics-perspectives in math and physics. World Scientific, Singapore, pp 263–301 (ar**v)
Penner RC, Knudsen M, Wiuf C, Andersen J (2010) Fatgraph model of proteins. Comm Pure Appl Math 63:1249–1297
Penner RC, Waterman MS (1993) Spaces of RNA secondary structures. Adv Math 101:31–49
Pillsbury M, Orland H, Zee A (2005) Steepest descent calculation of RNA pseudoknots. Phys Rev E 72:011911
Pillsbury M, Taylor JA, Orland H, Zee A (2005) An algorithm for RNA pseudoknots. ar**v: cond-mat/0310505v2
Reidys CM, Huang FWD, Andersen JE, Penner RC, Stadler PF, Nebel ME (2011) Topology and prediction of RNA pseudoknots. Bioinformatics. doi:10.1093/bioinformatics/btr090
Reidys CM (2011) Combinatorial computational biology of RNA. Springer, New York
Reidys CM, Wang RR, Zhao AYY (2010) Modular, \(k\)-noncrossing diagrams. Electr J Comb 1(17):R76
Reidys CM, Stadler PF, Schuster PK (1997) Generic properties of combinatory maps and neutral networks of RNA secondary structrures. Bull Math Biol 59:339–397
Reidys CM, Stadler PF (2002) Combinatorial landscapes. SIAM Rev 44:3–54
Rivas E, Eddy SR (1999) A dynamic programming algorithm for RNA structure prediction including pseudoknots. J Mol Biol 285:2053–2068
Reidys CM, Forst CV, Schuster P (2001) Replication and mutation on neutral networks. Bull Math Biol 63:57–94
Reidys CM (2009) Large Components of Random induced subgraphs of n-cubes. Discret Math 309:3113–3124
Stanley RP (1997) Enumerative combinatorics. Cambridge studies in advanced mathematics, vol 49. Cambridge University Press, Cambridge
Strebel K (1984) Quadratic differentials. Springer, Berlin
Staple DW, Butcher SE (2005) Pseudoknots: RNA structures with diverse functions. PLoS Biol 3(6): 956–959
Vernizzi G, Orland H, Zee A (2005) Enumeration of RNA structures by matrix models. Phys Rev Lett 94:168103
Vernizzi G, Ribecca P, Orland H, Zee A (2006) Topology of pseudoknotted homopolymers. Phys Rev E 73:031902
Waterman M (1979) Combinatorics of RNA hairpins and cloverleafs. Stud Appl Math 60:91–96
Waterman M (1978) Secondary structure of single-stranded nucleic acids. Adv Math (Suppl Stud) 1:167–212
Howell J, Smith T, Waterman M (1980) Computation of generating functions for biological molecules. SIAM J Appl Math 39:119–133
Waterman M, Schmitt W (1994) Linear trees and RNA secondary structure. Discret Appl Math 51:317–323
Waterman MS (1995) An introduction computational biology. Chapman and Hall, New York
Westhof E, Jaeger L (1992) RNA pseudoknots. Curr Opin Chem Biol 2:327–333
Author information
Authors and Affiliations
Corresponding author
Additional information
J.E. Andersen and R.C. Penner are supported by QGM (Centre for Quantum Geometry of Moduli Spaces, funded by the Danish National Research Foundation).
Rights and permissions
About this article
Cite this article
Andersen, J., Penner, R., Reidys, C. et al. Topological classification and enumeration of RNA structures by genus. J. Math. Biol. 67, 1261–1278 (2013). https://doi.org/10.1007/s00285-012-0594-x
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00285-012-0594-x