Two results about the Sackin and Colless indices for phylogenetic trees and their shapes

Journal of Mathematical Biology

Abstract

The Sackin and Colless indices are two widely used metrics for measuring the balance of trees and for testing evolutionary models in phylogenetics. This short paper contributes two results about the Sackin and Colless indices of trees. The first is an asymptotic analysis of the expected Sackin and Colless indices of tree shapes (full binary rooted unlabelled trees) under the uniform model, in which all tree shapes are sampled with equal probability. The second is a short direct proof of the closed formula for the expected Sackin index of phylogenetic trees (full binary rooted trees whose leaves are labelled with taxa) under the uniform model.
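To make the two indices concrete, here is a minimal Python sketch, not the authors' method. It assumes the standard definitions: the Sackin index is the sum of the depths of all leaves, and the Colless index is the sum, over internal nodes, of the absolute difference between the leaf counts of the two child subtrees. Trees are modelled as nested pairs, and all function names are hypothetical. The closed form in `closed_form_mean_sackin` is the expression usually quoted in the literature for the uniform model on labelled trees; it is not stated in the text above, so treat it as an assumption that the brute-force enumeration only confirms for small n.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# A tree is either a leaf (any non-tuple label) or a pair (left, right).

def leaf_count(tree):
    if not isinstance(tree, tuple):
        return 1
    left, right = tree
    return leaf_count(left) + leaf_count(right)

def sackin(tree, depth=0):
    """Sum of the depths of all leaves."""
    if not isinstance(tree, tuple):
        return depth
    left, right = tree
    return sackin(left, depth + 1) + sackin(right, depth + 1)

def colless(tree):
    """Sum over internal nodes of |leaves(left) - leaves(right)|."""
    if not isinstance(tree, tuple):
        return 0
    left, right = tree
    imbalance = abs(leaf_count(left) - leaf_count(right))
    return imbalance + colless(left) + colless(right)

def labelled_trees(labels):
    """Yield every rooted binary tree on the given leaf labels exactly once."""
    labels = tuple(labels)
    if len(labels) == 1:
        yield labels[0]
        return
    first, rest = labels[0], labels[1:]
    # Keep `first` in the left subtree so each unordered root split
    # is generated once; the right subtree always gets at least one label.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left_labels = (first,) + extra
            right_labels = tuple(x for x in rest if x not in extra)
            for left in labelled_trees(left_labels):
                for right in labelled_trees(right_labels):
                    yield (left, right)

def mean_index(index, n):
    """Exact mean of `index` over all labelled trees with n leaves."""
    values = [index(tree) for tree in labelled_trees(range(n))]
    return Fraction(sum(values), len(values))

def closed_form_mean_sackin(n):
    # Assumed closed form (from the literature, not from the text above):
    # n * 4^(n-1) / C(2n-2, n-1) - n
    return Fraction(n * 4 ** (n - 1), comb(2 * n - 2, n - 1)) - n

if __name__ == "__main__":
    for n in range(2, 7):
        print(n, mean_index(sackin, n), closed_form_mean_sackin(n),
              mean_index(colless, n))
```

The enumeration grows as (2n-3)!!, so it is only practical for small n. Note that it averages over labelled trees; averaging over unlabelled tree shapes, the setting of the paper's asymptotic result, would weight the shapes differently.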

Acknowledgements

The authors thank the two anonymous reviewers for their useful suggestions and comments on the final version of this paper. LZ was supported by MOE Tier 1 grant R-146-000-318-114; MF was supported by MOST-109-2115-M-004-003-MY2.

Author information

Contributions

GG: Recurrence formulas; LZ: Recurrence formulas, writing; MF: Asymptotic analysis, writing.

Corresponding author

Correspondence to Louxin Zhang.

Ethics declarations

Conflicts of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

About this article

Cite this article

Goh, G., Fuchs, M. & Zhang, L. Two results about the Sackin and Colless indices for phylogenetic trees and their shapes. J. Math. Biol. 85, 69 (2022). https://doi.org/10.1007/s00285-022-01831-2

  • DOI: https://doi.org/10.1007/s00285-022-01831-2
