Abstract
The Sackin and Colless indices are two widely used metrics for measuring the balance of trees and for testing evolutionary models in phylogenetics. This short paper contributes two results about these indices. The first is an asymptotic analysis of the expected Sackin and Colless indices of tree shapes (full binary rooted unlabelled trees) under the uniform model, in which all tree shapes are sampled with equal probability. The second is a short direct proof of the closed formula for the expected Sackin index of phylogenetic trees (full binary rooted trees whose leaves are labelled with taxa) under the uniform model.
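For readers who want the two indices in concrete form, the following is a minimal Python sketch, not the authors' code. It assumes the standard definitions: the Sackin index is the sum of the depths of all leaves, and the Colless index is the sum, over internal nodes, of the absolute difference between the leaf counts of the two child subtrees. The nested-tuple encoding, the `LEAF` marker, and all function names are hypothetical choices made for illustration. The closed form in `closed_form_sackin` is the expression usually quoted in the literature for the uniform model on phylogenetic trees; the abstract does not reproduce it, so it is included here only as an assumption that the brute-force count can be checked against.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

LEAF = None  # hypothetical leaf marker; an internal node is a pair (left, right)


def leaf_count(tree):
    """Number of leaves below (and including) this node."""
    return 1 if tree is LEAF else leaf_count(tree[0]) + leaf_count(tree[1])


def sackin(tree, depth=0):
    """Sackin index: sum of the depths of all leaves."""
    if tree is LEAF:
        return depth
    return sackin(tree[0], depth + 1) + sackin(tree[1], depth + 1)


def colless(tree):
    """Colless index: sum over internal nodes of |leaves(left) - leaves(right)|."""
    if tree is LEAF:
        return 0
    left, right = tree
    return abs(leaf_count(left) - leaf_count(right)) + colless(left) + colless(right)


@lru_cache(maxsize=None)
def count_and_total(n):
    """Return (number of phylogenetic trees on n labelled leaves,
    sum of their Sackin indices), by splitting the leaf set at the root."""
    if n == 1:
        return 1, 0
    count, total = 0, 0
    for k in range(1, n):
        ways = comb(n, k)  # choose which k labels go into one root subtree
        c_k, t_k = count_and_total(k)
        c_m, t_m = count_and_total(n - k)
        count += ways * c_k * c_m
        total += ways * (t_k * c_m + c_k * t_m)
    # The two root subtrees are unordered, so every split was counted twice.
    count //= 2
    total //= 2
    # Attaching the root adds 1 to the depth of each of the n leaves.
    total += n * count
    return count, total


def expected_sackin_uniform(n):
    """Exact mean Sackin index over all phylogenetic trees with n leaves."""
    count, total = count_and_total(n)
    return Fraction(total, count)


def closed_form_sackin(n):
    """Closed form as usually quoted in the literature (an assumption here)."""
    return n * (Fraction(4 ** (n - 1), comb(2 * n - 2, n - 1)) - 1)


CATERPILLAR_4 = (((LEAF, LEAF), LEAF), LEAF)  # most unbalanced shape on 4 leaves
BALANCED_4 = ((LEAF, LEAF), (LEAF, LEAF))     # most balanced shape on 4 leaves
```

With these definitions, `sackin(CATERPILLAR_4)` is 9 and `colless(CATERPILLAR_4)` is 3, while the balanced shape gives 8 and 0. Of the 15 phylogenetic trees on four leaves, 12 are caterpillars and 3 are balanced, so `expected_sackin_uniform(4)` is 132/15 = 44/5, which matches `closed_form_sackin(4)`.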
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00285-022-01831-2/MediaObjects/285_2022_1831_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00285-022-01831-2/MediaObjects/285_2022_1831_Fig2_HTML.png)
Acknowledgements
The authors thank the two anonymous reviewers for their useful suggestions and comments in preparing the final version of this paper. LZ was supported by MOE Tier 1 grant R-146-000-318-114; MF was supported by MOST-109-2115-M-004-003-MY2.
Author information
Contributions
GG: Recurrence formulas; LZ: Recurrence formulas, writing; MF: Asymptotic analysis, writing.
Ethics declarations
Conflicts of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Goh, G., Fuchs, M. & Zhang, L. Two results about the Sackin and Colless indices for phylogenetic trees and their shapes. J. Math. Biol. 85, 69 (2022). https://doi.org/10.1007/s00285-022-01831-2
DOI: https://doi.org/10.1007/s00285-022-01831-2