Abstract
A multi-labeled tree, or MUL-tree, is a phylogenetic tree in which two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but it can also contain conflict-free information that is of interest yet not obvious. We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximally reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content exhibit the same reduced form. This introduces an equivalence relation among MUL-trees, with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both the quality of the reduced tree and the degree of data reduction achieved.
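To make the notion of information content concrete, the following is a minimal brute-force sketch, not the efficient algorithm the abstract refers to. It rests on assumptions that the abstract does not spell out: a tree is written as nested Python tuples with string leaf labels and is read as unrooted, a quartet is taken over four distinct labels, and a quartet counts as conflict-free when every choice of four leaves carrying those labels that yields a resolved topology yields the same one. The function names (`leaf_paths`, `dist`, `split`, `topology`, `conflict_free_quartets`) are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def leaf_paths(tree, prefix=()):
    """Yield (label, path from root) for every leaf of a nested-tuple tree."""
    if isinstance(tree, str):
        yield tree, prefix
    else:
        for i, child in enumerate(tree):
            yield from leaf_paths(child, prefix + (i,))


def dist(p, q):
    """Number of edges between two leaves, given their paths from the root."""
    common = 0
    for x, y in zip(p, q):
        if x != y:
            break
        common += 1
    return len(p) + len(q) - 2 * common


def split(a, b, c, d):
    """The quartet ab|cd as an order-independent value."""
    return frozenset([frozenset([a, b]), frozenset([c, d])])


def topology(four):
    """Resolved topology of four leaves via the four-point condition, else None."""
    (a, pa), (b, pb), (c, pc), (d, pd) = four
    sums = {
        split(a, b, c, d): dist(pa, pb) + dist(pc, pd),
        split(a, c, b, d): dist(pa, pc) + dist(pb, pd),
        split(a, d, b, c): dist(pa, pd) + dist(pb, pc),
    }
    smallest = min(sums.values())
    winners = [s for s, total in sums.items() if total == smallest]
    # A unique smallest sum means the quartet is resolved; a three-way tie is a star.
    return winners[0] if len(winners) == 1 else None


def conflict_free_quartets(tree):
    """Quartets over distinct labels for which the tree implies exactly one topology."""
    displayed = {}
    for four in combinations(list(leaf_paths(tree)), 4):
        labels = frozenset(label for label, _ in four)
        if len(labels) < 4:
            continue  # assumption: skip leaf sets that repeat a label
        q = topology(four)
        if q is not None:
            displayed.setdefault(labels, set()).add(q)
    return {next(iter(qs)) for qs in displayed.values() if len(qs) == 1}


# Example MUL-tree in which the label "a" appears on two leaves.
example = (("a", "b"), (("c", "d"), ("a", "e")))
info = conflict_free_quartets(example)
```

Under these assumptions, `info` for the example tree contains the three quartets ab|cd, ae|cd and be|cd, while the label sets {a, b, c, e} and {a, b, d, e} contribute nothing because the two copies of "a" imply different topologies. The enumeration examines every set of four leaves, so it is only practical for small trees.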
This work was supported in part by the National Science Foundation under grant DEB-0829674.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Deepak, A., Fernández-Baca, D., McMahon, M.M. (2012). Extracting Conflict-Free Information from Multi-labeled Trees. In: Raphael, B., Tang, J. (eds) Algorithms in Bioinformatics. WABI 2012. Lecture Notes in Computer Science, vol 7534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33122-0_7
DOI: https://doi.org/10.1007/978-3-642-33122-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33121-3
Online ISBN: 978-3-642-33122-0
eBook Packages: Computer Science (R0)