Abstract
Reconstructing ancestral gene orders from the genome data of extant species is an important problem in comparative and evolutionary genomics. In a phylogenomics setting that accounts for gene family evolution through gene duplication and gene loss, reconstructing ancestral gene orders involves several steps, including multiple sequence alignment, the inference of reconciled gene trees, and the inference of ancestral syntenies and gene adjacencies. Each of these steps can be carried out with several methods, implemented in a growing corpus of tools that are often parameterized; in practice, interfacing such tools into an ancestral gene order reconstruction pipeline is far from trivial. This chapter introduces AGO, a Python-based framework for creating ancestral gene order reconstruction pipelines that interface and parameterize different bioinformatics tools. The authors illustrate the features of AGO by reconstructing ancestral gene orders for the X chromosome of three ancestral Anopheles species using three different pipelines. AGO is freely available at https://github.com/cchauve/AGO-pipeline.
Acknowledgments
The authors CC and EC were supported by the Natural Sciences and Engineering Research Council of Canada. This work benefited from the support of the Digital Research Alliance of Canada. DD was supported by the MODS project funded from the program “Profilbildung 2020” (grant no. PROFILNRW-2020-107-A), an initiative of the Ministry of Culture and Science of the State of North Rhine-Westphalia.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Cribbie, E.P., Doerr, D., Chauve, C. (2024). AGO, a Framework for the Reconstruction of Ancestral Syntenies and Gene Orders. In: Setubal, J.C., Stadler, P.F., Stoye, J. (eds) Comparative Genomics. Methods in Molecular Biology, vol 2802. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3838-5_10
DOI: https://doi.org/10.1007/978-1-0716-3838-5_10
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3837-8
Online ISBN: 978-1-0716-3838-5
eBook Packages: Springer Protocols