Abstract
Background
Whole-genome duplication and the amplification of long terminal repeat retrotransposons (LTR-RTs) are essential factors affecting speciation, local adaptation, and diversification. Understanding karyotype projection and LTR-RT amplification can help untangle evolutionary history. This study compared karyotype and LTR-RT evolution in the genomes of eight oaks, a dominant lineage in Northern Hemisphere forests.
Results
Karyotype projections showed that chromosomal evolution was relatively conservative in oaks, especially on chromosomes 1 and 7. Modern oak chromosomes formed through multiple fusions, fissions, and rearrangements after an ancestral triplication event. Species-specific chromosomal rearrangements revealed fragments preserved through natural selection and adaptive evolution. A total of 441,449 full-length LTR-RTs were identified from the eight oak genomes, and oaks from section Cyclobalanopsis carried more LTR-RTs than oaks from other sections. Recent amplification of species-specific LTR-RT lineages resulted in significant variation in the abundance and composition of LTR-RTs among oaks. LTR-RT insertion suppressed gene expression, and the suppression was stronger for insertions in gene regions than for insertions in promoter regions. Some centromere and rearrangement regions showed high-density peaks of LTR/Copia and LTR/Gypsy. Different centromeric repeat units (32, 78, and 79 bp) were detected on different Q. glauca chromosomes.
Conclusion
Chromosome fusions and arm exchanges contributed to the formation of oak karyotypes. The composition and abundance of LTR-RTs are shaped by their recent amplification. Randomly inserted LTR-RTs suppress gene expression and are enriched in centromere and chromosomal rearrangement regions. This study provides novel insights into the evolutionary history of oak karyotypes and the organization, amplification, and function of LTR-RTs.
Background
Chromosomal mutations, such as polyploidization and chromosomal rearrangement, can lead to speciation, adaptation, and diversification [1,2,3,4,5]. Extant species are ancient polyploids from a common ancestor that experienced at least one whole-genome duplication (WGD) [6]. The core eudicots descended from an ancient whole-genome triplication event (γ) [7]. Chromosomal evolution influences the size, structure, composition, and number of chromosomes [8]. Karyotype evolution can destabilize chromosomal structure, for example in fusion and fission regions produced by rearrangement and in centromere regions that are gained or lost through WGD or chromosome fusion [9]. Transposable elements may fill and stabilize these unstable regions [10]. Therefore, reconstructing the ancestral karyotype and analysing the distribution of transposable elements are crucial for untangling local adaptation and speciation.
Previous approaches for ancestral karyotype reconstruction and projection defined contiguous ancestral regions based on collinearity among genomes. This method results in gaps in the projections and reveals unrefined karyotype details [11,80] to identify the diversity in karyotype evolution and chromosomal rearrangement. Homologous dot-plots between Q. glauca and the other oak species were plotted with the ACEK karyotype mapping results. CD-HIT [81] was used with the “-c 0.8 -aS 0.8 -d 0” parameters to remove redundant protein sequences before constructing phylogenetic trees. Then, OrthoFinder v2.5.4 [82] was used with the “-S diamond -M msa” parameters to identify orthologs and construct a maximum likelihood (ML) phylogenetic tree. The “-M msa” option infers trees from multiple sequence alignments (MSA); MAFFT v7.515 [83] and FastTree v2.1.11 [84] were run with default parameters to align the sequences and infer the ML trees.
LTR-RTs identification and annotation
We used EDTA v1.9.6 [85] (Extensive de-novo TE Annotator), a comprehensive pipeline that integrates the results of several current LTR prediction tools, such as LTR_FINDER [86], LTRharvest [87], and LTR_retriever [88], to build a highly reliable non-redundant TE database, and annotated repeated sequences with RepeatMasker [89]. We ran EDTA.pl with the “-species others -step all -anno 1 -sensitive 1” parameters to obtain the TE database for each oak genome. The protein domains of elements belonging to different lineages of the Copia and Gypsy superfamilies were analyzed against REXdb [27] using TEsorter v1.2.5.2 [59]. Recombination that deletes the internal region removes intact LTR-RTs and leaves solo LTRs [61, 62]. We extracted solo LTRs from the annotation file generated by RepeatMasker in EDTA.
To explore LTR-RT amplification and the disparity in evolution among oak species, we used the formula T = (1 − identity) / (2µ) to calculate the transposition time of LTR-RTs, where identity is the sequence similarity between the 5’ and 3’ LTRs obtained from the EDTA analysis and µ is the base substitution rate. The substitution rate of Q. lobata, 1.01 × 10⁻⁸ [49], was used as the oak substitution rate in this study. To investigate the historical dynamics of different lineages of Copia and Gypsy, we extracted the RT protein domain sequences of the lineages in these superfamilies with the concatenate_domains.py script in TEsorter [59]. After sequences were aligned using MAFFT v7.515 [83], ML phylogenetic trees were constructed with FastTree v2.1.11 [84] and visualized with iTOL [90].
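The insertion-time formula can be illustrated with a minimal Python sketch. The function name and the example identity value are hypothetical; the sketch assumes that identity is given as a fraction between 0 and 1 and that µ is expressed per site per year, so that T comes out in years.

```python
def insertion_time(identity: float, mu: float = 1.01e-8) -> float:
    """Estimate T = (1 - identity) / (2 * mu) for one LTR-RT.

    identity: similarity between the 5' and 3' LTRs, as a fraction (assumed).
    mu: substitution rate; the default is the Q. lobata rate quoted in the text.
    """
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be a fraction between 0 and 1")
    return (1.0 - identity) / (2.0 * mu)


# Hypothetical element whose two LTRs are 99% identical.
t = insertion_time(0.99)
print(f"{t / 1e6:.3f} million time units")
```

Under these assumptions, an element with 99% LTR identity would date to roughly half a million years, and an element with identical LTRs would be assigned an age of zero.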
LTR-RTs associated with genes
We analyzed the number and function of genes that overlap with LTR-RTs. The LTR-RTs overlapping with gene and promoter regions were identified using the “intersect” function of BEDtools v2.30.0 [91]. Protein sequences of the genes whose gene or promoter regions overlapped with LTR-RTs were extracted. GO enrichment analysis of the extracted genes was carried out using the eggNOG-mapper [92] online tool and the R package clusterProfiler [93]. The metabolic pathways were annotated with KAAS [94] and visualized with the R package ggplot2 [95].
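The overlap step can be pictured with a small Python sketch. It is not the BEDtools implementation: the helper names and coordinates are hypothetical, and the 2 kb promoter length is an assumption, since the text does not state how promoter regions were defined.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def promoter_of(gene: Interval, strand: str, length: int = 2000) -> Interval:
    """Upstream flank of a gene; the 2 kb default is an assumption."""
    if strand == "+":
        return Interval(gene.chrom, max(0, gene.start - length), gene.start)
    return Interval(gene.chrom, gene.end, gene.end + length)


def classify_genes(genes, ltrs):
    """Map gene name -> set of regions ('gene', 'promoter') hit by any LTR-RT."""
    hits = {}
    for name, (gene, strand) in genes.items():
        prom = promoter_of(gene, strand)
        regions = set()
        for ltr in ltrs:
            if overlaps(gene, ltr):
                regions.add("gene")
            if overlaps(prom, ltr):
                regions.add("promoter")
        hits[name] = regions
    return hits


# Toy coordinates for illustration only.
genes = {
    "g1": (Interval("chr1", 5000, 8000), "+"),
    "g2": (Interval("chr1", 20000, 21000), "-"),
}
ltrs = [Interval("chr1", 3500, 4200), Interval("chr1", 7000, 9000)]
result = classify_genes(genes, ltrs)
print(result)
```

The nested loop is quadratic and only suitable for toy inputs; a genome-scale analysis would rely on sorted intervals or on BEDtools itself, as in the text.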
We used transcriptome data from the leaf, inflorescence, and stem of Q. glauca from the NCBI SRA database (BioProject: PRJNA868092) to evaluate the impact of LTR-RTs on the expression of adjacent genes. Hisat2 v2.2.1 [96], Samtools v1.13 [97], and StringTie v2.2.1 [98] were used to align the transcriptome reads to the reference genome, sort and index the alignment files, and obtain the read counts, respectively. Gene expression level was quantified in TPM (transcripts per million). Paralogous genes were detected using BLAST v2.12.0 [80]. Expression levels of paralogous genes with and without overlapping LTR-RTs were compared. We further analyzed the impact of LTR-RT insertion on the expression level of resistance genes (R-genes), as the evolution of R-genes is widely considered to be affected by LTR-RT insertion.
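For readers unfamiliar with the unit, TPM can be sketched from read counts and gene lengths as follows. This is the textbook definition rather than StringTie's internal estimator, and the gene names, counts, and lengths are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to TPM.

    counts: gene -> number of reads assigned to the gene.
    lengths_bp: gene -> transcript length in base pairs.
    """
    # Reads per kilobase corrects for the fact that longer genes collect more reads.
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Scale so that the values of one sample sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Toy sample: the second gene has three times the reads but is three times longer.
example = tpm({"geneA": 100, "geneB": 300}, {"geneA": 1000, "geneB": 3000})
print(example)
```

Because TPM values sum to the same total in every sample, paralogs with and without an overlapping LTR-RT can be compared on a common scale within a tissue.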
LTR-RTs distribution
LTR/Copia and LTR/Gypsy elements are usually interspersed with tandem repeats and enriched in plant centromere regions. Following previous research [79, 99], we used Q. glauca as a reference to scan for regions with a high frequency of tandem repeats, LTR/Copia, and LTR/Gypsy, a high GC content, and a low gene density. The densities of genes, tandem repeats, LTR/Copia, and LTR/Gypsy were calculated using BEDtools v2.30.0 [91], with the parameters “-w 1000000 -s 200000” to make interval windows and “-counts -F 0.5” to compute the coverage. The GC content of the Q. glauca genome was calculated with seqkit [100] using the same sliding-window size. Data were visualized with R scripts.
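The windowed density calculation can be sketched in Python as below. This is an illustration rather than the BEDtools code: the function names and coordinates are hypothetical, windows are assumed to be clipped at the chromosome end, and the 0.5 threshold is interpreted as the minimum fraction of each feature that must lie inside a window.

```python
def make_windows(chrom_len, window=1_000_000, step=200_000):
    """Sliding windows of `window` bp every `step` bp, clipped at the chromosome end."""
    return [(start, min(start + window, chrom_len))
            for start in range(0, chrom_len, step)]


def count_features(windows, features, min_frac=0.5):
    """Count features per window, keeping those with >= min_frac of their length inside."""
    counts = []
    for w_start, w_end in windows:
        n = 0
        for f_start, f_end in features:
            shared = min(w_end, f_end) - max(w_start, f_start)
            if shared > 0 and shared >= min_frac * (f_end - f_start):
                n += 1
        counts.append(n)
    return counts


# Toy 1.5 Mb chromosome with two features (half-open coordinates).
windows = make_windows(1_500_000)
features = [(900_000, 1_100_000), (50_000, 60_000)]
density = count_features(windows, features)
print(list(zip(windows, density)))
```

With a 1 Mb window and a 200 kb step, each position is covered by up to five overlapping windows, which smooths the density tracks used to look for centromere-like peaks.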
To predict potential centromere regions, we first followed the Telomeres_and_Centromeres [99] method and detected tandem repeats (TRs) with TRF v4.09.1 [101] using the “2 7 7 80 10 50 500 -f -d -m” parameters; TRF2GFF (https://github.com/Adamtaranto/TRF2GFF) was used to merge the annotated results. We then screened high-frequency repeat units on each chromosome, using IGV v2.16.1 [102] to visualize the density of genome annotation, LTR-RTs, and repeat units. Potential centromere regions showed low-frequency peaks of genome annotation and TEs and high-frequency peaks of repeat units in IGV. Second, Juicebox v1.11.08 [103] was used to inspect the Hi-C heat map of the Q. glauca [40] genome. Third, Centromics (https://github.com/zhangrengang/Centromics) and the CentroMiner tool of quarTeT v1.1.1 [78] were used with default parameters to predict the potential centromere regions.
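Screening for high-frequency repeat units per chromosome amounts to a tally of repeat period sizes, as in the following sketch. The record layout (chromosome, period size in bp, copy number) and all values are hypothetical simplifications of a tandem-repeat table, not the actual TRF or TRF2GFF output format.

```python
from collections import Counter, defaultdict


def top_repeat_units(records, top_n=3):
    """Return the most frequent repeat period sizes for each chromosome.

    records: iterable of (chrom, period_bp, copy_number) tuples (assumed layout).
    """
    per_chrom = defaultdict(Counter)
    for chrom, period_bp, _copy_number in records:
        # Each tandem-repeat array contributes one observation of its period size.
        per_chrom[chrom][period_bp] += 1
    return {chrom: tally.most_common(top_n) for chrom, tally in per_chrom.items()}


# Toy records for illustration only.
records = [
    ("chr1", 78, 120.5),
    ("chr1", 78, 88.0),
    ("chr1", 32, 15.0),
    ("chr2", 79, 200.0),
]
summary = top_repeat_units(records)
print(summary)
```

Counting arrays is only one possible weighting; summing copy numbers or covered base pairs instead would favour long arrays and may suit centromere detection better.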
Data availability
The genomes of Q. gilva, Q. lobata, Q. mongolica, and Q. robur are available in the NCBI repository (https://www.ncbi.nlm.nih.gov/) with GenBank accession numbers GCA_023736055.1, GCA_001633185.5, GCA_011696235.1, and GCF_932294415.1, respectively. The genomes of Q. acutissima, Q. dentata, and Q. glauca are available in the NGDC repository (https://ngdc.cncb.ac.cn/) with accession numbers GWHBGBO00000000, GWHBRAD00000000, and GWHCAYJ00000000, respectively. The Q. variabilis genome is available in the CNGB repository (https://db.cngb.org/) with accession number CNA0051893.
References
Wu F, Tanksley SD. Chromosomal evolution in the plant family Solanaceae. BMC Genomics. 2010;11(182):1–11.
Soltis DE, Soltis PS. Polyploidy: recurrent formation and genome evolution. Trends Ecol Evol. 1999;14(9):348–52.
Schubert I. Chromosome evolution. Curr Opin Plant Biol. 2007;10(2):109–15.
Jiao Y, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L, Ralph PE, Tomsho LP, Hu Y, Liang H, Soltis PS, et al. Ancestral polyploidy in seed plants and angiosperms. Nature. 2011;473(7345):97–100.
Wu S, Han B, Jiao Y. Genetic contribution of paleopolyploidy to adaptive evolution in angiosperms. Mol Plant. 2020;13(1):59–71.
Chanderbali AS, Jin L, Xu Q, Zhang Y, Zhang J, Jian S, Carroll E, Sankoff D, Albert VA, Howarth DG, et al. Buxus and Tetracentron genomes help resolve eudicot genome history. Nat Commun. 2022;13(1):643.
Jiao Y, Leebens-Mack J, Ayyampalayam S, Bowers JE, McKain MR, McNeal J, Rolf M, Ruzicka DR, Wafula E, Wickett NJ, et al. A genome triplication associated with early diversification of the core eudicots. Genome Biol. 2012;13(1):1–14.
Schubert I, Lysak MA. Interpretation of karyotype evolution should consider chromosome structural constraints. Trends Genet. 2011;27(6):207–16.
Eichler EE, Sankoff D. Structural dynamics of eukaryotic chromosome evolution. Science. 2003;301(5634):793–7.
Bennetzen JL, Wang H. The contributions of transposable elements to the structure, function, and evolution of plant genomes. Annu Rev Plant Biol. 2014;65:505–30.
Plomion C, Aury J-M, Amselem J, Leroy T, Murat F, Duplessis S, Faye S, Francillonne N, Labadie K, Provost GL, et al. Oak genome reveals facets of long lifespan. Nat Plants. 2018;4(7):440–52.
Xie D, Xu Y, Wang J, Liu W, Zhou Q, Luo S, Huang W, He X, Li Q, Yuan J, et al. The wax gourd genomes offer insights into the genetic diversity and ancestral cucurbit karyotype. Nat Commun. 2019;10(1):5158.
Murat F, Armero A, Pont C, Klopp C, Salse J. Reconstructing the genome of the most recent common ancestor of flowering plants. Nat Genet. 2017;49(4):490–6.
Sun P, Jiao B, Yang Y, Shan L, Li T, Li X, Xi Z, Wang X, Liu J. WGDI: a user-friendly toolkit for evolutionary analyses of whole-genome duplications and ancestral karyotypes. Mol Plant. 2022;15(12):1841–51.
Wang Z, Li Y, Sun P, Zhu M, Wang D, Lu Z, Hu H, Xu R, Zhang J, Ma J, et al. A high-quality Buxus austro-yunnanensis (Buxales) genome provides new insights into karyotype evolution in early eudicots. BMC Biol. 2022;20(1):1–17.
Gantuz M, Morales A, Bertoldi MV, Ibañez VN, Duarte PF, Marfil CF, Masuelli RW. Hybridization and polyploidization effects on LTR-retrotransposon activation in potato genome. J Plant Res. 2022;135:81–95.
Zhao M, Ma J. Co-evolution of plant LTR-retrotransposons and their host genomes. Protein Cell. 2013;4(7):493–501.
Baniaga AE, Barker MS. Nuclear genome size is positively correlated with median LTR-RT insertion time in fern and lycophyte genomes. Am Fern J. 2019;109(3):248–66.
Mehrotra S, Goyal V. Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function. Genom Proteom Bioinf. 2014;12(4):164–71.
Biscotti MA, Olmo E, Heslop-Harrison JS. Repetitive DNA in eukaryotic genomes. Chromosome Res. 2015;23:415–20.
Baucom RS, Estill JC, Chaparro C, Upshaw N, Jogi A, Deragon J-M, Westerman P, SanMiguel PJ, Bennetzen JL. Exceptional diversity, non-random distribution, and rapid evolution of retroelements in the B73 maize genome. PLoS Genet. 2009;5(11):e1000732.
Xia E, Tong W, Hou Y, An Y, Chen L, Wu Q, Liu Y, Yu J, Li F, Li R, et al. The reference genome of tea plant and resequencing of 81 diverse accessions provide insights into its genome evolution and adaptation. Mol Plant. 2020;13(7):1013–26.
Li G, Wang L, Yang J, He H, Jin H, Li X, Ren T, Ren Z, Li F, Han X, et al. A high-quality genome assembly highlights rye genomic characteristics and agronomically important genes. Nat Genet. 2021;53(4):574–84.
Li S, She H, Yang L, Lan L, Zhang X, Wang L, Zhang Y, Li N, Deng C, Qian W, et al. Impact of LTR-retrotransposons on genome structure, evolution, and function in Curcurbitaceae species. Int J Mol Sci. 2022;23(17):10158.
Yang L, Zhang X, Wang L, Li Y, Li X, Yang Y, Su Q, Chen N, Zhang Y, Li N, et al. Lineage-specific amplification and epigenetic regulation of LTR-retrotransposons contribute to the structure, evolution, and function of Fabaceae species. BMC Genomics. 2023;24(1):423.
Wicker T, Sabot F, Hua-Van A, Bennetzen JL, Capy P, Chalhoub B, Flavell A, Leroy P, Morgante M, Panaud O, et al. A unified classification system for eukaryotic transposable elements. Nat Rev Genet. 2007;8(12):973–82.
Neumann P, Novák P, Hoštáková N, Macas J. Systematic survey of plant LTR-retrotransposons elucidates phylogenetic relationships of their polyprotein domains and provides a reference for element classification. Mob DNA. 2019;10(1):1–17.
Yuan J, Jiang S, Jian J, Liu M, Yue Z, Xu J, Li J, Xu C, Lin L, **g L, et al. Genomic basis of the giga-chromosomes and giga-genome of tree peony Paeonia Ostii. Nat Commun. 2022;13(1):7328.
Du J, Tian Z, Hans CS, Laten HM, Cannon SB, Jackson SA, Shoemaker RC, Ma J. Evolutionary conservation, diversity and specificity of LTR-retrotransposons in flowering plants: insights from genome‐wide analysis and multi‐specific comparison. Plant J. 2010;63(4):584–98.
De Souza TB, Chaluvadi SR, Johnen L, Marques A, González-Elizondo MS, Bennetzen JL, Vanzela AL. Analysis of retrotransposon abundance, diversity and distribution in holocentric Eleocharis (Cyperaceae) genomes. Ann Bot. 2018;122(2):279–90.
Bennetzen JL, Ma J, Devos KM. Mechanisms of recent genome size variation in flowering plants. Ann Bot. 2005;95(1):127–32.
Liu Z, Yue W, Li D, Wang RR-C, Kong X, Lu K, Wang G, Dong Y, Jin W, Zhang X. Structure and dynamics of retrotransposons at wheat centromeres and pericentromeres. Chromosoma. 2008;117(5):445–56.
Kim S, Choi D. New role of LTR-retrotransposons for emergence and expansion of disease-resistance genes and high-copy gene families in plants. BMB Rep. 2018;51(2):55–6.
Bahmani M, Forouzan S, Fazeli-Moghadam E, Rafieian-Kopaei M, Adineh A, Saberianpour S. Oak (Quercus branti): an overview. J Chem Pharm Res. 2015;7(1):634–9.
Wang Y, Xu C, Wang Q, Jiang Y, Qin L. Germplasm resources of oaks (Quercus L.) in China: utilization and prospects. Biology. 2022;12(1):76.
Burlacu E, Nisca A, Tanase C. A comprehensive review of phytochemistry and biological activities of Quercus species. Forests. 2020;11(9):904.
Fu R, Zhu Y, Liu Y, Feng Y, Lu R-S, Li Y, Li P, Kremer A, Lascoux M, Chen J. Genome-wide analyses of introgression between two sympatric Asian oak species. Nat Ecol Evol. 2022;6(7):924–35.
Wang W, He X, Yan X, Ma B, Lu C, Wu J, Zheng Y, Wang W, Xue W, Tian X, et al. Chromosome-scale genome assembly and insights into the metabolome and gene regulation of leaf color transition in an important oak species, Quercus dentata. New Phytol. 2023;238(5):2016–32.
Zhou X, Liu N, Jiang X, Qin Z, Farooq TH, Cao F, Li H. A chromosome-scale genome assembly of Quercus gilva: insights into the evolution of Quercus section Cyclobalanopsis (Fagaceae). Front Plant Sci. 2022;13:1012277.
Luo C, Li T, Jiang X, Song Y, Fan T, Shen X, Yi R, Ao X, Xu G, Deng M. High-quality haplotype-resolved genome assemblies of ring-cup oak (Quercus glauca) provide insight into the demographic dynamics of a dominant tree in East Asia subtropics evergreen broadleaved forests. Mol Ecol Resour. 2023;e13914.
Sork VL, Fitz-Gibbon ST, Puiu D, Crepeau M, Gugger PF, Sherman R, Stevens K, Langley CH, Pellegrini M, Salzberg SL. First draft assembly and annotation of the genome of a California endemic oak Quercus lobata Née (Fagaceae). G3: Genes, Genomes, Genet. 2016;6(11):3485–95.
Ai W, Liu Y, Mei M, Zhang X, Tan E, Liu H, Han X, Zhan H, Lu X. A chromosome-scale genome assembly of the Mongolian oak (Quercus mongolica). Mol Ecol Resour. 2022;22(6):2396–410.
Han B, Wang L, Xian Y, Xie X, Li W, Zhao Y, Zhang R, Qin X, Li D, Jia H. A chromosome-level genome assembly of the Chinese cork oak (Quercus variabilis). Front Plant Sci. 2022;13:1001583.
Yang Y, Zhou T, Duan D, Yang J, Feng L, Zhao G. Comparative analysis of the complete chloroplast genomes of five Quercus species. Front Plant Sci. 2016;7:959.
Deng M, Jiang XL, Hipp A, Manos P, Hahn M. Phylogeny and biogeography of East Asian evergreen oaks (Quercus section Cyclobalanopsis; Fagaceae): Insights into the Cenozoic history of evergreen broad-leaved forests in subtropical Asia. Mol Phylogenet Evol. 2018;119:170–81.
Yang Y, Zhou T, Qian Z, Zhao G. Phylogenetic relationships in Chinese oaks (Fagaceae, Quercus): evidence from plastid genome using low-coverage whole genome sequencing. Genomics. 2021;113(3):1438–47.
Hipp AL, Manos PS, Hahn M, Avishai M, Bodénès C, Cavender-Bares J, Crow AA, Deng M, Denk T, Fitz-Gibbon S, et al. Genomic landscape of the global oak phylogeny. New Phytol. 2020;226(4):1198–212.
Jiang X, Hipp AL, Deng M, Su T, Zhou Z, Yan M. East Asian origins of European holly oaks (Quercus section Ilex Loudon) via the Tibet-Himalaya. J Biogeogr. 2019;46(10):2203–14.
Sork VL, Cokus SJ, Fitz-Gibbon ST, Zimin AV, Puiu D, Garcia JA, Gugger PF, Henriquez CL, Zhen Y, Lohmueller KE, et al. High-quality genome and methylomes illustrate features underlying evolutionary success of oaks. Nat Commun. 2022;13(1):2047.
Salse J. Ancestors of modern plant crops. Curr Opin Plant Biol. 2016;30:134–42.
Wang Z, Wang J, Pan Y, Lei T, Ge W, Wang L, Zhang L, Li Y, Zhao K, Liu T, et al. Reconstruction of evolutionary trajectories of chromosomes unraveled independent genomic repatterning between Triticeae and Brachypodium. BMC Genomics. 2019;20:1–10.
Li S, Su T, Cheng G, Wang B, Li X, Deng C, Gao W. Chromosome evolution in connection with repetitive sequences and epigenetics in plants. Genes. 2017;8(10):290.
Petit RJ, Bodénès C, Ducousso A, Roussel G, Kremer A. Hybridization as a mechanism of invasion in oaks. New Phytol. 2004;161(1):151–64.
Wei G, Li X, Fang Y. Sympatric genome size variation and hybridization of four oak species as determined by flow cytometry genome size variation and hybridization. Ecol Evol. 2021;11(4):1729–40.
Lucek K, Giménez MD, Joron M, Rafajlović M, Searle JB, Walden N, Westram AM, Faria R. The impact of chromosomal rearrangements in speciation: from micro-to macroevolution. Cold Spring Harb Perspect Biol. 2023;15(11):a041447.
Faria R, Navarro A. Chromosomal speciation revisited: rearranging theory with pieces of evidence. Trends Ecol Evol. 2010;25(11):660–9.
Rieseberg LH. Chromosomal rearrangements and speciation. Trends Ecol Evol. 2001;16(7):351–8.
Kirkpatrick M, Barton N. Chromosome inversions, local adaptation and speciation. Genetics. 2006;173(1):419–34.
Zhang R, Li G, Wang X, Dainat J, Wang Z, Ou S, Ma Y. TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes. Hortic Res. 2022;9:uhac017.
Hofstatter PG, Thangavel G, Lux T, Neumann P, Vondrak T, Novak P, Zhang M, Costa L, Castellani M, Scott A, et al. Repeat-based holocentromeres influence genome architecture and karyotype evolution. Cell. 2022;185(17):3153–68.
Vitte C, Panaud O. Formation of solo-LTRs through unequal homologous recombination counterbalances amplifications of LTR retrotransposons in rice Oryza sativa L. Mol Biol Evol. 2003;20(4):528–40.
Vitte C, Panaud O. LTR retrotransposons and flowering plant genome size: emergence of the increase/decrease model. Cytogenet Genome Res. 2005;110(1–4):91–107.
Stritt C, Wyler M, Gimmi EL, Pippel M, Roulin AC. Diversity, dynamics and effects of long terminal repeat retrotransposons in the model grass Brachypodium distachyon. New Phytol. 2020;227(6):1736–48.
Bui QT, Grandbastien M-A. LTR retrotransposons as controlling elements of genome response to stress? Plant transposable elements: impact on genome structure and function. 2012;24:273–96.
Zhao Y, Li X, Xie J, Xu W, Chen S, Zhang X, Liu S, Wu J, Kassaby YA, Zhang D. Transposable elements: distribution, polymorphism, and climate adaptation in Populus. Front Plant Sci. 2022;13:814718.
Grandbastien M-A. LTR retrotransposons, handy hitchhikers of plant regulation and stress response. BBA-Gene Regul Mech. 2015;1849(4):403–16.
Havecker ER, Voytas DF. The soybean retroelement SIRE1 uses stop codon suppression to express its envelope-like protein. EMBO Rep. 2003;4(3):274–7.
Laten HM, Majumdar A, Gaucher EA. SIRE-1, a copia/Ty1-like retroelement from soybean, encodes a retroviral envelope-like protein. Proc Natl Acad Sci. 1998;95(12):6897–902.
Kejnovsky E, Kubat Z, Macas J, Hobza R, Mracek J, Vyskot B. Retand: a novel family of gypsy-like retrotransposons harboring an amplified tandem repeat. Mol Genet Genomics. 2006;276:254–63.
Zhang L, Hu J, Han X, Li J, Gao Y, Richards CM, Zhang C, Tian Y, Liu G, Gul H, et al. A high-quality apple genome assembly reveals the association of a retrotransposon and red fruit colour. Nat Commun. 2019;10(1):1494.
Ji J, Yang L, Fang Z, Zhuang M, Zhang Y, Lv H, Liu Y, Li Z. Recessive male sterility in cabbage (Brassica oleracea var. capitata) caused by loss of function of BoCYP704B1 due to the insertion of a LTR-retrotransposon. Theor Appl Genet. 2017;130(7):1441–51.
Guo N, Wang S, Gao L, Liu Y, Wang X, Lai E, Duan M, Wang G, Li J, Yang M, et al. Genome sequencing sheds light on the contribution of structural variants to Brassica oleracea diversification. BMC Biol. 2021;19(1):1–15.
Rajarammohan S, Pental D, Kaur J. Near-complete genome assembly of Alternaria brassicae—A necrotrophic pathogen of Brassica crops. Mol Plant Microbe in. 2019;32(8):928–30.
Balzano E, Giunta S. Centromeres under pressure: evolutionary innovation in conflict with conserved function. Genes. 2020;11(8):912.
Tørresen OK, Star B, Mier P, Andrade-Navarro MA, Bateman A, Jarnot P, Gruca A, Grynberg M, Kajava AV, Promponas VJ, et al. Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases. Nucleic Acids Res. 2019;47(21):10994–1006.
Marshall OJ, Chueh AC, Wong LH, Choo KA. Neocentromeres: new insights into centromere structure, disease development, and karyotype evolution. Am J Hum Genet. 2008;82(2):261–82.
Melters DP, Bradnam KR, Young HA, Telis N, May MR, Ruby JG, Sebra R, Peluso P, Eid J, Rank D, et al. Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution. Genome Biol. 2013;14(1):1–20.
Lin Y, Ye C, Li X, Chen Q, Wu Y, Zhang F, Pan R, Zhang S, Chen S, Wang X, et al. quarTeT: a telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification. Hortic Res. 2023;10(8):uhad127.
Yue J, Chen Q, Wang Y, Zhang L, Ye C, Wang X, Cao S, Lin Y, Huang W, **an H, et al. Telomere-to-telomere and gap-free reference genome assembly of the kiwifruit Actinidia chinensis. Hortic Res. 2023;10(2):uhac264.
Mount DW. Using the basic local alignment search tool (BLAST). Cold Spring Harb Protoc. 2007;2007(7):17.
Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22(13):1658–9.
Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:1–14.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.
Price MN, Dehal PS, Arkin AP. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS ONE. 2010;5(3):e9490.
Ou S, Su W, Liao Y, Chougule K, Agda JR, Hellinga AJ, Blanco Lugo CS, Elliott TA, Ware D, Peterson T, et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 2019;20(1):1–18.
Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35(suppl2):W265–W8.
Ellinghaus D, Kurtz S, Willhoeft U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics. 2008;9:1–14.
Ou S, Jiang N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 2018;176(2):1410–22.
Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinf. 2004;5(1):4–10.
Letunic I, Bork P. Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49(W1):W293–W6.
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2.
Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol Biol Evol. 2021;38(12):5825–9.
Yu G, Wang L, Han Y, He Q. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics. 2012;16(5):284–7.
Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007;35(suppl2):W182–W5.
Wickham H. ggplot2. WIREs Comp Stat. 2011;3(2):180–5.
Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37(8):907–15.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
Pertea M, Pertea GM, Antonescu CM, Chang T-C, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33(3):290–5.
Shi X, Cao S, Wang X, Huang S, Wang Y, Liu Z, Liu W, Leng X, Peng Y, Wang N, et al. The complete reference genome for grapevine (Vitis vinifera L.) genetics and breeding. Hortic Res. 2023;10(05):uhad061.
Shen W, Le S, Li Y, Hu F. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS ONE. 2016;11(10):e0163962.
Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27(2):573–80.
Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013;14(2):178–92.
Durand NC, Robinson JT, Shamim MS, Machol I, Mesirov JP, Lander ES, Aiden EL. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 2016;3(1):99–101.
Acknowledgements
We would like to thank Dr. Kelly Barr at the University of California, Los Angeles for her assistance with English language and grammatical editing of the manuscript.
Funding
This work was supported by grants from the Natural Science Foundation of Hunan Province (2022JJ40861), the Scientific Research Fund of Hunan Provincial Education Department (21A0158), the National Natural Science Foundation of China (31700174), and Central South University of Forestry and Technology Innovation Foundation for Postgraduate (2022CX02061).
Author information
Contributions
XLJ designed and supervised the project. RBC and XLJ comprehensively analyzed the data and wrote the draft manuscript. XLJ and RBC revised the manuscript. All authors have read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Cao, RB., Chen, R., Liao, KX. et al. Karyotype and LTR-RTs analysis provide insights into oak genomic evolution. BMC Genomics 25, 328 (2024). https://doi.org/10.1186/s12864-024-10177-6
DOI: https://doi.org/10.1186/s12864-024-10177-6