Introduction

Allioideae Herbert, a subfamily of Amaryllidaceae (Asparagales), comprises four tribes, 13 genera and over 900 species1. The subfamily is widely distributed in temperate and subtropical regions of the Northern Hemisphere and South America, and occurs locally in South Africa2. Most Allioideae are economically important plants used in traditional medicine, in horticulture, and as ornamentals. Within Amaryllidaceae, Allioideae can easily be distinguished from the other subfamilies by its superior ovary and solid style. The subfamilies are further characterized by unique chemical compounds3. Molecular phylogenetic studies have demonstrated the monophyly of each subfamily of Amaryllidaceae using chloroplast (cp) DNA sequence data4,5,6. Despite the morphological, anatomical, chemical, and molecular distinctiveness of Allioideae, its sister group remains controversial. Meerow et al.4 suggested that Allioideae is sister to the Agapanthoideae–Amaryllidoideae clade based on two cpDNA regions (rbcL and trnL-F), a result also reported by Costa et al.7 from a four-locus dataset. A more recent analysis of four cpDNA genes by Chen et al.6 found support for a sister relationship between Allioideae and Amaryllidoideae, in agreement with the results of Steele et al.5 and [...]

[...]31. In this study, different SSRs were identified among Allioideae that may be useful as molecular markers and in population genetic studies of Allium in particular and Allioideae in general (Tables S3 and S4). Furthermore, eight hotspot regions of cpDNA were identified, which can be used in future studies of interspecific relationships among Allium species (Table S2).
Another study of complete Allium plastomes reported a different set of genes with high nucleotide diversity (including ndhK, ndhE, ndhA, rps16, psaI, rpl22, rpl32, and trnK-UUU) from those identified in the present study15. These differences might be caused by different taxon sampling and by the insufficient number of samples in each study. Nevertheless, the results provide preliminary data on plastome nucleotide diversity for further studies that include all Allium taxa and aim to identify hotspot regions common across Allium.
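
As a rough illustration of what a nucleotide diversity comparison involves, the sketch below computes π (the mean proportion of differing sites over all sequence pairs) for a toy alignment. The function name, the toy sequences, and the rule of skipping gap or N sites are assumptions made for illustration; the text does not state which software or settings produced the reported diversity values.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Return pi: the mean proportion of differing sites over all pairs.

    Assumes the sequences are already aligned (equal length). Sites where
    either sequence of a pair has a gap ('-') or 'N' are skipped for that
    pair (an illustrative convention, not taken from the source).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    total = 0.0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in "-N" or y in "-N":
                continue
            compared += 1
            diffs += x != y
        if compared:
            total += diffs / compared
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Hypothetical toy alignment of three sequences, eight sites each.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi = nucleotide_diversity(toy_alignment)
```

Applying the same function to successive windows or gene regions of a plastome alignment, and ranking the results, is one simple way to flag candidate hotspot regions.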

Phylogenetic relationships of Allioideae

Our MP and BI analyses consistently recovered Allioideae as sister to Amaryllidoideae (Fig. 2). This result is in line with previous molecular phylogenetic studies of Amaryllidaceae5,6. By contrast, an analysis of nuclear ITS and plastid matK, ndhF, and rbcL data recovered Allioideae as sister to a clade of Amaryllidoideae and Agapanthoideae7. Although Allioideae has a superior ovary and solid style (vs. inferior ovary and hollow style in Amaryllidoideae), these characteristics are homoplasious in Asparagales32. Our phylogenomic study recovered Allieae as sister to the remaining tribes of Allioideae (Fig. 2). The unique position of Allieae is also corroborated by its synapomorphic gynobasic style (vs. terminal style in the other tribes). Tulbaghieae, sister to Leucocoryneae-Gilliesieae, can be distinguished by the presence of a corona in the flower. Moreover, pseudogenization of the cemA gene was detected only in Tulbaghieae. Gilliesieae and Leucocoryneae were strongly supported as sister tribes, in agreement with Sassone and Giussani2. This relationship is supported by several morphological characteristics, such as the terminal style position and the absence of a corona in the flower. In addition, both tribes are distributed in South America: Gilliesieae is restricted to Chile and Patagonia in Argentina, while Leucocoryneae occurs in Argentina, Chile, Bolivia, Peru, Paraguay, Uruguay, and Brazil. Therefore, the molecular phylogenetic relationships among tribes of Allioideae are supported by morphological and geographical evidence.

In the present study, Allium subg. Melanocrommyum and A. subg. Cyathophora were found to be non-monophyletic even though 74 protein-coding genes were used (Fig. 2). Previous molecular phylogenetic studies of Allium revealed the non-monophyly of some subgenera53.

Applying the Akaike information criterion, jModelTest v.2.1.754 assigned the GTR + I + Γ model of molecular evolution to the combined dataset. Four MCMC chains were run simultaneously and sampled every 1000 generations for a total of 20 million generations. We plotted the log-likelihood scores of the sample points against generation time using Tracer v.1.5 to check whether the log-likelihood values had reached a stable equilibrium; this indicated that stationarity was achieved after the first 2 million generations. In addition, we used the AWTY graphical system55 to compare split frequencies among runs and to plot cumulative split frequencies, confirming that stationarity was reached. The first 1000 (10%) sample trees from each run were discarded as burn-in, as determined using Tracer v.1.5. A maximum a posteriori tree was constructed by summarising the remaining trees from the parallel runs into a majority-rule consensus tree, yielding posterior probability (PP) values for each clade.
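
The way clade posterior probabilities arise from sampled trees can be sketched as follows: after burn-in is dropped, the PP of a clade is the fraction of retained trees that contain it, and a majority-rule consensus keeps clades found in more than half of them. The tree representation (each tree as a set of clades, each clade a frozenset of taxon labels), the function names, and the toy sample are assumptions for illustration only; this is not the consensus routine of any particular phylogenetics program.

```python
from collections import Counter


def clade_support(trees, burnin_percent=10):
    """Return {clade: posterior probability} from a list of sampled trees.

    Each tree is modelled as a set of clades; each clade is a frozenset of
    taxon labels (a hypothetical, simplified representation).
    """
    n_burnin = len(trees) * burnin_percent // 100
    kept = trees[n_burnin:]
    if not kept:
        raise ValueError("no trees left after burn-in")
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}


def majority_rule(support, threshold=0.5):
    """Keep only clades whose support exceeds the threshold."""
    return {clade: pp for clade, pp in support.items() if pp > threshold}


# Toy sample of 10 trees over hypothetical taxa A-D; the first is burn-in.
AB, CD, AC = frozenset("AB"), frozenset("CD"), frozenset("AC")
sampled = [{AC}] + [{AB, CD}] * 6 + [{AB}] * 2 + [{AC}]

support = clade_support(sampled, burnin_percent=10)
consensus = majority_rule(support)
```

In this toy sample, clade AB appears in 8 of the 9 retained trees and CD in 6 of 9, so both enter the consensus, while AC (1 of 9) does not.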

Molecular dating analysis

To estimate the divergence times of tribes in Allioideae, we used BEAST v.1.856 based on 74 cpDNA coding regions. The BEAUti interface was used to generate input files for BEAST, applying the GTR + I + Γ model, a Yule speciation tree prior, and an uncorrelated lognormal molecular clock model. Two MCMC runs of 200 million generations were performed, sampling every 1000 generations. Convergence to the stationary distribution was checked by visual inspection of the plotted posterior estimates using Tracer v.1.6. After discarding the first 20,000 (10%) trees as burn-in, the samples were summarised in a maximum clade credibility tree in TreeAnnotator v.1.6.1 using a PP limit of 0.50 and summarising the mean node heights. The mean and 95% HPD of each age estimate were obtained from the combined outputs using Tracer. The results were visualized using FigTree v.1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).
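
The burn-in figure follows from the chain settings, and the arithmetic can be checked with a few lines. The helper name below is hypothetical, and the count ignores the initial state that some programs also write to the log, so treat it as a sanity check on the reported numbers and not as a description of BEAST's internals.

```python
def mcmc_sample_counts(generations, sample_every, burnin_percent):
    """Return (total samples, burn-in samples, retained samples) per run."""
    n_samples = generations // sample_every
    n_burnin = n_samples * burnin_percent // 100
    return n_samples, n_burnin, n_samples - n_burnin


# Settings reported for the dating analysis: 200 million generations,
# sampled every 1000 generations, with 10% burn-in.
total, burnin, retained = mcmc_sample_counts(200_000_000, 1000, 10)
```

With these settings each run stores 200,000 trees, of which the first 20,000 are discarded and 180,000 are retained.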

Age calibrations were placed on the phylogeny of Allioideae and its close relatives. The crown node of Yucca-Hosta (C1 in Fig. 3) was constrained with a uniform distribution from 20.7 to 37.5 mya following McKain et al.57, who estimated the divergence time of Agavoideae using 69 cpDNA coding genes. Three further calibrations were applied as uniform distributions: from 50.0 to 67.4 mya for the stem group of Amaryllidaceae (C2); from 42.0 to 61.7 mya for the crown group of Amaryllidaceae (C3); and from 38.1 to 56.5 mya for the stem node of Allioideae (C4).
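
The four calibrations can be written down as a small data structure, which makes the bounds easy to check against a candidate node age. The class and field names are hypothetical; only the node labels and the age bounds come from the text. The log density shown is that of a plain uniform prior, given as a sketch of how such a constraint behaves.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class UniformCalibration:
    label: str
    node: str
    lower_mya: float
    upper_mya: float

    def contains(self, age_mya):
        """True if the age lies within the calibration bounds."""
        return self.lower_mya <= age_mya <= self.upper_mya

    def log_density(self, age_mya):
        """Log density of a uniform prior; -inf outside the bounds."""
        if not self.contains(age_mya):
            return -math.inf
        return -math.log(self.upper_mya - self.lower_mya)


CALIBRATIONS = [
    UniformCalibration("C1", "crown node of Yucca-Hosta", 20.7, 37.5),
    UniformCalibration("C2", "stem group of Amaryllidaceae", 50.0, 67.4),
    UniformCalibration("C3", "crown group of Amaryllidaceae", 42.0, 61.7),
    UniformCalibration("C4", "stem node of Allioideae", 38.1, 56.5),
]

# Example: which calibrations would admit a (hypothetical) age of 52 mya?
admitting = [c.label for c in CALIBRATIONS if c.contains(52.0)]
```

A uniform prior assigns equal weight to every age inside the interval and rejects any age outside it, so the bounds act as hard constraints on the dated nodes.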

Ancestral area reconstruction

Biogeographic data for species within Allioideae were compiled from distributions described in the literature and from herbarium specimens. The distribution range of Allioideae species and outgroups was divided into five areas: (A) Eurasia, (B) North America, (C) Africa, (D) South America, and (E) Australia. We coded each species based on its entire range, regardless of the geographic source of the sample. Ancestral areas and spatial patterns of geographic diversification within Allioideae were inferred using the Bayesian binary MCMC (BBM) method and statistical dispersal-vicariance analysis (S-DIVA) as implemented in RASP v.2.1b (Reconstruct Ancestral State in Phylogenies, formerly S-DIVA)58. The BBM was run using the fixed state frequencies model (Jukes-Cantor) with equal among-site rate variation for two million generations, with 10 chains and two parallel runs. In S-DIVA, the frequencies of ancestral ranges at a given node are averaged over all trees. For these analyses, we used all post-burn-in trees obtained from the BEAST analysis. The consensus tree used to map the ancestral distribution at each node was obtained from the stored trees using the Compute Condense option in RASP. The maximum number of ancestral areas was set to five.
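
The area coding described above can be sketched in a few lines: each species receives a string of area letters, and the maximum-areas setting determines how many candidate ancestral ranges exist. The function names and the example species range are hypothetical, and the sketch does not reproduce RASP's internal data format; it only shows the size of the state space implied by five areas.

```python
from itertools import combinations

# Area codes as defined in the text.
AREAS = {
    "A": "Eurasia",
    "B": "North America",
    "C": "Africa",
    "D": "South America",
    "E": "Australia",
}


def code_range(occupied):
    """Encode the areas a species occupies as a sorted letter string."""
    unknown = set(occupied) - set(AREAS)
    if unknown:
        raise ValueError(f"unknown area codes: {sorted(unknown)}")
    return "".join(sorted(set(occupied)))


def candidate_ranges(max_areas):
    """List every non-empty combination of at most max_areas areas."""
    codes = sorted(AREAS)
    return [
        "".join(combo)
        for k in range(1, max_areas + 1)
        for combo in combinations(codes, k)
    ]


# Hypothetical species occurring in South America and Eurasia.
example_code = code_range(["D", "A"])

# With the maximum number of areas set to five, all 2**5 - 1 ranges are allowed.
all_ranges = candidate_ranges(5)
```

Setting the maximum to five leaves all 31 non-empty area combinations available as ancestral ranges; a lower maximum would shrink that set (for example, 15 ranges with a maximum of two areas).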