Introduction

Mountain systems contain an exceptionally high proportion of terrestrial biodiversity, and many narrowly endemic species1,2,3,4. When scaled by area, they are more diverse than lowland regions, including those renowned for high biodiversity such as the Amazon basin2. A defining feature of most major mountain systems is the presence of an alpine zone: a high-elevation biome that is climatically unsuitable for tree growth5. The alpine zone is characterised by distinctive plant communities adapted to life above the treeline that contribute significantly to overall mountain plant biodiversity5,6. Understanding how and why alpine plant communities have evolved is therefore key to unpicking the drivers of mountain biodiversity as a whole. Until now, a lack of robust species level molecular phylogenies for alpine clades has constrained efforts to address this issue.

Alpine species in any given region can arise in one of three ways7,8,7,8,18,19, and even in the alpine zone dispersal has been shown to be prevalent in some contexts. For example, regional-scale analyses22,23.

Alpine speciation includes ecological speciation driven by the intense habitat heterogeneity of mountain systems. This is likely to be prevalent in Lupinus in the Andes11,24,25, where different species exhibit a range of growth forms indicative of ecological specialisation24 and increased rates of adaptation have been shown to occur at coding genes25. Alternatively, the topographical heterogeneity of mountain systems can lead to highly fragmented habitats, potentially promoting allopatric speciation26. Despite these drivers of alpine speciation, the limited spatial extent and extreme conditions of the alpine zone may reduce the available niche space, potentially limiting opportunities for alpine speciation27. The dynamism of mountain systems adds further complexity, with climatic and geological changes underpinning dramatic alterations to the configuration and extent of different habitats through space and time5,10,28,29,30. This could promote speciation through the generation of niches and fragmentation of species ranges, yet also inhibit speciation by preventing stable populations from persisting long enough to become separate species. These conflicting forces are likely to explain the contrasting and context specific conclusions of previous studies of alpine speciation, with some estimating elevated rates of alpine speciation relative to background speciation rates and directly linking orogeny to these elevated rates25,33,34, and the estimation of macroevolutionary parameters35,36,37,38,39,40, mean that this knowledge gap can be addressed.

Here, we present a species-level time-calibrated phylogenetic tree for the angiosperm genus Saxifraga, which primarily occurs in mountain systems33. This phylogenetic tree is estimated from 329 low copy nuclear loci, incorporates 407 species (73% of the 557 species of Saxifraga41) and is analysed alongside a database of regional occurrence and biome preference information with a biogeographical model. This enables three questions to be addressed concerning the role of upslope biome shifts, mountain hopping, and alpine speciation: (1) What is the relative importance of upslope biome shifts and inter-regional mountain hopping for explaining the occurrence of alpine lineages in different regions? (2) Do geological or climatic changes affect the relative importance of these processes through time? (3) Are upslope biome shifts associated with surges in speciation rates? The widespread distribution of Saxifraga (Fig. 1b), coupled with its species richness, existence both inside and outside the alpine zone (Fig. 1c–n), and relatively ancient (late Cretaceous/early Paleocene) origin42, make it an ideal system for investigating processes underpinning the diversity of the alpine zone at broad spatial and temporal scales.

Overall, we show that upslope biome shifts have occurred at a higher rate than inter-regional mountain hopping in Saxifraga. This pattern is especially pronounced in the last 5 Myr, implying that global climatic cooling and temperature fluctuations during this period have led to increased pressure and/or opportunity to adapt to the alpine zone. Further, high speciation rates within Saxifraga are not associated with adaptation to the alpine zone.

Results

Phylogenetic inference

The final species tree estimated in ASTRAL43 contains 580 tips, 491 of which belong to Saxifraga. These represent a total of 407 species, 73% of the species diversity of the genus (Supplementary Fig. 1)41. The phylogenetic tree is broadly congruent with previous trees based on ribosomal (ITS) and plastid DNA33,42. In particular, major clades delimited in33 are recovered with high support, although there are some topological changes within and among these clades (Fig. 2a; Supplementary Fig. 1). There are high levels of gene-tree-species-tree conflict, with the most congruent gene trees sharing approximately 30% of nodes with the inferred species tree topology. Gene-tree-species-tree conflict is particularly pronounced among some recently diverged lineages (Supplementary Fig. 1).

Fig. 2: Phylogeny and speciation dynamics in Saxifraga.

a Estimated time-calibrated phylogeny, with branch colours showing lineage-specific speciation rates estimated in BAMM, and the matrix referring to the biomes and regions inhabited by each species. Pli. Pliocene, Q. Quaternary. b Mean speciation rate through time estimated in BAMM. Red line is the posterior mean estimate, blue shaded area is the posterior distribution, darker shades correspond to a higher posterior probability than lighter shades. c Posterior distribution for the speciation rate in each of the three biome categories from the ClaSSE analysis. Raw sequence reads are deposited in the Sequence Read Archive, distribution data are available on Zenodo, the biome and regional assignment database is available in Supplementary Data File 3.

Divergence time estimation

Analyses using treePL44 resulted in a crown node age estimate for Saxifraga of ~67 Ma, close to the Cretaceous-Palaeogene boundary (Fig. 2a) and consistent with previous results based on few nuclear and plastid loci42. This estimate was also consistent across time-calibrated phylogenies estimated under a range of different assumptions, except when no maximum constraints were used at internal nodes, where the crown node age estimate is in the mid-Cretaceous (Supplementary Table 1, Supplementary Fig. 2). Most major clades within Saxifraga are estimated to have originated throughout the Miocene (Fig. 2a), although infrageneric divergence time estimates vary somewhat between methods (Supplementary Table 1, Supplementary Figs. 2, 3).

Speciation rate estimation

Estimation of lineage-specific speciation rates with BAMM39 supports two distinct speciation rate increases (Fig. 2a, b): one ~30 Ma at the origin of the clade containing sections Porphyrion, Saxifraga, and Ciliatae, and a further nested speciation rate increase within section Porphyrion ~ 8 Ma. This configuration of rate shifts (or very similar configurations) receives the highest posterior support, although several alternative configurations are recovered with considerably lower support (many of which are very similar to the best configuration) (Supplementary Fig. 4). Estimation of biome specific speciation rates with the ClaSSE model highlighted that generalists (those inhabiting both the alpine zone and non-alpine biomes) have the highest speciation rate. The rate for such lineages is around four-fold higher than for alpine specialists (corresponding to alpine speciation rates) and species that do not occur in the alpine zone (Fig. 2c).

Rates of upslope biome shifts and inter-regional mountain hopping

The estimated rate of upslope biome shifts is higher than that of inter-regional mountain hopping (Fig. 3). This pattern is especially pronounced during the most recent 5 Myr (Fig. 3a, f). Notably, the increased rate of upslope biome shifts is temporally concordant with falling global temperatures in the Pliocene, and temperature fluctuations in the Pleistocene (Fig. 3a). By contrast, major orogenic events within the distribution range of Saxifraga substantially predate the timing of increased rates of upslope biome shifts.

Fig. 3: Rates of upslope biome shifts and inter-regional mountain hopping.

a Posterior mean rate of upslope biome shifts and inter-regional mountain hopping through time, as well as global climate through time (estimated by deep sea oxygen isotope records)52. The vertical dotted lines delimit the time intervals that are referenced in (c–f). Pli. Pliocene, Q. Quaternary. b–f Posterior distributions of rates of upslope biome shifts and inter-regional mountain hopping from the ClaSSE model, with (b) showing rates across the entire time-calibrated phylogeny, (c) showing rates at times over 15 Ma, (d) showing rates 15 Ma or less, but over 10 Ma, (e) showing rates 10 Ma or less, but over 5 Ma, and (f) showing rates 5 Ma or less. Distribution data are available on Zenodo, the biome and regional assignment database is available in Supplementary Data File 3.

The increased rate of upslope biome shifts in the most recent 5 Myr (Fig. 3a, f) also significantly post-dates speciation rate increases estimated in BAMM (Fig. 2a, b). This finding, alongside the result from the ClaSSE analysis where the speciation rate for alpine specialists is lower than that of generalists (Fig. 2c), indicates that specialisation to the alpine zone is not associated with speciation rate increases.

Topological uncertainty in the species tree could potentially bias estimates of upslope biome shifts and inter-regional mountain hopping and their changing rates through time. An incorrect topology could disrupt the association of clades with certain regions or biomes, and thus lead to overestimation of rates of dispersal or biome shifts. Although it is not computationally feasible to account for this comprehensively, for example by repeating the ClaSSE analysis over a distribution of species tree topologies, we highlight that the species tree topology is well supported, and that there is no decrease in the level of support (indicating an increase in topological error) during the last 5 Myr (Supplementary Fig. 5). The increased rate of upslope biome shifts in recent times is therefore unlikely to result from species tree estimation error.

Discussion

Overall, our results suggest that alpine species in different regions are primarily derived from non-alpine lineages via upslope biome shifts, a process that has intensified substantially within the last 5 Myr. This conclusion is based on the overall estimated rate of upslope biome shifts being ~3 times higher than that of inter-regional mountain hopping (Fig. 3b), and the estimated rate of upslope biome shifts increasing markedly within the last 5 Myr (Fig. 3a, f). The higher rate of upslope biome shifts is especially notable given our analytical framework is biased toward inter-regional mountain hopping, with lineages being able to transition between five regions but only two biome categories (Fig. 1). The recent increase in upslope biome shifts is temporally concordant with, and thus potentially driven by global climatic cooling and increased climatic instability in the Pliocene and Pleistocene (Fig. 3a). Conversely, it is temporally disconnected from, and thus unlikely to be driven by major orogenic events within the distribution range of Saxifraga.

We also show that specialisation to the alpine zone is not associated with higher speciation rates: the ClaSSE analysis estimates the highest speciation rates for biome generalists (Fig. 2c), whilst significant speciation rate increases estimated in BAMM substantially pre-date increased rates of upslope biome shifts (Figs. 2a, b, 3). Rapid alpine speciation does not therefore appear to be a major driver of Saxifraga diversity within the alpine zone. Further studies are necessary to determine whether the patterns presented here are replicated in other clades, and the processes that drive potential similarities and differences that exist between clades.

Upslope biome shifts are more important than inter-regional mountain hopping

Biome shifts and dispersal are fundamentally different mechanisms for assembling regional biotas and because of this their relative importance has been debated extensively45,46,47. Recently, there has been a greater emphasis on dispersal and the closely associated pattern of phylogenetic niche conservatism13,14,15 in which dispersing lineages maintain similar habitat preferences regardless of their geographical location.

However, this study illustrates the importance of upslope biome shifts rather than dispersal of pre-adapted lineages from other regions in Saxifraga (Fig. 3). Limited dispersal within Saxifraga is intuitive given the lack of specialisations for dispersal within the genus48, and the fact that many sections are near endemic to a particular region (Fig. 2a). Further, previous analyses have highlighted that long-distance dispersal in Saxifraga is rare42, whilst regional scale analyses from other groups have demonstrated the importance of upslope biome shifts7,8,10. This study builds on this work by quantifying the relative rate of upslope biome shifts and dispersal at a broader spatial scale and within a robust phylogenomic framework.

The emphasis on upslope biome shifts does not preclude more prevalent dispersal at narrower spatial scales, as has been demonstrated in several settings including dispersal within and between the Himalayas and Qinghai–Tibet Plateau and within European mountain systems8,

Concluding comment

Using phylogenomic data for a major mountain plant clade of the Northern Hemisphere, we illustrate the recency with which alpine species have evolved from non-alpine ancestors. Specifically, we suggest that Pliocene and Pleistocene climatic cooling caused a surge in upslope biome shifting as alpine habitats expanded repeatedly. Alpine habitats are now contracting due to rapid ongoing anthropogenic climate warming, and there is evidence that this is already leading to widespread extinction of alpine plant diversity, a pattern that will likely continue in the future57,58. However, even in the unlikely event that extinction rates remain constant in alpine habitats, our study suggests that the recruitment of new lineages from lowland floras will wane as alpine habitats shrink59. Therefore, alpine plant diversity is likely to be lost if anthropogenic climate change continues unabated.

Methods

Taxon sampling

DNA samples were collected from wild populations, living collections, and herbaria, with the aim being to maximise the number of Saxifraga species that were sampled. Collection efforts followed a species list based on Plants of the World Online41. DNA samples taken from wild and living collections were desiccated in silica gel. However, most samples were sourced from historic collections from nineteen herbaria, with this material also being stored in silica gel prior to DNA extraction. The recorded collection dates of these vouchers ranged up to 186 years before DNA extraction. Herbarium voucher information is given in Supplementary Data File 1. Additional raw sequence data were included from ref. 55 (Supplementary Data File 2), to include a broad sampling of Saxifragales. Taxon selection outside Saxifraga was based on the percentage of overlapping loci that were available for representatives of relevant clades for setting node age constraints.

Probe design for target capture

Locus selection

High-throughput sequencing through targeted enrichment60,61 was used because the method is highly efficient with fragmented DNA from herbarium collections62,63. Biotinylated oligonucleotide baits (probes) were redesigned for loci previously selected as putative low-copy genes throughout the order Saxifragales55. The Saxifraga DNA sequence data resulting from55 showed that probes designed to work universally among Saxifragales had unequal retrieval rates among Saxifraga s.s. More specific probes may therefore produce a higher target capture efficiency within the genus. In addition, higher capture efficiency is warranted due to the strong reliance on herbarium-sourced DNA samples. To supplement the pool of exon regions to select from, reference exon regions that were previously selected for targeted enrichment probes for the closely related genus Micranthes (Saxifragaceae)34 were also included. Among the targeted exons, the 301 loci from the ‘Saxifragales targets’ and the 295 loci from the ‘Micranthes targets’ shared 101 loci, through an initial screening with the ‘Map to reference’ function in Geneious v8.0.5 (www.geneious.com). The remaining 495 unique loci were further used to assess capture efficiency and orthology within Saxifraga.
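The count of 495 unique loci follows from inclusion-exclusion on the two target sets. A minimal Python sketch of that arithmetic is given here; the function name is illustrative and is not part of the original workflow, which screened the loci in Geneious.

```python
def unique_locus_count(n_first: int, n_second: int, n_shared: int) -> int:
    """Return the size of the union of two locus sets (inclusion-exclusion)."""
    if n_shared < 0 or n_shared > min(n_first, n_second):
        raise ValueError("shared loci must lie between 0 and the smaller set size")
    return n_first + n_second - n_shared


# Counts reported in the text: 301 'Saxifragales targets', 295 'Micranthes
# targets', and 101 loci shared between the two sets.
n_unique = unique_locus_count(301, 295, 101)
print(n_unique)
```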

To evaluate the performance of all loci from the Saxifragales and Micranthes probes in Saxifraga, pre-generated targeted enrichment sequencing results from ref. 55 were used. Sequence divergence within Saxifraga was accounted for through the incorporation of target-capture reads from six species, representing different sections: S. fortunei Hook. (sect. Irregulares), S. magellanica Poir. (sect. Saxifraga), S. parnassifolia D.Don (sect. Ciliatae), S. pulchra Engl. & Irmsch. (sect. Porphyrion), S. rebunshirensis Sipliv. (sect. Bronchiales), and S. rotundifolia L. (sect. Cotylea). HybPiper v1.3.1 (ref. 64) was then used to align the trimmed reads to the 495 target loci, using the standard options for mapping through BWA.

Divergence time estimation with treePL44 requires an input phylogenetic tree with molecular branch lengths, and a set of minimum and maximum node age constraints. Several different input phylogenetic trees and sets of fossil calibrations were used, as described above. Cross-validation was used to select the optimal smoothing value (the parameter in treePL that describes among-branch-substitution-rate-variation). However, alternative smoothing values were also used to quantify the effects of different assumptions about among-branch-substitution-rate-variation. A summary of all estimated time-calibrated phylogenetic trees is shown in Supplementary Table 1.

Following divergence time estimation, subsequent analyses were primarily based on a time-calibrated phylogenetic tree designated as "main" (Supplementary Table 1). This tree is considered to incorporate the most accurate set of divergence time estimates for several reasons. First, it uses the branch-wise method that specifically samples parts of the gene trees that are topologically congruent with the species tree, unlike gene shopping, which incorporates parts of the dataset that are topologically incongruent with the species tree. Second, the implemented fossil calibrations have been assigned following careful evaluation of the literature, and careful consideration of theoretical issues relating to the difference between fossil ages and clade ages. Third, cross-validation has been shown to be relatively effective at estimating the extent of substitution rate variation27. However, rates of upslope biome shifts, mountain hopping, alpine speciation, and lineage-specific speciation were also estimated in the time-calibrated phylogeny that differs most significantly from "main", designated "no maximum" (Supplementary Figs. 7–9, Supplementary Note 1). The sensitivity of key biological conclusions to alternative divergence time estimates could therefore be determined.

Occurrence data

Georeferenced occurrences were downloaded from the Global Biodiversity Information Facility (GBIF) on 30th June 2021 (ref. 78). These were cleaned using functions from the CoordinateCleaner R package79 that remove specimens if they are georeferenced to capital cities, the equator, geographic centroids of countries or regions, GBIF headquarters, or biodiversity institutions; or if they are duplicated, have equal latitude and longitude values, or zero latitude or longitude values. This dataset was augmented with herbarium specimens from Royal Botanic Gardens, Kew (K). In some cases, incorporation of specimens from K used specimens that were not georeferenced. Instead, specimen labels were interpreted in order to assign specimens to a region.
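Three of the cleaning criteria listed above (duplicates, equal latitude and longitude, and zero coordinates) are simple enough to illustrate directly. The Python sketch below is not the CoordinateCleaner implementation, which is an R package and also applies spatial tests against cities, centroids, and institutions; the function name, record layout, and coordinates are all made up for illustration.

```python
from typing import List, Tuple

# (species, latitude, longitude); all coordinates below are invented examples.
Record = Tuple[str, float, float]


def basic_coordinate_filter(records: List[Record]) -> List[Record]:
    """Drop records with zero or equal coordinates, and exact duplicates."""
    seen = set()
    kept = []
    for record in records:
        _, lat, lon = record
        if lat == 0 or lon == 0:   # zero latitude or longitude
            continue
        if lat == lon:             # equal latitude and longitude
            continue
        if record in seen:         # exact duplicate of an earlier record
            continue
        seen.add(record)
        kept.append(record)
    return kept


example_records = [
    ("Saxifraga oppositifolia", 46.5, 8.0),
    ("Saxifraga oppositifolia", 46.5, 8.0),   # duplicate
    ("Saxifraga oppositifolia", 0.0, 8.0),    # zero latitude
    ("Saxifraga paniculata", 12.3, 12.3),     # equal latitude and longitude
    ("Saxifraga paniculata", 47.1, 11.4),
]
cleaned = basic_coordinate_filter(example_records)
print(cleaned)
```

Only the first and last example records pass all three checks.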

Definition of regions and biomes

For computational reasons, the ClaSSE analysis (see below) was only feasible for a maximum of five geographic regions. Five regions were defined that reflect historical dispersal barriers, climate, and current discontinuities in the distribution of Saxifraga42. Following Ebersbach et al.42 the Northern Hemisphere was divided into three main continental areas: North America, Asia, and Europe. Unlike Ebersbach et al.42 this study does not have a specific focus on biogeographic connections among Asian mountain ranges. Asia was therefore treated as a single region. However, the Caucasus, situated between Europe and Asia, was treated as a separate region. This area hosts 22 species of Saxifraga that are separated geographically from their congeners in Europe and Asia by >1000 km. This separation is far larger than any spatial discontinuity in any other region. South America, with only two species of Saxifraga that likely originated from North America42, was not recognised as a separate region but merged with North America. Similarly, the African continent is inhabited by very few species, including just one unique species, and was omitted from the study. Because the Arctic, being ecologically highly similar to the alpine zone, can potentially serve as an important stepping stone for dispersal in alpine taxa80, it was included as a region. The following five regions were therefore included: Americas, Asia, Caucasus, Europe, Arctic.

Within each region, two biomes were delimited: alpine (defined as areas at an elevation above which trees do not grow, thus including the sometimes-distinguished nival zone) and non-alpine. However, in the Arctic region only the “alpine” biome was recognised due to the ecological similarity of Arctic lowlands with lower-latitude alpine habitat. In practical terms, this was achieved by not allowing “Arctic non-alpine” taxa to occur in the subsequent analyses. Taken together, this relatively coarse categorisation of biomes and regions is appropriate for the purposes of this study in that it provides a basis for investigating the processes that underpin diversity in the alpine zone at broad geographical scales.
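The region and biome scheme described above can be enumerated as a short Python sketch. This only lists the permitted single-region and single-biome combinations, with "Arctic non-alpine" excluded; it is an illustration of the categorisation, not the state space of the ClaSSE model itself, which also has to represent biome generalists and is not reproduced here. The names `REGIONS`, `BIOMES`, and `allowed_region_biome_pairs` are assumptions made for this example.

```python
from itertools import product

# Regions and biomes as named in the text.
REGIONS = ["Americas", "Asia", "Caucasus", "Europe", "Arctic"]
BIOMES = ["alpine", "non-alpine"]


def allowed_region_biome_pairs():
    """List every region/biome pair except the disallowed Arctic non-alpine."""
    return [
        (region, biome)
        for region, biome in product(REGIONS, BIOMES)
        if not (region == "Arctic" and biome == "non-alpine")
    ]


pairs = allowed_region_biome_pairs()
print(len(pairs))
```

Under this scheme there are nine permitted pairs: two for each of the four non-Arctic regions, plus Arctic alpine.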

Assignment of species to regions and biomes

Specimens were assigned to one of the five regions based on their occurrence data, or, for some of the additional specimens from K, based on interpretation of the specimen label. Regional assignments for species based on georeferenced specimens were then also manually checked following extensive consultation of the relevant literature (Supplementary Data File 3). Meanwhile, biome assignments for species were performed manually based on the expertise of the authors of this study, following extensive consultation of the relevant literature (Supplementary Data File 3), and with reference to the same occurrence data as for the regional assignments. Each species was designated as alpine, non-alpine, or both. Importantly, when assigning species, the primary focus was the dominant biome preference of a given taxon. Therefore, for species which were primarily biome specialists, but occasionally occur in an alternative biome, these species were classified as specialists (alpine or non-alpine) rather than being placed in the “both” category. When assigning species to biomes, decisions were communicated and discussed extensively amongst the authorship. This reduced the risk that assignments were biased by potential differences in how the authors conceptualise different biomes. Although imperfect, this overall approach was the most appropriate given there is an insufficient number of georeferenced specimens with sufficiently accurate and precise coordinates to reliably assign species to biomes based on such coordinates. The final dataset of geographical and biome assignments contained 344 species, of which 97, 106, and 141 were classified as alpine, non-alpine, and both respectively (Supplementary Data File 3).

Estimating geographical shifts, biome shifts, and speciation using a ClaSSE model

Prior to performing ClaSSE analyses, the time-calibrated phylogeny estimated above was pruned such that it only incorporated the 344 species with biome and regional assignments. Cladogenetic and anagenetic events included in the ClaSSE model are shown in Supplementary Fig. 6. The model incorporates separate speciation, extinction, and anagenetic rates for alpine, non-alpine, and generalist lineages. Sympatric, subset-sympatric, and vicariant speciation events are incorporated. The ClaSSE model was implemented in RevBayes v1.1.2 (ref. 40).

Rates of upslope biome shifts were based on the number of inferred transitions from biome generalists to alpine zone specialists, whilst rates of inter-regional mountain hopping were based on the number of times alpine specialists immigrated to a new region. To obtain rates, these values were divided by the time interval under consideration (either the time interval incorporated by the entire dated tree, or more specific time intervals as indicated in Fig. 3).
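The rate calculation described above is a count of inferred events divided by the duration of the interval. A minimal Python sketch follows, assuming interval bounds expressed in Ma; the function name and the event counts are hypothetical and are not results from this study.

```python
def event_rate(n_events: float, t_old: float, t_young: float) -> float:
    """Return events per Myr for the interval from t_old to t_young (in Ma)."""
    duration = t_old - t_young
    if duration <= 0:
        raise ValueError("t_old must be older (larger) than t_young")
    return n_events / duration


# Hypothetical event counts for the most recent interval (5 Ma to present).
upslope_rate = event_rate(12, 5.0, 0.0)   # generalist -> alpine specialist
hopping_rate = event_rate(3, 5.0, 0.0)    # alpine specialist enters new region
print(upslope_rate, hopping_rate)
```

In a Bayesian setting the same division would be applied to the event counts from each posterior sample, which yields a distribution of rates per interval rather than a single value.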

Lineage-specific diversification rate estimation

Lineage-specific speciation and extinction rates were estimated in BAMM v2.5.0 (ref. 39). These analyses were performed with the same pruned phylogeny as was used with the ClaSSE model. Appropriate priors were designated with the setBAMMpriors function from the R package BAMMtools81, the prior number of rate shifts was set to 1, and a global sampling fraction of 0.62 was specified. The analysis was run for 10 million generations, sampling every 50 generations. The first 1 million generations were discarded as burn-in.
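The run settings above imply an approximate number of retained posterior samples, and the sampling fraction is consistent with the species counts given elsewhere in the text. The Python sketch below works through both; the helper name is illustrative, exact sample counts may differ by one depending on whether the initial state is logged, and the derivation of 0.62 as 344 of 557 species is an assumption rather than something the text states.

```python
def mcmc_sample_counts(n_generations: int, sample_every: int, burnin_generations: int):
    """Return (total, discarded, retained) sample counts for an MCMC run."""
    total = n_generations // sample_every
    discarded = burnin_generations // sample_every
    return total, discarded, total - discarded


# Settings reported in the text: 10 million generations, sampled every 50,
# with the first 1 million generations discarded as burn-in.
total, discarded, retained = mcmc_sample_counts(10_000_000, 50, 1_000_000)

# Assumption: the global sampling fraction is the 344 species in the pruned
# tree divided by the 557 species recognised in Saxifraga.
sampling_fraction = round(344 / 557, 2)
print(total, discarded, retained, sampling_fraction)
```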

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.