Background

Phylogeography has been widely used to study processes of diversification in the context of the geological and climatic history of a geographic region [1]. California has been the focus of a large number of phylogeographic studies, largely because of the great diversity of species inhabiting the state and the complex geological processes active in the region [2–4]. The California Floristic Province has been designated as one of the world’s top 25 hotspots of biodiversity, and the state of California encompasses 70 % of this province [4, 5]. Endemism in the California Floristic Province is high, with 71 of the 584 (12 %) vertebrates in the region being endemics [5]. Phylogeographic studies of individual species in this region have often revealed high levels of phylogeographic structure [6–9] and the presence of cryptic species [10–13], which suggests that there are even greater levels of biodiversity in the region than previously recognized.

The California Floristic Province is home to 44 species of salamander, of which 33 are endemic to the province, according to the taxonomy in AmphibiaWeb (2015). The salamanders in this region have been the focus of many phylogeographic studies. Some studies have discovered additional cryptic species that have added to the endemism of the region [13–15]. Other studies have found high levels of divergence among geographically structured clades [6, 16, 17], many of which deserve protection as evolutionarily significant units [18]. The high levels of phylogeographic structure observed in these studies can be explained by the fact that salamanders in general have very low levels of dispersal. Salamanders may thus be particularly sensitive indicators of climatic and geologic factors that can lead to vicariance. Therefore, phylogeographic studies of salamanders are likely to be particularly informative about historical barriers to gene flow that have contributed to diversification in California.

Phylogeographic studies are especially effective when multiple forms of evidence are used to infer the history of a species. Mitochondrial DNA is widely used for phylogeography because it is particularly good at detecting historical relationships among populations, due to its low effective population size, high mutation rate, and lack of recombination [19, 20]. However, a gene tree based upon any single genetic region may not reflect the true history of a species because of incomplete lineage sorting or introgression [21]. One solution to this problem is to collect data for multiple loci, which allows researchers to utilize species tree methodologies for phylogenetics and can help control for any factors that only affect individual loci when inferring phylogeographic structure [22–24]. A second approach for interpreting phylogeographic patterns is to use species distribution models to generate hypotheses about where genetic breaks could have occurred due to historical range fragmentation in past climatic conditions [25, 26]. A third approach is to observe whether genetic breaks within a particular species are also found in other species, which would suggest that shared environmental barriers to gene flow have led to vicariance in a community of species [2, 27]. Our goal is to use a combination of multi-locus data, species distribution modeling, and comparative phylogeography to better understand the evolutionary history of a species of salamander endemic to the California Floristic Province, the Arboreal Salamander (Aneides lugubris).

The Arboreal Salamander is a fully terrestrial, nocturnal, direct-developing, lungless salamander in the family Plethodontidae [28]. The most common coloration is a brown or tan back with yellow spots, although the number and size of the spots vary between populations (Fig. 1). Arboreal Salamanders on South Farallon Island were previously recognized as a distinct subspecies, A. l. farallonensis (Farallon Salamander; Van Denburgh, 1905 [29]), primarily on the basis of the presence of larger yellow spots than in most mainland populations [30]. Arboreal Salamanders are associated with oak woodland habitats, but they are also found in ecotonal regions and even treeless areas. They are well adapted for climbing, with well-developed limbs, a prehensile tail, and long toes. Arboreal Salamanders often deposit their eggs in holes of live oak trees; eggs have been found as high as 9 meters, and the species has been found as high as 18 meters in the canopy [31]. Aneides lugubris is unique among Aneides, and in fact among all plethodontid salamanders, in being hyperossified with many unique osteological features, and in being heavily muscularized [32]. The species occurs in the Coast Ranges from Humboldt County in northern California to a southern limit in the vicinity of Ensenada in Baja California Norte, Mexico, with disjunct populations in the western-central Sierra Nevada Mountains, on South Farallon Island, Año Nuevo Island, Catalina Island, and Los Coronados Islands off of northwestern Baja California (Fig. 2) [33, 34]. Despite its extensive range, A. lugubris has not yet been the focus of a comprehensive phylogeographic study. Some phylogeographic data were collected for A. lugubris as part of a study of nine codistributed Californian species [35]. This previous study examined relationships between county-level units of A. lugubris using an unrooted mtDNA tree, but the sampling was limited and it did not take into account the possibility of multiple genetic clades occupying the same county [35]. In addition, an unpublished allozyme study suggested that the Sierra Nevada population is genetically distinct and that the Farallon Island population is most closely related to populations in nearby Marin County [36].

Fig. 1

Aneides lugubris from (a) Lake Co., (b) Solano Co., (c) the Sierra Nevada Mtns., and (d) Santa Cruz Co. (Photos: Mitchell Mulks). Note the pronounced geographic variation in spotting among populations. Note also that the salamander in panel D has a head wound that likely resulted from agonistic interaction with another A. lugubris

Fig. 2

Relief map of California showing the estimated range of Aneides lugubris in beige, and the localities of genetic samples as circles. The colors of each sample locality correspond to the clade colors in Fig. 3

Phylogeographic study of A. lugubris is particularly relevant because of its potential for tracking the history of an important ecological assemblage in California. The complex and highly diversified California flora is derived from Tertiary formations, of which three are generally recognized: (1) plant formations of northern origin, widely distributed in a circumpolar pattern (the Arcto-Tertiary Geoflora); (2) formations that originated to the south, associated with the long trend of aridification in Mexico and the southwestern United States (the Madro-Tertiary Geoflora); and (3) formations derived from tropical regions well to the south (the Neotropical Geoflora) [37, 38]. Lowe (1950) developed a biogeographic hypothesis for species of Aneides grounded in the long history of the Arcto-Tertiary Geoflora [39]. All species of Aneides, except A. lugubris, are associated with coniferous and broad-leafed deciduous forests of northern origin. For example, species of Aneides in northwestern California are associated with redwood and mixed conifer forests, as well as with some broad-leafed associates such as Madrone, Tanbark Oak, Big-leaf Maple and Black Oak. Aneides lugubris is broadly sympatric with A. flavipunctatus and A. vagrans in southern Humboldt and western Mendocino counties. However, A. lugubris differs from its congeners in that its geographic distribution extends far to the south, well into regions dominated by Madro-Tertiary geofloral derivatives, in particular live oaks and sycamores. It is the only west coast species of Aneides found south of Monterey Bay; in southern parts of its range and on islands, it can be found in very open, unforested habitats. Thus, phylogeographic study of A. lugubris can provide insight into the processes that have shaped diversification of salamanders in the southern regions of California.

In this study, we analyzed the phylogeographic structure and population genetics of the Arboreal Salamander using both mitochondrial and nuclear sequence data with three main goals. First, we determined the number of distinct genetic clades, their geographic distributions, and their phylogenetic relationships in order to identify the major genetic breaks within the species. Second, we identified the potential causes of these biogeographic patterns by using species distribution modeling, comparative phylogeography, and considering the geological history of the region. Third, we considered the conservation implications of our data in terms of demarcating evolutionarily significant units and management units within the species [18, 40].

Results

Data characteristics

We obtained mtDNA data for 78 individuals (Fig. 2; Additional file 1: Table S1) and our dataset consisted of 1452 base pairs of aligned sequence data, with 884 bp of the ND4 gene and 568 bp of the cytb gene. We collected nuclear data for 18 A. lugubris and 3 outgroup taxa and obtained sequences for five genes from all individuals except for one A. lugubris (MVZ213101), where low quality DNA prevented the collection of data for three nuclear loci. Our nuclear dataset contained a total of 48 variable sites.

Phylogenetic analyses

A time-calibrated Bayesian phylogenetic analysis conducted in BEAST [

Fig. 3

Time-calibrated Bayesian phylogeny of the ND4 and cytb mitochondrial genes for Aneides lugubris. Numbers at nodes represent posterior probability support followed by bootstrap support after the backslash. The tree was pruned of the outgroup species to better visualize the relationships within A. lugubris

Our mtDNA phylogeny identified the probable source populations of the disjunct Sierra Nevada and Farallon Island populations. The Sierra Nevada samples form a monophyletic group that is sister to samples in Marin County. The Farallon Islands population forms a monophyletic group that is sister to samples from Central Sonoma County.

Both our Bayesian and maximum likelihood trees recovered the six clades with strong support. However, the relationships among the six clades were not well supported in the mtDNA trees. The six mitochondrial clades ranged from 2.1 % to 3.6 % sequence divergence from one another (Table 1). None of the clade-specific Tajima’s D statistics were significant, although all values were negative. We could not calculate Tajima’s D for the Pinnacles population because of inadequate sample size.
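As an illustration of the two summary statistics reported here, the following is a minimal Python sketch of uncorrected sequence divergence (the proportion of differing sites between aligned sequences) and Tajima’s D. It is not the software used in this study: the function names and the toy sequences are hypothetical, the alignment is assumed to contain no gaps or ambiguity codes, and D is computed with the standard constants of Tajima’s test.

```python
from itertools import combinations
from math import sqrt


def p_distance(seq_a, seq_b):
    """Uncorrected divergence: proportion of aligned sites that differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length, non-empty")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def mean_between_group_distance(group_a, group_b):
    """Average uncorrected divergence over all between-group pairs."""
    dists = [p_distance(a, b) for a in group_a for b in group_b]
    return sum(dists) / len(dists)


def tajimas_d(seqs):
    """Tajima's D for a list of aligned sequences (no gaps assumed)."""
    n = len(seqs)
    if n < 4:
        # The variance term of D is zero for n < 4, so D is undefined.
        raise ValueError("Tajima's D needs at least four sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (variable) alignment columns.
    seg_sites = sum(len(set(col)) > 1 for col in zip(*seqs))
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pair_diffs = [sum(a != b for a, b in zip(x, y))
                  for x, y in combinations(seqs, 2)]
    pi = sum(pair_diffs) / len(pair_diffs)

    # Standard constants for the expected variance of (pi - S/a1).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical toy data, not sequences from this study.
toy_clade = ["AAAAA", "AAAAT", "AAATA", "AATAA"]
print(tajimas_d(toy_clade))
print(mean_between_group_distance(["ACGTACGTAC", "ACGTACGTAT"], ["ACGTTCGTAC"]))
```

In this formulation the variance term is zero when fewer than four sequences are available, so the statistic cannot be evaluated for very small samples. A negative value, as in the toy example, reflects an excess of low-frequency variants relative to neutral expectations.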

Table 1 Average uncorrected sequence divergence and Tajima’s D statistic between mitochondrial lineages based on the ND4 gene

Species tree analyses

In order to generate a species tree, we had to assign our 18 individuals with nuclear sequence data to pre-specified groups before the analysis. In general, we did this by grouping samples that were in the same mtDNA clade. However, we deviated from this approach in two ways. First, we treated our Sierra Nevada sample as a distinct group, because we were interested in resolving the phylogenetic relationships of this disjunct population using data from both mtDNA and nuclear genes. Second, we treated the sample from northwest San Benito Co. (SBR266 from the Santa Cruz clade) as a distinct group because of its location at a contact zone.

Our species tree analyses, using all of our genetic data (5 nuclear and 2 mtDNA genes), and analyses that utilized only our nuclear loci, converged on the same tree topology (Fig. 4a, b). These multi-locus species tree analyses resolved the relationships among clades more fully than the mtDNA phylogeny. The basal split in both species trees occurred approximately 2.2 Ma, between a southern/central coast group and everything else (PP = 1, Fig. 4a, b). The Southern and Central Coast clades split ~1.3 Ma and are sister to each other with high support (PP > 0.95). The Northern clade split ~1.1 Ma from the clade containing the Sierra Nevada and SF Bay clades (mtDNA + nDNA PP = 0.92; nDNA PP = 0.98), which split from each other ~0.5 Ma. Both trees utilizing all nuclear genes recover a sister relationship between the Santa Cruz and San Benito groups (PP ≥ 0.95), which diverged ~0.8 Ma. Confidence intervals for major node ages from our tree utilizing all seven loci are found in Additional file 3: Table S2.

Fig. 4

Species trees for Aneides lugubris using (a) two mtDNA + five nDNA loci, (b) five nDNA loci, and (c) two mtDNA + two nDNA loci. Sample localities of each clade can be seen on the map in the lower right corner. The trees were pruned of the outgroup species to better visualize the relationships within A. lugubris

The genetically distinct Pinnacles sample could not be included in the above analyses because we were unable to obtain data for this population for three of the nuclear genes. Therefore, we analyzed a dataset that included two mtDNA plus two nuclear loci so that we could infer the relationship of our Pinnacles sample to the rest of the range. The resulting tree (Fig. 4c) finds the Pinnacles sample most closely related to the Southern clade, although support values were generally lower across the tree because of the reduced dataset.

Genetic clustering analysis

Our genetic clustering analysis of five nuclear genes supports the existence of three genetic clusters (Fig. 5). One includes all individuals assigned to the Northern and SF Bay/Sierra Nevada mtDNA clades (Fig. 5, blue points). A second is located in the Santa Cruz Mountains (green points), and a third is located in southern and central California (red points). The mixed ancestry of some salamanders in the vicinity of Monterey Bay suggests that gene exchange is in progress among all three genetic clusters in that region.

Fig. 5

(a) Population structure of Aneides lugubris, depicted as a bar plot in which each horizontal bar represents one individual and the proportion of each color corresponds to the probability that the individual is assigned to each of three genetic clusters. (b) Spatial arrangement of the three genetic clusters; locality numbers correspond to the bar plot numbers, and the beige shaded area is the estimated range of A. lugubris

Species distribution modeling

Our species distribution model (Fig. 6c) effectively captures the current occurrence of A. lugubris, as measured by the area under the curve (AUC) values (training data AUC = 0.950, test data AUC = 0.927). The six Bioclim variables with the highest permutation importance to the model were mean temperature of driest quarter, precipitation of coldest quarter, temperature annual range, mean temperature of the coldest quarter, minimum temperature of the coldest month, and annual mean temperature. In general, the distribution model closely matched the species occurrence data, but it also suggested two areas of potential occurrence where there are no species records. One area is in the Sierra Nevada, north of where the species is known to occur. The other area is in the Central Valley, between the San Francisco Bay and the Sierra Nevada.
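For readers unfamiliar with the AUC statistic, the following minimal Python sketch shows one common way to compute it: as the probability that a randomly chosen presence locality receives a higher suitability score than a randomly chosen comparison (background) point, with ties counted as one half. This is an illustration only; the function name and the scores are hypothetical, and evaluating against background points is an assumption, not a description of how the values above were obtained.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: chance that a presence outscores a background point."""
    if not presence_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0   # presence ranked above background
            elif p == b:
                wins += 0.5   # ties count as half a win
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical suitability scores, not values from this study.
presence = [0.9, 0.8, 0.4]
background = [0.5, 0.3, 0.2, 0.1]
print(auc(presence, background))
```

A value of 0.5 corresponds to a model that ranks presences no better than chance, and a value of 1.0 to one that ranks every presence above every background point. Comparing the statistic on training and held-out test localities, as done above, gives a rough check on overfitting.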

Fig. 6

Species distribution models for Aneides lugubris. A species distribution model (panel c) was constructed using present climatic conditions and the locality points shown in the inset (both testing and training points shown). The distribution model was then projected to the climatic conditions of the last glacial maximum ~22,000 years ago (a) and the mid-Holocene ~6,000 years ago (b). The model was also projected to the inferred climatic conditions in the year 2070 under the Community Climate System Model 4 (CCSM4) global climate model with representative concentration pathway (RCP) 4.5 (d). Colors toward the red end of the spectrum indicate greater climatic suitability and colors toward the blue end of the spectrum indicate lower climatic suitability

A projection of the species distribution model to the climatic conditions of the last glacial maximum indicates a general reduction in favorable climatic conditions during this time (Fig. 6a). However, three potential refugia with relatively high climatic suitability are recognized, one in a broad area centered around the San Francisco Bay Area, another in southern California from Los Angeles to Baja California Norte, and the third in a region around San Luis Obispo. By the Mid-Holocene, the climate in the coastal areas where A. lugubris currently occurs was projected to be nearly as favorable as the present (Fig. 6b). However, conditions in the Sierra Nevada were still relatively poor.

A projection of the species distribution model to the climatic conditions in 2070 under a moderate climate change scenario suggests that favorable conditions will be maintained from the central coast to the northern part of the species distribution (Fig. 6d). However, climatic conditions are predicted to deteriorate in those portions of the Sierra Nevada where the species is known to occur. In addition, southern California is predicted to face poor climatic conditions for persistence, except in some of the mountainous areas.