Introduction

Voles constitute one of the youngest groups of rodents. Mountain voles, belonging to the genus Neodon (Rodentia: Cricetidae), which occur only in the Tibetan-Himalayan region (THR) (Fig. 1)26, along with 1X–15X whole-genome sequencing (WGS) data for each morphologically distinct taxon (Supplementary Data 2). This extensive dataset allowed the identification and description of six new species of Neodon. Our analyses showed that rapid climate change, complex topography and founder events resulting from dispersal were the key factors driving Neodon diversification and evolution. In addition, our de novo genome assembly revealed the genetic basis of the adaptations of mountain voles to high-altitude environments, characterised by pressures such as hypoxia, high UV radiation and low temperatures.

Results

Morphological evidence of six unidentified lineages of Neodon

We analysed 235 specimens of Neodon to examine the possible existence of new species of Neodon based on morphological evidence. We also generated WGS data for 48 specimens to explore their genetic divergence and potential taxonomic status (Supplementary Data 1 and 2).

Initial observations of skulls, teeth and bacular structures showed 15 distinct patterns (Fig. 2 and Supplementary Fig. 1), each representing a putative species of Neodon. These included eight described species1,24, one previously evaluated group with unclear status and six tentatively unidentified taxa. We recorded the characteristics of the genitalia and bacular structures for males, and dental, external and cranial measurements for both sexes (detailed abbreviations of the measurements are provided in Supplementary Data 3), and further conducted morphological comparisons of these 15 putative species. The morphology of the glans penis provided useful clues about the affinities of microtine species, and the differences in the characters of the glans penis clearly distinguished all putative species (Fig. 2). The pairwise Euclidean distances of dental measurements (e.g., the number of closed triangles on the first lower molar) also distinguished the 15 patterns (Supplementary Fig. 2 and Supplementary Data 4). In addition, principal component analysis (PCA) (Supplementary Fig. 3) and subsequent two-sided t-tests or Wilcoxon rank-sum tests (Supplementary Fig. 4) of 17 statistical measurements of external and cranial characteristics of 95 intact adults (Supplementary Data 5–7) also resolved all 15 putative species.
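As an illustration of the pairwise Euclidean distance comparison described above, the following minimal Python sketch computes distances between per-taxon vectors of dental counts. The taxon labels and values are hypothetical placeholders, not data from this study, and the function name is our own.

```python
import math
from itertools import combinations


def pairwise_euclidean(profiles):
    """Return Euclidean distances between every pair of measurement vectors.

    `profiles` maps a taxon label to a tuple of numeric measurements;
    all tuples must have the same length.
    """
    return {
        (a, b): math.dist(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical dental profiles: (closed triangles, inner angles, outer angles)
example_profiles = {
    "taxon_a": (3, 6, 5),
    "taxon_b": (5, 6, 4),
    "taxon_c": (4, 5, 4),
}

for pair, distance in pairwise_euclidean(example_profiles).items():
    print(pair, round(distance, 3))
```

Taxa with identical dental profiles give a distance of zero, so this measure separates taxa only when at least one character differs.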

Fig. 2: Comparison of morphological features.

a Comparison of tooth rows. b Comparison of glans penes. Numbered views are 1: glans; 2: midventral cut view; 3: urethral lappet; 4: dorsal papilla. Lettered structural features in a1 and a2 are a. distal baculum; b. outer crater; c. inner crater; d. ventral groove; e. glans; f. prepuce; g. penis body; h. station of dorsal papilla; i. lateral baculum (cartilage); j. urethral lappet; k. lateral baculum (bony part); l. distal baculum (bony part); and m. proximal baculum. The taxa are (from top to bottom) Neodon leucurus, N. fuscus, N. linzhiensis, N. forresti, N. irene from Clade 1, N. nyalamensis, N. sikimensis, unidentified taxon 1 (from Nanyi township, Milin County), unidentified taxon 2 (from Shergyla Mountains, Linzhi county), unidentified taxon 3 (from Motuo County, south of the Namchabarwa Mountains) from Clade 2, N. medogensis, unidentified taxon 4 (from Ridong village, Bershula Mountains, Chayu County), N. clarkei, unidentified taxon 5 (from Bomi County) and unidentified taxon 6 (from Chibagou National Nature Reserve, Chayu County) from Clade 3 (refer to Fig. 3 for clade information).

Molecular evidence for all lineages of Neodon

We generated a total of 241 Gb of 10X Genomics linked-reads (67.56X) for one specimen of Neodon sampled from north of the Yarlung Zangbo River on Shergyla Mountain (unidentified taxon 2), and produced a genome assembly with a total length of 2.25 Gb and a scaffold N50 of 10.85 Mb. In addition, we obtained a total of 620.45 Gb of reads for an additional 47 samples, with the amount of data generated from each sample ranging from 2.21 Gb to 43.38 Gb (Supplementary Data 2). A total of 4951 full-length single-copy orthologous gene groups were annotated in our reference genome using BUSCO27. The “nuclear gene set” obtained for all lineages after the removal of low-confidence genes consisted of a total of 4624 coding genes, with average, maximal and minimal lengths of 1885 nt, 23,046 nt and 222 nt, respectively (Supplementary Fig. 5).
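For readers unfamiliar with the scaffold N50 statistic reported above, the following minimal Python sketch shows how it is conventionally computed: the length of the scaffold at which the cumulative length, summed from the longest scaffold downwards, first reaches half of the total assembly length. The function name and toy lengths are illustrative assumptions and are unrelated to the Neodon assembly.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half of the
        # assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (total 290 bp, so half is 145 bp)
toy_lengths = [80, 70, 50, 40, 30, 20]
print(scaffold_n50(toy_lengths))
```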

We obtained complete mitogenomes for sequenced specimens and calculated Kimura 2-parameter genetic distances for mitochondrial protein-coding genes (Supplementary Fig. 6 and Supplementary Data 8–10). The results for cox1 and cytb, the two most widely used barcoding genes in mammals, showed average interspecific genetic distances of 11.00% for cox1 and 11.30% for cytb. Species delimitation methods (Bayesian implementation of Poisson Tree Processes (bPTP), automatic barcode gap discovery (ABGD) and BPP) based on mitochondrial or nuclear datasets recognised the same 15 species, including all six undescribed morphological lineages (Supplementary Figs. 7–10). Furthermore, the species delimitation results identified a split within N. sikimensis (Supplementary Fig. 7), and these specimens showed the greatest intraspecific genetic distances (average of 5.23% for cox1 and 5.20% for cytb). However, the specimens of N. sikimensis, which were collected in the same region at the same time, did not differ significantly in their morphology. This likely indicated the co-occurrence of divergent mitogenomes, as reported in the Asian elephant28 and other species. Further work is needed to distinguish between unabated gene flow (one species) and restricted gene flow (two species).

Phylogenetic analysis corroborates taxonomic status

The analyses generated eight phylogenetic trees, among which two were based on barcoding genes (Supplementary Fig. 11), two were based on 13 mitochondrial coding genes (Supplementary Fig. 7), and four were based on 4,624 nuclear genes (Supplementary Fig. 10). All nuclear trees generated by coalescent and concatenation approaches shared the same topology, with only small-scale incongruences being identified between the nuclear trees and the other four mitochondrial gene trees. We used the nuclear ASTRAL III tree as the species tree in downstream analyses (Supplementary Fig. 10a). The phylogenetic analysis recovered Neodon as sister to Lasiopodomys, sharing a common ancestor with Microtus and Alexandromys, and the species of Neodon formed three major clades.

Description of six new species

Multiple lines of evidence supported the recognition of six new morphological species. We therefore describe these species as follows (expanded descriptions in Supplementary Notes 1 and 2). All type series specimens have been deposited with the Sichuan Academy of Forestry:

Neodon namchabarwaensis Liu SY., Zhou CR., Murphy WR. & Liu SL., sp. nov. (unidentified taxon 1)

Holotype

Adult female, field number XZGB0818009, collected by Liao Rui on 10 August 2008. Specimen preserved as a skin, cleaned skull, and tissues. Skull, dentition and mandible in Supplementary Fig. 1a.

Type locality

Nanyi township, Milin County, south of Xizang, China, 94.15113° E, 29.17889° N, elevation 3160 m a.s.l.

Paratypes

Five topotype specimens (3♂♂, 2♀♀), field numbers: XZGB0817007♂, XZGB0818006♀, XZGB0818010♂, XZGB0828001♀, XZGB09N235♂.

Distribution

Known from south of the Yarlung Zangbo River, north of the Namchabarwa Mountains. The lowest elevation is 3130 m a.s.l.

Etymology

Species named for the famous Namcha Barwa Mountain, the highest mountain in this region where the new species occurs.

Diagnosis

Medium body, average length 114.9 mm (adult); average hind foot length 20.1 mm. Average tail length 46.4 mm, approximately 40.4% of HBL. First lower molar with 3 closed triangles in front of the posterior transverse space, 6 inner and 5 outer angles. 1st upper molar with 4 inner and 3 outer angles. 2nd upper molar with 3 inner and 3 outer angles. 3rd upper molar with 4 inner and 3 outer angles.

Neodon shergylaensis Liu SY., Zhou CR., Murphy WR. & Liu SL., sp. nov. (unidentified taxon 2)

Holotype

Adult male, field number XZGB09N195, collected by Liao Rui and Liu Yang on 30 May 2009. Specimen preserved as a skin, cleaned skull, penis and tissues. Skull, dentition, and mandible are in Supplementary Fig. 1b.

Type locality

Shergyla Mountains of Linzhi County, southeast of Xizang, China, 94.66174° E, 29.62368° N, elevation 4500 m.

Paratypes

Six specimens (1♂, 5♀♀), field numbers: LZRAP01013♂, LZRAP01020♀, LZRAP01014♀, LZRAP01019♀, XZGB09N197♀, GB0815001J♀.

Distribution

Known from north of the Yarlung Zangbo River at over 3160 m a.s.l., on both sides of the Shergyla Mountains and the Niyang River.

Etymology

The species is named for its type locality. This region supports a high biodiversity.

Diagnosis

Medium body, average length 115.7 mm (adult); average hind foot length 19.5 mm. Average tail length 42.7 mm, approximately 37% of HBL. First lower molar with 3 closed triangles in front of the posterior transverse space, 6 inner and 4 outer angles in 63% of specimens, 6 inner and 5 outer angles in 37% of specimens. 1st upper molar with 3 inner and 3 outer angles. 2nd upper molar with 3 inner and 3 outer angles. 3rd upper molar with 4 inner and 3 outer angles.

Neodon liaoruii Liu SY., Zhou CR., Meng GL. & Liu SL., sp. nov. (unidentified taxon 3)

Holotype

Adult male, field number XZ11117, collected by Liao Rui on 1 November 2011. Specimen preserved as a skin, cleaned skull, penis and tissues. Skull, dentition and mandible in Supplementary Fig. 1c.

Type locality

Motuo County, south of Xizang, China, 94.984° E, 29.47028° N, elevation 3260 m.

Paratypes

Ten specimens (4♂♂, 6♀♀), field numbers: MT11036♀, MT11066♀, MT11067♀, MT11109♂, MT11118♂, MT11120♂, MT11122♀, MT11142♂, MT11143♀, MT11144♀.

Distribution

Known from south of the Namchabarwa Mountains. Lowest elevation 2660 m a.s.l.

Etymology

Species epithet is a patronym for the collector, Mr. Liao Rui, who made an important contribution to our specimen collection.

Diagnosis

Relatively large body, average length 116.8 mm (adult); average hind foot length 21.1 mm. Average tail length 59.3 mm, approximately 50.8% of HBL. First lower molar with 3 closed triangles in front of the posterior transverse space, 6 inner and 5 outer angles. 1st upper molar with 3 inner and 3 outer angles. 2nd upper molar with 2 inner and 3 outer angles in 66% of specimens, and 3 inner and 3 outer angles in the other 34%. 3rd upper molar with 4 inner and 3 outer angles in 61% of specimens, and 3 inner and 3 outer angles in the other 39%.

Neodon bershulaensis Liu SY., Zhou CR., Liu Y. & Liu SL., sp. nov. (unidentified taxon 4)

Holotype

Adult male, field number XZ11010, collected by Liao Rui on 3 March 2011. Specimen preserved as a skin, cleaned skull, penis and tissues. Skull, dentition, and mandible in Supplementary Fig. 1d.

Type locality

Ridong village, Bershula Mountains, Chayu County, southeast of Xizang, China, 98.12407° E, 28.58392° N, elevation 3750 m a.s.l.

Paratypes

Three intact adult specimens, field numbers: CHYRD-03♀, CHYRD-04♂, CSD3825♂.

Distribution

Known from the type locality only, Ridong village, Chayu County, southeast of Xizang.

Etymology

Species epithet refers to the famous Bershula Mountains, at the foot of which lies the type locality, Ridong.

Diagnosis

Medium body, average length 107 mm (adult); hind foot length 18–20 mm (average 19 mm). Average tail length 51.5 mm, approximately 48.1% of HBL. First lower molar with 5 closed triangles in front of the posterior transverse space, 6 inner and 4 outer angles. 1st upper molar with 4 inner and 3 outer angles in 70% of specimens; the other 30% with 3 inner and 3 outer angles. 2nd upper molar with 3 inner and 3 outer angles. 3rd upper molar with 4 inner and 3 outer angles.

Neodon bomiensis Liu SY., Zhou CR., Meng GL. & Liu SL., sp. nov. (unidentified taxon 5)

Holotype

Adult male, field number XZ13015, collected by Liao Rui on 31 October 2013. Specimen preserved as a skin, cleaned skull, penis and tissues. Skull, dentition and mandible in Supplementary Fig. 1e.

Type locality

Bomi County, southeast of Xizang, China, 95.9575816° E, 29.82959° N, elevation 2900 m a.s.l.

Paratypes

Two intact adult specimens, field numbers: MT11304♀, XZ13016♀.

Distribution

Known only from the type locality, Bomi County, southeast of Xizang.

Etymology

Species epithet is derived from the county where the type series was collected.

Diagnosis

Medium body, average length 111.75 mm (adult); hind foot length 18–19 mm (average 18.75 mm). Tail length 53–56 mm (average 53.75 mm), approximately 48.1% of HBL. First lower molar with 4 closed triangles in front of the posterior transverse space, 6 inner and 4 outer angles in 60% of specimens; the other 40% with 5 inner and 4 outer angles. 1st upper molar with 4 closed triangles after the anterior transverse space, forming 3 inner and 3 outer angles. 2nd upper molar with 3 inner and 3 outer angles, the last inner angle being very small. 3rd upper molar with 3 inner and 3 outer angles.

Neodon chayuensis Liu SY., Zhou CR., Liu Y., Tang MK. & Liu SL., sp. nov. (unidentified taxon 6)

Holotype

Adult female, field number CY37, collected by Liu Yang on 8 October 2007. Specimen preserved as a skin, cleaned skull and tissues. Skull, dentition, and mandible in Supplementary Fig. 1f.

Type locality

Chibagou National Nature Reserve, Chayu County, southeast of ** more than 50% with the adaptor sequence, with a maximum of 1 bp mismatch to the adaptor sequence); or (3) more than 30% of the read length below Q30 (Supplementary Data 2).

A total of 1,612.36 million paired-end reads were generated with 10X technology for Neodon shergylaensis sp. nov., and the reference genome was assembled using SuperNova v2.1.158, with a preset genome size of 2.50 Gb and a weighted mean molecule size of 18.39 kb. One of the two pseudohaplotypes generated using SuperNova with “--maxreads = ‘all’ --accept-extreme-coverage, --style = pseudohap2” was used to obtain the core gene set using BUSCO v3.0.127. In addition, the assembled genome was annotated using MAKER2 v2.31.1059,60 (control files can be found in Supplementary Data 16) for further evolutionary analysis. We also assembled and annotated the mitogenome of each sample using MitoZ61 with ~3 Gb (~1X) of filtered data.

Gene dataset construction

A read-mapping-then-consensus-calling pipeline was used to obtain orthologous nuclear genes for each sample. For this purpose, (1) we obtained complete and single-copy orthologues of the N. shergylaensis sp. nov. genome using BUSCO v3.0.127 with a database of the Euarchontoglires group (6192 genes, v2). We then removed orthologues with high homology to each other (BLASTn v2.6.0+ with e value < 1e-5)62 to avoid mapping uncertainty in the subsequent steps, deleted orthologues with internal stop codons, extracted the corresponding genomic regions for the remaining qualified orthologues with custom Python scripts and used the residual data as a reference for the resequenced samples. (2) We mapped the WGS data of each sample onto the reference using BWA-MEM v0.7.171 (Supplementary Fig. 17 and Supplementary Data 18 and 19). The “mapping-derived” gene dataset was used in subsequent analyses.

Species delimitation

We calculated the Kimura 2-parameter genetic distances between lineages for each gene using the dist.dna function in R ape v1.1-168 and explored the correlation between genetic distance and geographic distance. Geographic distances between different sampling sites were calculated with the geopy.distance.geodesic function in the Python geopy v2.0.0 package69 (https://github.com/geopy/geopy). Pearson correlation coefficients were calculated with the cor function in R and plotted using ggpubr70. In addition to morphological identification, we conducted species delimitation analysis using the clustering-based method bPTP71 based on both the mitochondrial and nuclear trees. We also applied the similarity-based ABGD72 and the multispecies coalescent (MSC) model-based BPP v4.3.873 analysis using only the mitochondrial data. We also applied both the A1074,75 and A1176 analyses implemented in BPP to the 31 Neodon specimens using mitochondrial data. Specimens of each morphological species were grouped into the same population, including 1 to 5 specimens of each morphological species. Species delimitation was performed with a user-specified guide tree in the A10 analysis, while species delimitation and species tree inference were jointly calculated in the A11 analysis (detailed in Supplementary Note 1).
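The Kimura 2-parameter distance used above can be illustrated with a short Python sketch of the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. This is not the dist.dna implementation: the function name, the skipping of gaps and ambiguous bases on a per-pair basis, and the toy sequences are all assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
NUCLEOTIDES = PURINES | PYRIMIDINES


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguity codes are skipped.
        if a not in NUCLEOTIDES or b not in NUCLEOTIDES:
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P formula")
    return -0.5 * math.log(term)


# Hypothetical 10-bp alignment with a single transition (A<->G)
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

With one transition in ten sites, P = 0.1 and Q = 0, so the distance is -0.5 ln(0.8), slightly larger than the raw proportion of differing sites because the model corrects for multiple substitutions.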

Phylogenetic inference

Both coalescent and concatenation methods were used to infer phylogenetic trees. We inferred the best maximum likelihood trees using RAxML v8.2.1277 with the GTR + GAMMA model from 20 independent tree searches and 500 bootstrap replicates for each gene, and then obtained the final species tree using ASTRAL-III78 based on the multispecies coalescent model, with the bootstrap support of each node being estimated by the multilocus resampling method79. SVDquartets (parameters of “eval Quartets = 1e + 6 bootstrap = standard”) implemented in PAUP v4.0a16780,81 was also utilised to estimate the species tree with the same dataset to validate the results. Additionally, we concatenated the gene alignments to generate a “supergene” alignment for each species and obtained species trees using IQ-tree v1.6.1282, RAxML or MrBayes83. Phylogenetic trees were inferred from two gene sets: the “mitochondrial gene set” and the “nuclear gene set”. All trees inferred from the nuclear dataset showed the same topology, and we thus re-estimated the branch lengths of the final species tree in units of substitutions per site using ExaML v3.0.2184.

Divergence time estimation

We estimated the divergence times of lineages of Neodon based on the second codon sites of nuclear genes using MCMCTree, a Bayesian relaxed clock method implemented in the PAML v4.9h package30. For estimation, an approximate likelihood calculation of the ‘REV’ (GTR, model = 7) model was applied, and multiple fossil calibration points taken from records in the Paleobiology Database (accessed 12 December 2018)85 and the TimeTree database86 were included, as follows: (a) the root age was set as 7.9 Mya, as supported by the occurrence of Promimomys in the fossil record;87 (b) the splits of Lasiopodomys and Neodon, Eothenomys and Myodes were calibrated as <0.53 Mya based on the earliest occurrence of the oldest fossil record of Myodes from the Paleobiology Database and another fossil record of Promimomys;87 (c) the split of Eothenomys was set as 2.7–5.3 Mya based on the fossil calibration point of Eothenomys (3.6–2.6 Mya);88,89 and (d) the split of Microtus and Alexandromys was dated between 0.6 and 3.5 Mya based on previous studies90,91. The minimum boundary was supported by the earliest occurrence of Allophaiomys in the fossil record in the database. BaseML first estimated a prior substitution rate, and MCMCTree then generated the gradient and Hessian matrices with the following settings: ‘correlated rates clock’ (clock = 3), overall substitution rate (rgene_gamma) set to G(1, 12.0), and rate drift parameter (sigma2_gamma) set to G(1, 4.5). Next, we conducted two independent MCMC runs with different random seed numbers and a burn-in of 500,000 iterations to check for convergence. Each run was sampled every 1000 iterations until 500,000 samples had accumulated. We also applied Tracer v1.7.197 was applied to perform KEGG and GO enrichment annotation. Protein structure was predicted using SWISS-MODEL98.

Statistics and reproducibility

The morphometric variation in non-sex-related measurements of adult specimens was analysed using PCA in SPSS v17.0. We employed Kaiser-Meyer-Olkin and Bartlett’s tests to check the suitability of the data for PCA, followed by Tukey’s test. Independent-samples two-sided t tests or Wilcoxon rank-sum tests were also performed to check the differences between the taxon pairs after PCA. The significant positively selected genes were confirmed using a Bonferroni test. Reproducibility was confirmed by performing analyses with independent replicates (for morphological analyses), five hundred bootstrap replicates or different coalescent and concatenation approaches as described in the Methods section.
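The Bonferroni step mentioned above is not detailed further here. As a minimal sketch, assuming it refers to the standard Bonferroni correction for multiple comparisons, each of m tests is compared against a threshold of alpha/m. The function name, gene labels and p-values below are hypothetical and do not come from this study.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests that remain significant after Bonferroni correction.

    `p_values` maps a test label (e.g. a gene name) to its raw p-value.
    Each p-value is compared against alpha divided by the number of tests.
    """
    if not p_values:
        return {}
    threshold = alpha / len(p_values)
    return {name: p <= threshold for name, p in p_values.items()}


# Hypothetical raw p-values for three candidate genes
raw_p = {"gene_a": 0.001, "gene_b": 0.02, "gene_c": 0.04}
print(bonferroni_significant(raw_p))
```

With three tests and alpha = 0.05, the per-test threshold is about 0.0167, so only gene_a remains significant in this toy example.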

Nomenclatural Acts

This published work and the nomenclatural acts it contains have been registered in ZooBank, the proposed online registration system for the International Code of Zoological Nomenclature (ICZN). The ZooBank LSIDs (Life Science Identifiers) can be resolved and the associated information viewed through any standard web browser by appending the LSID to the prefix “http://zoobank.org/”. The LSID for this publication is: urn:lsid:zoobank.org:pub:794808AA-EA46-4E86-B482-9983214688BB.

The LSID for Neodon namchabarwaensis Liu SY., Zhou CR., Murphy WR. & Liu SL., sp. nov. is: urn:lsid:zoobank.org:act:8B19E76E-2E5F-452E-A94B-0824DB45CB30

The LSID for Neodon shergylaensis Liu SY., Zhou CR., Murphy WR. & Liu SL., sp. nov. is: urn:lsid:zoobank.org:act:811C522A-2B13-48EE-A8B4-3B758E3EB129

The LSID for Neodon liaoruii Liu SY., Zhou CR., Meng GL. & Liu SL., sp. nov. is: urn:lsid:zoobank.org:act:D4E07979-F92F-4825-BA6B-BB5B6881F9FD

The LSID for Neodon bershulaensis Liu SY., Zhou CR., Liu Y. & Liu SL., sp. nov. is: urn:lsid:zoobank.org:act:A72C6927-1269-4E65-8183-6F48D86F06E9

The LSID for Neodon bomiensis Liu SY., Zhou CR., Meng GL. & Liu SL., sp. nov. is: urn:lsid:zoobank.org:act:445E9955-1D43-41E9-AE51-71A3CDFDB28D

The LSID for Neodon chayuensis Liu SY., Zhou CR., Liu Y., Tang MK. & Liu SL., sp. nov. is: urn:lsid:zoobank.org:act:0F26DDC2-C279-4DE6-AAF2-B1E4C9917B6F

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.