Background

Organization of metazoan mitochondrial genomes (mitogenomes) is usually conserved [1]. Notwithstanding, many cases of genome reorganization (e.g., rearrangement, duplication and loss) occur in closely related animals [24]. Typically, the mitogenome of metazoans encodes 13 protein-coding genes along with 2 rRNAs and 22 tRNAs. The 13 protein-coding genes, all of which play vital roles in the respiration chain, producing 95% of the adenosine triphosphate (ATP) required for cellular energy through oxidative phosphorylation [5]. The function of these proteins results in the distribution of mitochondrial DNA (mtDNA) diversity being far from random [6]. Evidence for adaptive evolution of mtDNA genes exists in some vertebrate lineages, such as mammals and reptiles [79].

Many studies have indicated that protein-coding genes of the mitogenome experienced adaptive or purifying selection, yet the role that selective pressure plays on reorganization of the mitogenome is subject to debate. On the one hand, mitogenomic organization may evolve neutrally [10]. Dowton et al. [2] characterized 67 gene arrangements in the Hymenoptera and suggested that tRNAs occupy selectively neutral positions. Furthermore, Boussau et al. [11] proposed that mitogenomic structural evolution (i.e., gene duplication) was influenced by population size. The genomic duplication is more likely to occur in lineages where the efficiency of selection had been reduced and the ratio of nonsynonymous to synonymous substitution (dN/dS) increased. These findings indicate that mitogenomic reorganization accompanies lower or relaxed selection, and that fixation of the structural alteration is nonadaptive [11, 12]. On the other hand, positive selection could also act on gene order in the mitogenome. The location of highly transcribed RNAs (such as 12S and 16S rRNA) is adjacent to transcriptional regulatory elements in the control region (CR) [13]. Significant correlations between codon usage and tRNA positions in vertebrate mt genomes (e.g., [14]), suggest that frequently transcribed tRNA genes, such as hydrophobic residues, also occur close to the CR due to functional efficiency. Loss of a duplicated gene may not occur randomly, but rather retention may depend on the distribution of the copies [15]. Thus, changes in tRNA gene positions could owe to adaptive selection [14].

Variation in number of tRNA genes may influence codon usage bias in mitogenomes. Genome reorganization of mtDNA has been linked to variation in the number of tRNA genes [16]. Translational selection may drive the coevolution of tRNA genes and codon usage [17]. Positive correlations between tRNA abundance and codon usage bias have been observed in some unicellular (e.g., Saccharomyces cerevisiae) and multicellular (e.g., Caenorhabditis elegans) eukaryotic genomes [18, 19]. In mitochondria, oxidative phosphorylation often requires a few cytosolic tRNAs encoded by nuclear DNA [20], and these imported tRNAs could compensate for changes in the number of mt tRNA genes. Consequently, the influence asserted by changes in the number of tRNA genes and the role played by selection on mitogenome rearrangements remains elusive [21]. An understanding of selective constraints on mitogenomic reorganization might provide some clarity on the mechanisms that underlie mitogenome evolution. Unfortunately, little is known about how gene duplications, losses, and rearrangements of tRNA genes influence codon usage of mt genes due to the paucity of examples of mitogenomic reorganization within closely related species. Among vertebrates, anurans (especially neobatrachians) facilitate revealing the relationship between tRNA genes positions and codon usage due to the high levels of gene rearrangements [2224]. The variation also allows for explorations into how variation in tRNA gene copy number influences codon usage bias.

Most mitogenomes of non-neobatrachians (e.g., Archaeobatrachia) have the vertebrate ancestral gene order (AGO) [25, 26]; Leiopelma archeyi and Leptolalax pelodytoides are exceptions [24, 27]. In contrast, rearranged gene orders (RGO) characterize neobatrachians [28, 29], which share LTPF clusters (trnL-trnT-trnP-trnF) resulted from rearrangements of four typical vertebrate tRNA genes [2931]. Further, recurrent gene rearrangements involve duplications and/or losses of CR and tRNA genes [3, 3235]. These findings contrast with the proposition that vertebrates possess highly conserved mitogenomic organizations [1, 36].

Accelerated mt substitution rates in protein-coding genes occur in neobatrachians. Such could be a consequence of relaxed purifying selection in the ancestor of neobatrachians [23, 37]. However, highly rearranged mt genomes and high rates of nucleotide substitution are not significantly correlated [23, 37]. Studies on the correlation between tRNA gene rearrangements and codon usage could provide new insights into the role played by selection on mitogenomic reorganization.

Herein, we sequence the mt genome of three congeneric species of Glandirana and three hotspots of gene rearrangements across 20 species in the Ranoides. We supplement these data with sequences from GenBank to examine the relationship between tRNA gene arrangements and codon usage in anurans. Our results suggest selection does not favor any specific rearrangement of gene position. Further, tRNA gene duplications or losses do not appear to influence codon usage bias. Our findings shed light on the non-adaptive evolution of mitogenomic reorganization, at least in neobatrachians.

Results and discussion

trnS (AGY) pseudogene of Glandiranaand gene rearrangement in Ranidae

We failed to sequence the complete mt genome of Glandirana rugosa, G. emeljanovi, and G. tientaiensis (Table 1, [GenBank: KF771341–KF771343]) due to long, highly repetitive nucleotides in the CRs. All three mtDNA genomes contained 13 protein-coding genes, two rRNA genes, 21 tRNA genes [Ψ trnS (AGY)], and non-coding regions (Additional file 1). The mt genomes of Glandirana were arranged identically to those typical of other neobatrachians, except for the presence of trnS (AGY) pseudogenes (Additional files 1, 2). Coding genes were similar in length to their counterparts in other anurans. Differences in mtDNA genome size and organization of Glandirana owed to the size of repeat-sequences in the CR (Additional file 1).

Table 1 Data for samples of employed Ranidae and mt regions sequenced in this study

We re-designed a pair of PCR primers that were conserved in Glandirana (data not shown) to amplify this region [GenBank: KF771278–KF771280] for confirming the presence of trnS (AGY) pseudogenes. All three mitogenomes in Glandirana included 62–63 bp of a non-coding sequence downstream of trnH in the typical location of trnS (AGY). The primary sequence of trnS (AGY) in Glandirana was very similar to those in the other frogs (Figure 1), though anticodons differed from the typical canonical sequence GCT and among the three species (CCC in G. rugosa; CTA in G. emeljanovi; and TCA in G. tientaiensis). With the exception of the original position of trnS (AGY), homologous fragments were not found. As previously reported in G. rugosa [22], pseudogene trnS (AGY) occurred in all of our Glandirana. This occurrence constituted a synapomorphy for the three species.

Figure 1
figure 1

Aligned sequences for the segment of coding trnS (AGY) of Pelophylax nigromaculata , Odorrana tormota , Buergeria buergeri , and the corresponding segment of Glandirana . The anticodons of trnS (AGY) shown in the frame.

Our study identified gene rearrangements in three ranid hotspot fragments (Table 1). The typical gene order of the trnWtrnY block was trnW, trnA, trnN, origin of light strand replication (OL), trnC, and trnY (WANCY). In some species of Amolops and Odorrana, three gene rearrangements differed from the consensus order across vertebrates. The tandem duplication–random loss (TDRL) model provided a plausible mechanism for these rearrangements [38, 39]. The hypothesized duplicated region in the mitogenome of Amolops chunganensis, A. granulosus, A. kangtingensis, A. loloensis, and A. mantzorum included a partial fragment of trnA, all of trnN and OL, and a partial fragment of trnC (Figure 2a). The inferred duplications in the mitogenome of Odorrana schmackeri included all of trnN, OL, trnC, trnY, and partial cox1 (~264 bp) (Figure 2b). The hypothesized duplicated region in the mtDNA of O. versabilis included a partial fragment of trnA, and all of trnN (Figure 2c). In the WANCY fragments, gene rearrangements were detected for O. schmackeri, O. versabilis and the five species of Amolops, but differences in the duplicated fragments and the position of the pseudogenes or superfluous gene copies suggested these features originated separately (Figure 2). The WANCY region was reported to be a hotspot for gene order rearrangement in amphibians [39, 40], and our discovery was consistent with these findings.

Figure 2
figure 2

Putative mechanism of gene rearrangement of the mitochondrial sequences according to a model of tandem duplication of gene regions and subsequent gene deletions. WANCY gene rearrangement in (a) Amolops chunganensis, A. granulosus, A. kangtingensis, A. loloensis, and A. mantzorum, (b) Odorrana schmackeri, and (c) O. versabilis.

All analyzed species of Odorrana shared a translocation of trnH, which moved to CR from between nad4 and trnS (AGY). This result was consistent with the previous observations [22, 32]. The original trnH in Babina pleuraden also become a pseudogenes. Owing to shared patterns, Kakehashi et al. [41] proposed that the trnHtrnE block was duplicated in the ancestral lineage of Babina and Odorrana. However, Babina okinavana, B. holsti, and B. subaspera have functional trnH genes in the ancestral position [41]. These results suggest that random losses the duplications may drive the differences in gene order in closely-related species.

An inter-genic spacer occurs between nad5 and nad6 in Babina adenopleura and Odorrana schmackeri at lengths of 457 bp and 306 bp, respectively. We cannot identify the noncoding sequences and no evidence suggests they were homologous fragments. This evidence indicates independent origins.

Genomic features and phylogenetic relationship

The final concatenated alignment of our mtDNA dataset for 50 species contained 10836 nucleotide positions, including 7361 variable sites of which 6746 were potentially parsimony-informative. Maximum likelihood (ML) and Bayesian inference (BI) methods of phylogenetic reconstruction obtained in the same tree topologies for 13 mt protein-coding genes. The trees differed only in branch lengths (Figure 3). Monophyly of the Neobatrachia was supported by our work and previous studies [42, 43], while the Archaeobatrachia was paraphyletic [44, 45]. The major clades of frogs (Figure 3) were consistent in recent morphological and molecular analyses [4648].

Figure 3
figure 3

A maximum likelihood phylogeny of the derived from 13 coding protein of mtDNA sequences for Anura. Bayesian inference obtained the same topology. Numbers above the lines or beside the nodes were rapid bootstrap proportions calculated with 1000 replicates and Bayesian posterior probabilities, respectively. The branch lengths are to scale. Mt genomic features that are useful markers for inferring relationships, and genomic features of each species are labeled on tree as follows: (A) LTPF cluster; (B) trnS (AGY) pseudogene; (C) translocation of trnH; (D) translocation of nad5; (E) LTPF changed into TLPF cluster; (F) duplication of CR and trnM; (G) duplication of trnM; (H) translocation of trnE and loss one copy of trnM; (I) trnA, trnN, trnC and trnE gene loss; (J) trnQ, trnA, trnN, trnC, trnY, and trnP gene loss; (K) translocation of nad5; (L) translocation of trnP; (M) translocation of trnL(CUN); (N) WANCY changed into WACYN; (O) translocation of nad5, trnE, and trnP.

The aligned ranid data contained 1322 nucleotide positions. Of these sites, 677 were variable and 535 were potentially parsimony-informative. Figure 4 depicted the phylogeny of ranids based on 12S and 16S rRNA. ML and BI analyses produced identical trees. All neobatrachian families (Ranidae, Dicroglossidae, Rhacophoridae, Mantellidae, and Microhylidae) formed a clade and monophyly of each ranid genus was well supported. Within the Ranidae, our analyses recovered a sister taxa relationship between Rana + Lithobates and Odorrana + Babina (Figure 4), which was consistent previous studies [48]. However, the phylogenetic relationships among Amolops, Glandirana, Pelophylax, and Hylarana conflicted with other hypotheses [42, 49, 50]. Our results located Glandirana as the sister taxon of Amolops, but with weak Bayesian support (BPP = 80). Analyses of the mt protein-coding genes and the 12S and 16S rRNA (Figures 3, 4) did not support monophyly of section Pelophylax (including the subgenera Pelophylax and Rugosa [Glandirana]) as proposed by Dubois [51]. This corresponded to the previous view that Pelophylax was polyphyletic [50].

Figure 4
figure 4

A strict consensus tree from the Bayesian inference analyses derived from the sequences of the genes 12S and 16S rRNA for Ranidae. Numbers above the lines or beside the nodes were Bayesian posterior probabilities. The mt genomic features of each species on the tree are as follows: (A) translocation of trnH; (B) trnE transposition and CR-psH-S1-nad5; (C) WANCY rearrangement; (D) transposition of trnG-nad3 block and duplication of CR; (E) trnS (AGY) pseudogene; (F) translocation of trnC.

We mapped genomic features of neobatrachian mtDNA on the phylogeny (Figures 3, 4) to provide additional data for inferring history. Generally, gene rearrangements have been considered to be relatively rare, random events and, thus, they constituted useful synapomorphies [5254]. Four tRNA genes, trnL (CUN), trnT, trnP, and trnF, were rearranged to form the LTPF cluster (labeled “A” in Figure 3), which was a synapomorphy for the Neobatrachia. Descendants shared unique gene rearrangements in ancestral lineages and/or additional rearrangements. For example, all the descendants of Dicroglossidae shared the trnM duplication [28, 33, 47, 55] (Figure 3). The Rhacophoridae and Mantellinae shared the translocation of nad5 [56, 57] (Figure 3). Finally, the five species of Amolops formed a clade and all species shared the same WANCY gene rearrangement (Figure 4).

Gene rearrangements comprised distinct genomic characters for some genera (Figures 3, 4). For example, Glandirana was associated with genomic features derived from the pseudogene trnS (AGY), Odorrana was associated with structural trnH translocations, and Pelophylax retained the ancestral condition of typical LTPF. The derived features of the mitogenomes could serve as useful indicators of phylogenetic relationships, especially in lineages gene sequence data often lead to ambiguous results [42, 48]. Duplication of the trnHtrnE block supports the sister relationship of Babina and Odorrana [41] (Figure 4). Duplication of trnM also supports the rooting of Occidozyga martensii as the sister group of Dicroglossidae (Figure 3).

In highly rearranged lineages, convergent and parallel gene rearrangements happened frequently in non-sister lineages. Thus, genomic features require careful consideration when being employed for phylogenetic inference. Gene rearrangements vary in their phylogenetic distributions and rates among lineages [4]. In neobatrachians, convergent gene rearrangements occur. For example, a single origin cannot be invoked to explain the distribution of trnM in both Mantellinae and Dicroglossidae; the duplications arose independently (Figure 3) and the positions and residues of duplicated fragments differ [3, 28, 35]. A similar pattern involving the translocation of nad5 was observed in the Rhacophoridae and Fejervarya [33, 56]. Parallel rearrangements also occur in gene rearrangement hotspots (e.g., WANCY and CR) [3].

Correlation between codon usage and tRNA gene position

There are no significant correlations between codon usage and the location of tRNA in most of the AGOs, and no increased correlations in recurrent rearranged neobatrachians compared to basal neobatrachians (Additional file 3). Among the analyzed species, 14 have the AGOs of other vertebrates and 30 RGOs occur in Leiopelma archeyi and the neobatrachians (Additional file 2). Codon usage and tRNA position in the AGOs are very weakly correlated (Figure 5a; Pearson’s correlation coefficient r = -0.005, two-tailed p = 0.93, n = 308; Spearman’s rho correlation coefficient r = -0.118, two-tailed p = 0.039, n = 308). In contrast, the two variables show a significant correlation in neobatrachian RGOs (Figure 5b; Pearson’s correlation r = -0.551, two-tailed p < 0.001, n = 625; Spearman’s rho coefficient nonparametric correlation coefficient r = -0.607, two-tailed p < 0.001, n = 625). However, codon usage and tRNA positions are not significantly correlated in L. archeyi, which is an archaeobatrachian with an independent RGO (Figure 5c; Pearson’s correlation coefficient r = 0.001, two-tailed p = 0.998, n = 22; Spearman’s rho correlation coefficient r = -0.06, two-tailed p = 0.789, n = 22). Correlations between codon usage and tRNA do not increase in neobatrachians upon adding rearrangements relative to basal neobatrachians (Additional file 3). As seen in Glandirana and Limnonectes bannaensis, the loss of tRNA genes does not effect this correlation.

Figure 5
figure 5

A linear regression plot between position of tRNA genes and usage of the corresponding codon. (a) 15 species with evolutionarily stable gene order; (b) 28 slightly rearranged gene orders in neobatrachians; (c) Leiopelma archeyi. Data points for the tRNA genes that specify hydrophobic and hydrophilic amino acids are colored magenta and green, respectively. The regression lines were derived from the all data points in each plot.

Codon usages and tRNA positions were thought to be significantly correlated in typical vertebrates [14]. The tRNA genes of these amino acids occur near the CR where transcription efficiency is high [13]. In mitogenomes, the 13 protein-coding genes associate with transmembrane proteins that are rich in hydrophobic amino acid residues. The tRNA genes that specify hydrophobic amino acid locate much closer to the CR than do hydrophilic amino acids. Consequently, some form(s) of adaptive selection might maintain novel mt gene orders and the rearrangement of genes with higher codon usage to regions near the CR to enhance transcription efficiency [14]. However, our analyses fail to detect a significant correlation in many archaeobatrachians possessing AGOs as well as in Leiopelma archeyi (Additional file 3). In comparison, significant correlations occur between codon usage and position of tRNAs in all neobatrachians sharing rearrangement LTPF. However, this does not infer a recent adaptation in the lineages with RGOs. Although rearrangement LTPF dramatically improves correlations, all other recurrent rearrangements do not do so. These recurrent rearrangements marginally or insignificantly improve correlations in neobatrachians (Additional file 3). Thus, highly-used tRNA genes in recent, re-appearing RGOs do not appear to be located closer to the CR than typical RGOs.

Codon usage bias and tRNA gene number

Analyses of the relative synonymous codon usage (RSCU) in 14 anurans mitochondrial genomes yield insights into the relationship between changes in the number of tRNA genes (e.g., duplication/loss) and codon usage bias. The results did not indicate a difference in codon usage bias in duplicated or lost tRNA genes (Additional file 4). Gene trnM was duplicated in Dicroglossidae and Mantella madagascariensis yet their codon usage does not differ from those of other anurans (Chi-square test, p >0.05). Three species of Glandirana that lost trnS (AGY) did not differ in codon usage from closely related species (p >0.05). The same situation occurs in Limnonectes bannaensis in which tRNAs trnA, trnN, trnC, and trnE are absent [34].

The change in tRNA copy numbers could have accompanied gene rearrangement, as explained by the tandem duplication-random loss (TDRS) model [39, 58]. Directional mutation pressure on each strand of DNA and translational selection are major factors in codon usage bias [59]. Codon usage and tRNA abundance are tightly correlated in Escherichia coli and Saccharomyces cerevisiae [19]. This implies that tRNA gene copy numbers can evolve to match codon usage. Under translational selection, codon usage and tRNA gene content may have evolved to alternative stable states, where selection favors codons that are most rapidly translated by the current tRNAs [17]. However, tRNA gene duplications and losses do not show significant changes in codon bias among closely related anurans. The long period of accumulating mutations and the compactness of mtDNA might preclude such changes. The importation of nuclear tRNAs may also explain the absence of a correlation. Protein synthesis requires the replacement of tRNAs. In most organisms, mitochondrial biogenesis requires the importation of both large number of proteins and at least a few cytosolic tRNAs [20, 60]. Cytosolic tRNAs replace lost mt tRNA as well as reduce the influence of mt tRNA gene duplications or rearrangements. This alone may explain the absence of correlation between tRNA position and codon usage. Even though the importation of cytosolic tRNAs may compensate for missing mt tRNA genes, most vertebrates have a complete set of mt tRNA genes [20, 61]. The tRNA genes absent in Glandirana also occur rarely in other vertebrates, though they have been found in the wallaroo (Macropus robustus) [62], Chinese big-headed turtle Platysternon megacephalum [63] and large-headed frog Limnonectes bannaensis [34].

Nucleotide substitution and mitogenomic reorganization

Neobatrachian mt genomes have higher nucleotide substitution rates than archaeobatrachians [23, 37, 64]. To check the differences in the rates of nucleotide divergence in neobatrachians, we calculated average synonymous (dS) and nonsynonymous (dN) substitutions using a maximum likelihood tree for each of the 13 mitochondrial protein-coding genes as well as six congeneric pairwise comparisons among our anurans. Congeneric comparisons comprised two AGOs (Bombina maxima and B. bombina; and Xenopus laevis, and X. tropicalis) and four RGOs (Glandirana emeljanovi and G. rugosa; Limnonectes fujianensis and L. bannaensis; Pelophylax nigromaculata and P. chosenica; and Bufo japonicas, and B. gargarizans) (Figure 6). All neobatrachian mitochondrial genes have a significantly elevated dN (p < 0.01) and dS (p < 0.01).

Figure 6
figure 6

Six pairwise comparison of sequence divergence for each mitochondrial gene. Species-pairs are as follows: Bombina maxima and B. bombina; Xenopus laevis and X. tropicalis; Glandirana emeljanovi and G. rugosa; Limnonectes fujianensis and L. bannaensis; Pelophylax nigromaculata and P. chosenica; and Bufo japonicas and B. gargarizans.

Nonsynonymous/synonymous substitution rates (dN/dS = ω) for each gene can test whether or not mitogenomic reorganization associates with an overall relaxation of selective pressures on mt function. In all comparisons, dN/dS differs significantly from a null ratio of 1 (Figure 6; range 0.000–0.245). The mean dN/dS does not differ significantly between the AGOs and RGOs (p >0.05). ML estimates of ω indicate that purifying selection acts on each protein-coding gene; no significant difference (p >0.05) occurs in the fitting of models M1a (nearly neutral) and M2a (positive selection) (Additional file 5). Estimates of ω under the two-ratio model for the Anura are not higher than its null model and only cob differs significantly by LRT (Table 2, Additional file 5). Values of ω are significantly less than 1 for all 13 protein-coding genes in the AGO and RGOs (Figure 6, Table 2). This result supports the presence of strong purifying selection in anurans [23, 37], which strongly preserves mt gene function [65, 66]. However, ratios of ω for all genes, except cob, nad3, nad4L, and nad6, are significantly higher in neobatrachians when compared to non-neobatrachians (Table 2). Similarly, independent values of ω inferred for the stem branch of Natatanura are generally higher than the respective null model and evaluations of most individual genes obtain significant results (Table 2, Additional file 5). Corresponding to previous studies [37], the elevated ω in the branch leading to the Neobatrachia indicates the possible relaxation of purifying selection on protein-coding genes (Table 2).

Table 2 Results from branch models that assume branch-specific changes in the selection coefficient (ω) in Neobatrachia, based on single-gene alignments and the all-mt gene datasets

A positive correlation occurs between the rates of mt gene rearrangement and mt nucleotide diversity in some lineages of invertebrates [21, 67, 68]. Accordingly, Shao et al. [67] hypothesized that an increase in substitution rates can drive elevated rates of gene rearrangement. A high substitution rate leads to an increase in mutation at the sites of initiation and termination of the mt genome replication. These mutations may cause errors during replication of mt genomes and then cause gene rearrangements through gene duplications and deletions [15, 38]. Considering the TDRL model, a high rate of nucleotide substitution leads to an increase in the rate of mt gene rearrangement. High rates of nucleotide substitution also tend to occur in modified genomes of salamanders [69]. However, fast nucleotide substitution rates are not required to increase the propensity of mitogenomic rearrangements [23, 37]. Our results show that neobatrachian mt genes have a significantly elevated dN (p < 0.01) and dS (p < 0.01). Irisarri et al. [23] proposed that an accelerated rate of mt substitution rate was a result of the relaxation of purifying selection on protein-coding genes. In neobatrachians, some clades, such as Rhacophoridae, Mantellinae, and Dicroglossidae (Figure 3), have an increase in the rate of mt gene rearrangement. However, no significant difference in substitution rate occurs between non-rearranged and rearranged ranoids [37]. Thus, the high diversity of recurrent rearrangements in neobatrachians appears to be driven by variables beyond a high rate of nucleotide substitution, for example, life histories and genetic drift—both of which may increase the chance that rearranged mt genomes survive and drive fixation [67]. Considering rearrangements in the archaeobatrachians Leiopelma archeyi and Leptolalax pelodytoides [24, 27], accelerated substitution rates may be lineage-specific features, rather than accompanying gene rearrangement. No differences in rate of change occur between lineages with gene duplications and their non-duplicated counterparts [37]. Our results also failed to detect significant differences in rates when tRNA genes are lost (e.g., Glandirana branch, LRT p >0.05). Overall, these results indicate that high substitution rates and the relaxation of purifying selection can increase the chance of mitogenomic reorganization in neobatrachians, but they are not required.

Selective pressure on mitogenomic reorganization

Selection on reorganized mitogenomes is less well studied compared to selection on mt protein-coding genes [2, 14]. Although a positive correlation might occur between codon usage and tRNA gene positions [14], genomic reorganization does not improve this relationship. Adaptive selection does not appear to act on tRNA gene positions after genomic rearrangement. Further, the widespread importation of cytosolic tRNA into mitochondria [20] may preclude a codon usage bias owing to tRNA gene duplications or losses. The nonadaptive evolution of gene order suggests that random loss follows gene duplication. Hotspots of gene rearrangement (eg. WANCY) in Amolops and Odorrana (Figure 3) have similar portions of duplicated genes yet the positions of silenced genes vary. These findings support the TDRL model of rearrangement in many neobatrachians [22, 29, 56].

Nucleotide substitution rates and ω are higher in neobatrachian mt genomes than in archaeobatrachians and yet no significant correlation exists between the rates of mitogenomic reorganization and mt nucleotide diversity [23, 37]. Values of ω strongly indicate that purifying selection likely contributes to the maintenance of metabolic function in anurans and trans-membrane protein functions likely constrain nonsynonymous mutations [23]. Some functional constraints on the mitogenomic organization of rRNA and tRNA genes exist (e.g., 12S and 16S rRNAs generally locate near the CR due to requiring high transcriptional rates [13]; secondary structures of tRNA genes involve punctuation models or termination signals [70]). Functional constraints do not necessarily favor one gene rearrangement over another. The fixation of large-scale mitogenomic reorganization largely involves two nonadaptive fores: random genetic drift and mutation pressure [11, 71]. Further studies on the molecular and demographic mechanisms of this hypothesis may yield new insights into the evolution of mitogenomic reorganization.

Methods

We sequenced the mt genome of Glandirana rugosa, G. emeljanovi, and G. tientaiensis (Table 1). We also sequenced four mtDNA fragments including one conserved sequence [12S–16S rRNA] and three hotspots of rearrangements in 20 ranid species across the genera Amolops, Babina, Hylarana, Odorrana and Rana (Table 1). Specimens were stored in 95% ethanol at -20°C in the Chengdu Institute of Biology. All work with animals was conducted according to relevant national and international guidelines. All animal care and experimental procedures were approved by the Chengdu Institute of Biology Animal Care and Use Committee.

Total genomic DNA was extracted from muscle tissue through SDS-proteinase K/phenol-chloroform protocols [72]. Using complete mtDNA information available for Ranidae, we designed 13 pairs of primers (Additional file 6) that amplified overlap** fragments spanning the entire mt genome. From our 20 species, we determined three hotspot fragments (nad2cox1, nad3nad5, and nad5cob) and partial fragments of 12S and 16S rRNA genes using primers described in Kurabayashi [73]. We amplified these fragments using a combination of normal polymerase chain reaction (PCR) and long-and-accurate (LA) PCR methods; normal PCR was carried out in a 25 μL mixture containing 0.5–1.0 μL of template DNA, 2.5 μL 10 × PCR buffer, 0.5 μL of each primer (10 pm/μL), 2 μL MgCl2, 2 μL dNTPs and 0.3 μL of Extaq DNA polymerase (TaKaRa Bio, Dalian, China) with reaction conditions as follows: 4 min at 95°C, followed by 33 cycles of 40 s at 94°C, 45 s at 48–58°C, 1.5 min at 72°C and a final extension of 8 min at 72°C. LA-PCR was carried out in a 25 μL mixture containing 0.5–1.0 μL of template DNA, 12.5 μL 2 × PCR buffer, 0.5 μL of each primer (10 pm/μL), 4 μL dNTPs and 0.5 μL of KOD FX DNA polymerase (TOYOBO, Osaka, Japan). PCR conditions consisted of 3 min at 98°C, followed by 33 cycles of 50 s at 94°C, 60 s at 50–58°C, 4 min at 72°C and a final extension for 10 min at 72°C. We sub-cloned PCR fragments of the CR containing long tandem repeats using a pMD™ 18-T Vector (TaKaRa Bio, Dalian, China) and sequenced this fragment by additional walking primers (Additional file 6). DNA sequencing was performed on an ABI 3730 automatic DNA sequencer.

Genome annotation

We extracted protein-coding sequences from each mt genome and identified mt tRNA genes using tRNAscan-SE v.1.21 (http://lowelab.ucsc.edu/tRNAscan-SE) [74]. We excluded incorrect annotations by comparing original annotations from GenBank with the vertebrate mt genetic code. We identified trnS (AGY) by visually inspecting unassigned regions for sequences with similarity and assigning them to previously identified mt tRNA isotypes. We determined the locations of the 13 protein-coding and 2 rRNA genes through comparisons with homologous sequences in other anurans [22, 29, 30].

We downloaded 40 complete and 3 partial anuran mt genomes from GenBank (Additional file 2). We used taxonomic names prior to Frost [75] to test their hypothesis and referred to NCBI Organelle Genome Resource and MitoZoa (http://www.caspur.it/mitozoa) [76] to determine genome organizations for each species (Additional file 2).

Measuring codon usage and the position of each tRNA gene

We chose 15 archaeobatrachian species plus 29 neobatrachian species to examine variation in codon usage and gene arrangement. We determined correlations between physical (base pair [bp]) distances of each tRNA gene from the CR (from the 5’ end of the tRNA gene to the 3’ end of the CR) and codon usage of the 13 protein-coding genes (where overlap** codons were considered once for each gene) following Satoh et al. [14]. For mt genomes with two CR or tRNA copies, we analyzed the first copy.

To evaluate the codon usage bias in a single codon family, we calculated relative synonymous codon usage (RSCU) values of every codon in each sample. RSCU values were obtained by using DAMBE [77] and MEGA5 [78]. We evaluated trnL and trnS codons, which had two groups of codons (CUN and UUR, and AGY and UCN, respectively), as two synonymous codon families [79].

Genetic divergence, molecular evolution and phylogenetics

We aligned sequences of the 13 mt protein-coding genes using ClustalW in MEGA5 [78]. To avoid artificial alignment errors, we used Gblocks v.0.19b [80] with the following settings to remove ambiguous alignments: 26 minimum conserved positions, 42 minimum flanking positions, 8 maximum non-conserved positions, and a minimum length of a block of 5 while allowing gap positions in half.

We used DnaSP 5.10 [81] to determine DNA polymorphism and divergence, including the number of synonymous substitutions per synonymous site (dS) and nonsynonymous substitutions per nonsynonymous site (dN). Maximum likelihood (ML) estimates of nonsynonymous/synonymous substitution rates (ω) were obtained with CODEML in the PAML4.4 package [82]. The nearly neutral (M1a) and positive selection (M2a) models were compared using a likelihood ratio test (LRT). To detect if the 13 protein-coding genes of neobatrachian species experienced divergent patterns of selection compared to non-neobatrachian species and other amphibians, we applied a two-ratio model test to the each-gene and all-gene datasets. Two separate two-ratio models were fitted: one ratio was assigned to the interested branch (Anura, Neobatrachia, Natatanura) and the other was assigned to the remaining lineages. These two-ratio models were also compared against the null model (one ratio model) by LRT.

To further confirm the phylogenetic relationships among anurans, 46 Anura and four outgroup taxa (two Caudata [GenBank: NC_008085, NC_004926] and two Gymnophiona [GenBank: NC_006301, NC_006302]) were included in this analyses. We constructed the phylogenies using the concatenated 13 mt protein-coding genes and partitioned these genes by codon position. The best fitted substitution model for each partition was estimated using Akaike information criterion (AIC) implemented in jModeltest [83]. The model of GTR + I + G was chosen for ML and Bayesian inference (BI) analyses, which were performed with RAxML BlackBox web-servers (http://phylobench.vital-it.ch/raxml-bb/index.php) [84] and MrBayes 3.1 [85], respectively. BI analyses used the following settings: 10 million Markov chain Monte Carlo (MCMC) generations, a sampling frequency of 1000, and calculating a majority rule consensus tree after omitting the first 25% trees as burn-in.

To provide further insight into the phylogenetic relationships of species with gene rearrangments, we conducted additional ML and BI analyses using 12S and 16S rRNA sequence data from 65 taxa (58 ranids from nine genera, three dicroglossids, one rhacophorid, one mantellid, and two microhylids; Figure 4). Alignment of these sequences was verified using secondary structure [86]. We used the same procedures for tree reconstruction as described above for the 13 mt protein-coding genes.

Availability of supporting data

Organization and features of mitochondrial genome in three species of Glandirana are available in Additional file 1. Detailed information of 46 anurans mitochondrial genomes used in this study is given in Additional file 2. The correlations between the codon usage and tRNA positions of each 44 anuran species are available in Additional file 3. Relative synonymous codon usage (RSCU) values in 13 protein-coding genes of anurans mitochondrial genomes are given in Additional file 4. Positive selection detection of the mt protein genes among different branches are presented in Additional file 5. Primers designed for amplifying the complete mitochondrial genome of Glandirana are listed in Additional file 6.