Introduction

Doridina (~ 2,000 species) and its sister group Cladobranchia (~ 1,000 species) are two suborders of nudibranch mollusks1,2. The dorid nudibranchs are a diverse group of marine mollusks found worldwide that play an important role in the marine ecosystem. Dorid species are carnivorous; they feed mainly on sedentary invertebrates such as sponges, cnidarians, tunicates, and bryozoans3. To deter their own predators, many dorid species synthesize unpleasant or toxic compounds from their food3. This ability makes dorid nudibranchs potentially interesting subjects in the search for chemical compounds with pharmaceutical relevance4. Thus, several species have been used in pharmaceutical science and developmental studies. Cadlina luteomarginata has been of particular interest in biochemical investigations, whereas other nudibranch species, such as Aldisa andersoni, Aldisa cooperi, Cadlina pellucida, Cadlina laevis, Doriprismatica atromarginata, and Jorunna funebris, have become important subjects in bioactive substance studies3,4,5,6,7. However, the first step in the practical application of dorid nudibranch compounds is the elucidation of the group’s taxonomy and phylogenetic relationships.

Despite their importance in marine ecology and pharmaceutical studies, the interfamily relationships of dorid nudibranchs have long been disputed8. Previously, only the morphological characteristics of dorid families, such as the rhinophores, mantle, gill, gill cavity, and radula, were used for classification; however, molecular markers are now used to study dorid nudibranch phylogeny9,10. In such studies, a single marker or a combination of several short markers is usually used. Although some studies have been conducted to determine relationships within a genus or family, only a few studies have dealt with the higher-level groups in Doridina and Cladobranchia2,11. Moreover, it has been difficult to resolve phylogenies containing these higher-level groups using short markers. To date, few attempts have been made to study the interfamily relationships of dorid nudibranchs using cladistic methods. Notable research on the families within the suborder Doridina has been published by Hallas et al.11 and Korshunova et al.8. Nevertheless, interfamily relationships remain poorly understood because of conflicting phylogenies, tree polytomy, and inadequate sampling11. For example, controversy surrounds the relationships between Discodoridae + Dorididae and Goniodorididae + Aegiridae. Conventionally, Discodoridae was considered to have a close relationship with Dorididae, whereas Goniodorididae was believed to have a close relationship with Aegiridae12. Nevertheless, recent molecular analyses have shown that Discodoridae is a sister group to Goniodorididae, whereas Aegiridae is a sister group to Dorididae8 or has an unstable position (depending on the analysis method used)11. To improve the systematics of dorid nudibranchs, phylogenetic relationships must be explicitly determined, and the effective use of DNA sequences to elucidate phylogenies is one potential strategy.
Another issue is the phylogenetic classification of Cadlinidae, which has long been controversial. Traditionally, Cadlinidae has been considered a member of Chromodorididae; however, recent taxonomic evaluation indicated that Cadlinidae is an independent family that is separate from Chromodorididae8,13.

The mitochondrial genome is a powerful molecular marker used to explore phylogenetic relationships, and it has been applied to reveal the molecular evolution of mollusks14,15,16. The typical mitogenome of mollusks contains 13 protein-coding genes, 22 transfer RNA (tRNA) genes, and two ribosomal RNA (rRNA) genes14. Given their importance in systematics and phylogenetic reconstruction, the mitogenomes of nudibranchs are now being characterized; however, mitogenomes have been sequenced and analyzed for only a small fraction of the more than 2,000 dorid nudibranch species. Molecular phylogenetic analyses and taxon-sampling schemes are known to be effective tools in phylogenetic research13. Previously, the phylogenetic position of nudibranchs has been studied on the basis of partial cox1, 16S rRNA, 18S rRNA, and 28S rRNA sequences. Nevertheless, the complete mitogenome provides better phylogenetic resolution and accuracy than single genetic markers17.

This study aimed to analyze the mitogenome structure of dorid nudibranchs and use mitogenomes as molecular markers to investigate the interfamily relationships of this group. To achieve this aim, the mitogenomes of different dorid nudibranchs were decoded and analyzed. The structure of dorid nudibranch mitogenomes was examined and compared with the mitogenome sequences already available in public databases. Additionally, phylogenetic trees showing interfamily relationships were determined on the basis of the examined dorid nudibranch mitogenomes.

Results

General mitogenome features

Eight complete mitogenomes of dorid nudibranchs were sequenced in the present study, including those of Aldisa cooperi, Cadlina japonica, Cadlina koreana, Cadlina umiushi, Carminodoris armata, Doris odhneri, Triopha modesta, and Verconia nivalis. Mitogenome lengths ranged from 14,397 bp (T. modesta) to 14,982 bp (C. japonica) (Tables S1–8; Figs. S1–8). All eight mitogenomes had negative AT skew values (from − 0.167 in V. nivalis to − 0.089 in D. odhneri) and positive GC skew values (from 0.008 in D. odhneri to 0.152 in C. umiushi), suggesting a bias for T and G nucleotides (Table S9). Generally, mitogenomes contained 13 protein-coding genes (PCGs), two rRNA genes, and 22 tRNA genes. The mitogenomes of most species comprised 37 genes; however, that of C. japonica contained 38 genes due to a duplication of tRNAIle. Of the 13 PCGs in each mitogenome, nine (cox1, cox2, cytb, nd1, nd2, nd4, nd4l, nd5, and nd6) were encoded by the H-strand and four (atp6, atp8, cox3, and nd3) were encoded by the L-strand (Tables S1–8). In terms of start codons, ATN codons were the most frequent, with ATG the most commonly used; ATA, GTG, and TTG were also used for initiation. TAA was the most common termination codon; TAG and the incomplete stop codon T– were also used to terminate several genes. Codon usage and relative synonymous codon usage (RSCU) are indicated in Table S10. For all mitogenomes, the amino acids most frequently found in PCGs were leucine followed by serine; by contrast, glutamine, arginine, and cysteine were the least common amino acids. The RSCU values of the 13 PCGs in the eight examined mitogenomes showed a bias toward A- and T-rich codons, such as UUA-Leu, AUU-Ile, UUU-Phe, and AUA-Met (Table S10).
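
To make the RSCU measure reported above concrete, the following is a minimal Python sketch of how such values can be computed for a coding sequence. It is not the MEGA X implementation used for Table S10. The function name `rscu` and the toy sequence are hypothetical, and two choices are assumptions: codons are grouped using NCBI translation table 5 (invertebrate mitochondrial), and all synonymous codons of one amino acid are treated as a single family (so Leu has six codons and Ser has eight).

```python
from collections import Counter
from itertools import product

# NCBI translation table 5 (invertebrate mitochondrial); codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds: str) -> dict[str, float]:
    """Return RSCU per codon for an in-frame coding sequence.

    RSCU = observed codon count / mean count of its synonymous codons.
    Amino acids absent from the sequence and stop codons are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families (assumption: one family per amino acid).
    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values: dict[str, float] = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy input: four leucine codons, three TTA and one TTG.
toy = rscu("TTATTATTATTG")
print(toy["TTA"], toy["TTG"])  # 4.5 1.5
```

An RSCU above 1 means a codon is used more often than expected under equal usage of its synonymous codons, which is the sense in which the A- and T-rich codons above are described as favored.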

The mitogenomes of most dorid species contained 22 tRNA genes; the C. japonica mitogenome carried 23 owing to a tRNAIle duplication. Similar to other nudibranchs, different anticodons were observed for tRNALeu and tRNASer. Two rRNA genes were detected in the mitogenomes of dorid nudibranchs. The large (16S rRNA) and small (12S rRNA) rRNAs were encoded by the H-strand and L-strand, respectively. Overall, the intergenic regions in the eight dorid nudibranch mitogenomes were short, and their sizes varied among species. The longest noncoding region was located between tRNAHis and tRNACys in Cadlinidae species (324 bp in C. japonica). The overlapping regions were also short and variable among species; nevertheless, the longest overlapping region was always located between the nd5 and nd1 genes.

The gene contents and order were similar to those in typically structured Nudibranchia mitogenomes. Gene direction was similar across the examined mitogenomes, although the direction of tRNACys was inverted in species from Cadlinidae, i.e., A. cooperi, C. japonica, C. koreana, and C. umiushi, compared with its direction in other nudibranchs (Fig. 1). In the four Cadlinidae species, tRNACys was encoded by the L-strand; in the other nudibranchs, it was encoded by the H-strand. Overall, gene order within Doridina mitogenomes is highly conserved and identical to the typical arrangement pattern in Nudibranchia. An exception was observed in the mitogenomes of Hypselodoris, which show a translocation of the second tRNASer (GCU) and nd418,22. Fragments of cox1 from the same species were used as bait for assembly. Mitogenome sequences were annotated on the MITOS web server using the invertebrate genetic code23.

PCGs and rRNA genes were aligned with homologous genes from other nudibranchs and confirmed using BLAST searches in GenBank. Additionally, tRNA structures were predicted and identified using the MITOS web server23 and ARWEN24. Circular maps of complete mitogenomes were generated and annotated using Geneious v9.125. Skewness was assessed using the following formulas: AT skew = [A − T] / [A + T]; GC skew = [G − C] / [G + C]26. RSCU values were calculated using MEGA X to evaluate the level of nucleotide bias in each codon27.
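
The skew formulas above translate directly into code. The sketch below is illustrative only; the function name `nucleotide_skews` and the toy sequence are hypothetical, and the function simply counts bases on the strand it is given, ignoring ambiguous characters such as N.

```python
from collections import Counter


def nucleotide_skews(seq: str) -> tuple[float, float]:
    """Return (AT skew, GC skew) for one strand of a DNA sequence.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    if a + t == 0 or g + c == 0:
        raise ValueError("sequence must contain A/T and G/C bases")
    return (a - t) / (a + t), (g - c) / (g + c)


# Toy sequence with A=1, T=5, G=3, C=1.
at_skew, gc_skew = nucleotide_skews("ATTTGGGCTT")
print(round(at_skew, 3), gc_skew)  # -0.667 0.5
```

A negative AT skew together with a positive GC skew, as in this toy case, corresponds to an excess of T over A and of G over C on the analyzed strand.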

Phylogenetic analysis

The mitogenomes generated in this study and those obtained from GenBank for other nudibranchs were used for phylogenetic analyses (Table 1). Two species of the suborder Cladobranchia, Dermatobranchus otome and Tritonia tetraqueta, were used as outgroups. Both amino acid and nucleotide sequences were used to construct phylogenetic trees. Because the nd4l gene of Notodoris gardineri was only about half as long as that of other species, this gene was excluded from the analyses. For both amino acid and nucleotide sequences, each sequence was extracted and aligned using MAFFT v740 in Geneious v9.125. To check the impact of the variable regions of mitogenomes on phylogenetic trees, two alignment schemes were used11. In the first scheme, following alignment, sequences were directly concatenated without the use of Gblocks. In the second scheme, Gblocks v0.91b was used to remove poorly aligned regions41. Four datasets were used to build the phylogenetic trees: nucleotide sequences of 12 PCGs + 2 rRNAs + 22 tRNAs, nucleotide sequences of 12 PCGs, 1st and 2nd codon positions of 12 PCGs, and amino acid sequences of 12 PCGs. For the nucleotide and amino acid sequences of the 12 PCGs, each of the 12 sequences was assigned to a separate partition. The sequences of each mitogenome were concatenated using Geneious v9.125. The best partition scheme and the best-fit model were determined using Partition Finder 242. For the dataset with 36 nucleotide sequences, each of the 12 PCGs and two rRNA genes was assigned to a separate partition, and the 22 tRNA genes were combined into a single partition. In addition, to test the impact of the outgroup on phylogenetic reliability, a different outgroup comprising three species of the suborder Cladobranchia, Melibe leonina, Protaeolidiella atra, and Sakuraeolis japonica, was used to compare BI and UFboot values among trees. Data for these phylogenetic analyses were prepared as described above with the use of Gblocks41.
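
Two of the dataset-preparation steps described above, keeping only 1st and 2nd codon positions and concatenating per-gene alignments with one partition per gene, can be sketched in Python as follows. This is an illustration under stated assumptions, not the Geneious workflow used in the study: the function names, taxon labels, and toy sequences are hypothetical, and each alignment is assumed to be in frame (column 1 is a first codon position and gaps do not shift the frame).

```python
def first_second_positions(aligned_cds: str) -> str:
    """Drop every third column of a codon-aligned sequence."""
    return "".join(base for i, base in enumerate(aligned_cds) if i % 3 != 2)


def concatenate(
    alignments: dict[str, dict[str, str]],
) -> tuple[dict[str, str], dict[str, tuple[int, int]]]:
    """Concatenate per-gene alignments into one supermatrix.

    Returns the concatenated sequence per taxon and the 1-based,
    inclusive column range of each gene (one partition per gene).
    Taxa missing from a gene are padded with gaps.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces: dict[str, list[str]] = {taxon: [] for taxon in taxa}
    partitions: dict[str, tuple[int, int]] = {}
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions[gene] = (start, start + length - 1)
        start += length
    return {taxon: "".join(p) for taxon, p in pieces.items()}, partitions


# Toy alignments (hypothetical taxa and sequences).
genes = {
    "cox1": {"sp_A": "ATGGCA", "sp_B": "ATGGCT"},
    "cox2": {"sp_A": "ATGAAA", "sp_B": "ATGAAG"},
}
reduced = {
    gene: {taxon: first_second_positions(seq) for taxon, seq in aln.items()}
    for gene, aln in genes.items()
}
matrix, parts = concatenate(reduced)
print(matrix["sp_A"])  # ATGCATAA
print(parts)           # {'cox1': (1, 4), 'cox2': (5, 8)}
```

The returned column ranges are the kind of per-gene blocks that a partitioning tool would take as its starting scheme before merging partitions and selecting models.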

Table 1 Complete mitogenomes used in this study. Species names and systematics follow the World Register of Marine Species (http://www.marinespecies.org).

The phylogenetic trees were constructed with Maximum Likelihood (ML) and Bayesian Inference (BI) methods. ML trees were searched using IQ-tree v2.1.2 with 1,000 bootstrap replicates43. BI trees were searched using MrBayes v3.2.7 with four chains and 20,000,000 and 3,000,000 generations for the nucleotide and amino acid datasets, respectively44. Sampling was performed every 100 generations, and the first 25% of sampled trees were discarded as burn-in. Each run was checked for proper mixing and convergence on the basis of ESS values of > 200 in Tracer v1.745. The maximum clade credibility tree was visualized using FigTree v1.4.446.
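
As a rough worked example of what these MCMC settings imply per run, the Python sketch below counts sampled and retained trees. It rests on two assumptions that the text does not state: the starting state (generation 0) is also sampled, and the burn-in count is the sample total multiplied by the burn-in fraction, rounded down. The function name `retained_samples` is hypothetical.

```python
def retained_samples(
    generations: int, sample_every: int, burnin_fraction: float
) -> tuple[int, int, int]:
    """Return (total, discarded, retained) sampled trees for one run."""
    total = generations // sample_every + 1   # assumption: generation 0 is sampled too
    discarded = int(total * burnin_fraction)  # assumption: rounded down
    return total, discarded, total - discarded


# Settings taken from the text: sampling every 100 generations, 25% burn-in.
print(retained_samples(20_000_000, 100, 0.25))  # (200001, 50000, 150001)
print(retained_samples(3_000_000, 100, 0.25))   # (30001, 7500, 22501)
```

Under these assumptions, roughly 150,000 trees per run remain for the nucleotide dataset and roughly 22,500 for the amino acid dataset.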

A tree topology test was performed using the mitogenome sequences and IQ-tree v2.1.2, to compare the interfamily relationships of dorid nudibranchs found in the present study with those found in previous reports43. The tree topology from the present study was tested against a constrained tree in which Aegiridae was a sister group to Dorididae and Goniodorididae was a sister group to Discodoridae. P-values for approximately unbiased tests were obtained from IQ-tree v2.1.2 using 20,000 bootstrap replicates43.