Introduction

Small RNAs are noncoding RNAs of about 20−24 nucleotides in length. They play vital roles in multiple developmental and physiological processes in various organisms through sequence-specific regulation of target genes at the transcriptional or post-transcriptional level1. Based on their biogenesis pathways, plant small RNAs can be classified into two major classes, microRNAs (miRNAs) and small interfering RNAs (siRNAs). SiRNAs are a large class of small RNAs with four subclasses: heterochromatic siRNAs (hc-siRNAs), trans-acting siRNAs (ta-siRNAs), natural antisense transcript-derived siRNAs (nat-siRNAs) and long siRNAs (lsiRNAs)2. MiRNAs are produced from transcripts with internal stem-loop structures3, whereas plant siRNAs are derived from inverted repeat sequences, dsRNAs copied from single-stranded RNAs (ssRNAs), overlapping regions of bidirectional transcripts, or dsRNAs formed by virus replication4. Plant small RNAs regulate gene expression by loading into RNA-induced silencing complexes (RISCs) and then interacting with homologous RNA or DNA molecules for direct RNA cleavage, translational repression, or DNA methylation. The biogenesis and function of plant small RNAs involve various families of proteins, such as Dicer-likes (DCLs), HYPONASTIC LEAVES1 (HYL1), the C2H2 Zn-finger protein SERRATE (SE), HEN1, HASTY, RNA-dependent RNA polymerases (RDRs) and Argonautes (AGOs), of which DCLs are the core components for small RNA biogenesis5,6.

DCLs are multidomain ribonucleases characterized by six domains: DExD-helicase (DExDc), helicase-C (HELICc), DUF283, PAZ, RIBOc and the double-stranded RNA-binding (dsRB) domain7. DExDc and HELICc, located at the N- and C-termini of the helicase region, respectively, are involved in ATP-dependent RNA or DNA unwinding. The ATP-binding site is located in the DExDc domain. PAZ binds single-stranded RNAs with two-base 3′-overhangs8. The RIBOc domain, known as the ribonuclease III C-terminal domain, is involved in dsRNA cleavage9, whereas dsRB mediates the discrimination of different RNA substrates and subsequent incorporation of effector complexes7. The function of DUF283 is currently unknown.

DCLs are usually encoded by a multigene family in plants. The number of DCL genes varies among plant species. For instance, there are four in Arabidopsis10, five in poplar, maize and sorghum11, seven in tomato12 and eight in rice13. Among them, the Arabidopsis DCLs (AtDCLs) are well studied. Each of the four AtDCLs is primarily associated with the biogenesis of specific small RNA species, but they may play redundant and hierarchical roles in the production of various sRNAs14. AtDCL1 is a core component for miRNA biogenesis, whereas AtDCL2, AtDCL3 and AtDCL4 are mainly involved in the derivation of siRNAs15. AtDCL2 generates 22 nt siRNAs from endogenous inverted repeats, integrated viruses and transgenes and plays significant roles in virus resistance and transitive silencing of transgenes14,16. AtDCL3 is responsible for the derivation of heterochromatic siRNAs, mostly from repetitive DNA loci. These siRNAs are about 24 nt in length and mediate the establishment and maintenance of heterochromatin states through RNA-directed DNA methylation and histone modification48. Similarly, no miR162 was found in about 94 million sequence reads from juvenile and adult shoots, ripe and unripe fruits and leaves of O. europaea49,50. This indicates that the miR162-mediated feedback regulation of DCL1 seems to be absent from R. glutinosa and O. europaea. The regulation of SiDCL1 remains to be elucidated.

Discussion

Although DCLs have been identified from various plant species, functional characterization of DCLs is limited to a few plants, such as Arabidopsis and rice18,19,20. The identification and molecular cloning of five SmDCLs provide a basis for elucidating the function of SmDCLs and for understanding the biogenesis pathways and functions of small RNAs in S. miltiorrhiza, an emerging model plant with high medicinal value26. The five SmDCLs cluster into four clades with Arabidopsis and rice DCLs (see Supplementary Fig. S1 online), indicating the existence of four types of DCLs with distinct functions in S. miltiorrhiza, as is the case in Arabidopsis and rice7,13,51. Conservation of sequence features, gene structures and functional domains implies that the function of each SmDCL could be similar to that of its Arabidopsis and rice counterparts in the same clade. However, it is interesting that two SmDCLs were found, for the first time, in the DCL4 clade. SmDCL4a and SmDCL4b have similar exon patterns, but their intron sizes differ, with some introns expanded and others condensed (Fig. 1). Moreover, SmDCL4a and SmDCL4b showed distinct expression patterns (Fig. 3). These results indicate that SmDCL4a and SmDCL4b may play different roles in S. miltiorrhiza, as is the case for OsDCL3a and OsDCL3b in rice22. Further production and analysis of transgenic S. miltiorrhiza plants with SmDCL4a and/or SmDCL4b up- or down-regulated should shed light on the biological function of SmDCL4a and SmDCL4b.

The presence of miRNA-mediated feedback regulation has been shown for Arabidopsis AtDCL1 and P. patens PpDCL123,24,25. AtDCL1 is regulated by miR162 and miR83823,24,25, while PpDCL1 is regulated by miR104725. Analysis of the regulation mechanism of SmDCLs unexpectedly revealed the loss of the miR162 target site in SmDCL1. Close examination of the miR162 complementary regions showed the absence of miR162 target sites in DCL1 from the non-vascular plant P. patens and the ancient vascular plant S. moellendorffii25,52, suggesting that the miR162 target site was not present in ancient plants and was gained during plant evolution. On the other hand, the gained miR162 target site might have been lost in a few modern plants, such as S. miltiorrhiza. Since S. miltiorrhiza is evolutionarily farther from P. patens and S. moellendorffii than many plants with the conserved miR162 target site (Fig. 5), gain and loss of miR162 target sites seem to be two independent events during plant evolution. Gain and loss of miRNA target sites have been previously investigated in Arabidopsis and rice45,53. The loss of miRNA target sites was proposed to be a consequence of gene ortholog loss, target site sequence disruption, or point substitutions/nucleotide mutations45,53. Analysis of the miR162 target sites (except the bulge nucleotide) showed a single nucleotide mutation in S. indicum SiDCL1 and O. europaea OeDCL1, two in R. glutinosa RgDCL1 and four in S. miltiorrhiza SmDCL1 (Fig. 5). This suggests that the loss of miR162 target sites was caused by nucleotide mutations rather than by gene ortholog loss or target site sequence disruption.

It has been generally considered that miRNAs and their targets co-evolve in animals54. The absence of the miR162 target site goes along with the lack of miR162 in P. patens52, S. moellendorffii25, R. glutinosa46, O. europaea49,50 and S. miltiorrhiza, suggesting that the miR162 gene, like the miR162 target site, might have been lost in some modern plants during plant evolution and indicating the possibility of co-evolution of miR162 and miR162 target sites in plants. However, since current information is preliminary, no firm conclusion can be drawn. Relatively frequent gain and loss of miRNA genes have been previously reported in A. thaliana55. Analysis of miRNA-target pair conservation between A. thaliana and A. lyrata showed that about 12.5% of non-conserved pairs were due to the loss of the corresponding miRNAs in A. lyrata45. Of the 387 miRNAs from wild rice, 259 were not found in cultivated rice, suggesting a significant loss of miRNAs during rice domestication56. A possible mechanism for miRNA gene loss is nucleotide mutation. For instance, among 591 rice miRNAs, 364 have one or more SNPs in their precursor sequences57. SNPs in the stem regions may destabilize the miRNA hairpin structures, while SNPs in mature miRNAs have great potential to abolish miRNA-target interaction56. Genome-wide duplication could be another possible mechanism for the loss of miRNA genes. Comparative analysis of miRNA genes in maize and sorghum showed that duplicated miRNA genes underwent extensive gene loss, with about 35% of ancestral sites retained as duplicate homoeologous miRNA genes58. Since there is no information on miR162 gene variation among S. miltiorrhiza and its related species, and it is unknown whether genome-wide duplication events happened during S. miltiorrhiza evolution, the mechanism for the loss of miR162 in S. miltiorrhiza is currently unknown and needs to be further investigated.

It has been proposed that miR162-mediated feedback regulation of DCL1 is important in maintaining AtDCL1 at functionally sufficient, but not limiting or excessive, levels23. In addition, the excision of the MIR838 precursor from the AtDCL1 primary transcript, which leads to the production of truncated and non-functional AtDCL1 transcripts, provides a regulatory feedback mechanism that supplements miR162-directed regulation to maintain the proper level of AtDCL1 mRNA24. P. patens miR1047 seems to play a similar role in feedback regulation of PpDCL125. However, data on the actual physiological functions of miR162, miR838 and miR1047 are lacking. Without direct physiological evidence, the significance of miRNA-mediated feedback regulation of DCL1 remains largely uncertain. The absence of miR162-mediated feedback regulation of DCL1 in S. miltiorrhiza, and probably in R. glutinosa and O. europaea, implies that, at least in some plant species, the miR162-mediated feedback mechanism may not be vital. It is possible that an alternative mechanism for maintaining SmDCL1 at a proper level exists in S. miltiorrhiza and other plant species lacking the miR162-mediated feedback regulation of DCL1. Further investigation of the regulatory mechanism of SmDCLs using transgenics may help to demonstrate the significance of miRNA-mediated feedback regulation of DCL1 in plants and reveal the alternative to this feedback regulation in S. miltiorrhiza.

Methods

Plant materials

S. miltiorrhiza Bunge (line 993) was cultivated under natural growth conditions in a field nursery located at the Institute of Medicinal Plant Development, Beijing, China. Mature flower buds, mature and healthy leaves, young stems and roots about 0.5 cm in diameter were collected from two-year-old plants on August 15th, 2012. Tissues were collected from at least 3 plants and then pooled. The pooled tissues were stored in liquid nitrogen until use.

Prediction and cloning of SmDCL genes

SmDCL genes were identified by tBLASTn analysis36 of Arabidopsis and rice DCL protein sequences (http://www.ncbi.nlm.nih.gov/protein) against the current assembly of the S. miltiorrhiza genome31. All retrieved DNA sequences were used for gene prediction on the Genscan web server (http://genes.mit.edu/GENSCAN.html)35. The predicted gene models were further examined and corrected manually by comparison with DCL genes identified from other plant species using the BLASTx algorithm (http://www.ncbi.nlm.nih.gov/BLAST)36.

To clone the full-length SmDCL cDNAs, RNA ligase-mediated rapid amplification of 5′ cDNA ends (5′-RACE) and 3′ cDNA ends (3′-RACE) was carried out using the GeneRacer kit (Invitrogen, Carlsbad, CA, USA). PCR amplification was performed using the following conditions: pre-denaturation at 94 °C for 2 min, 5 cycles of amplification at 94 °C for 30 s and 72 °C for 1 min, 5 cycles of amplification at 94 °C for 30 s and 70 °C for 1 min, 25 cycles of amplification at 94 °C for 30 s, 56 °C for 30 s and 72 °C for 2 min, followed by a final extension at 72 °C for 15 min. Nested PCR amplifications were carried out using the following conditions: pre-denaturation at 94 °C for 2 min, 30 cycles of amplification at 94 °C for 30 s, 58 °C for 30 s and 72 °C for 2 min, followed by a final extension at 72 °C for 15 min. PCR products were gel-purified, cloned and sequenced. The nesting and nested gene-specific primers used for 5′- and 3′-RACE are listed in Supplementary Tables S1 and S2 online, respectively. Full-length SmDCL cDNAs were amplified using gene-specific forward and reverse primers (see Supplementary Table S3 online) under the following conditions: pre-denaturation at 94 °C for 2 min, 30 cycles of amplification at 94 °C for 30 s, 56 °C for 30 s and 72 °C for 3 min, followed by a final extension at 72 °C for 15 min. PCR products were gel-purified and cloned. For each transformation, three clones were sequenced at Beijing Sunbiotech Co., Ltd (Beijing, China). Sequences from the three clones were aligned with the predicted SmDCL sequence using DNAMAN (Lynnon BioSoft, San Ramon, CA, USA). The cloned cDNAs showing the fewest nucleotide discrepancies with the predicted sequences were selected and deposited in GenBank (Table 1).
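The three cycling programs described above can be written down as plain data, which makes them easier to compare and to check. The following is a minimal sketch and not part of the original protocol: the names FIRST_ROUND_RACE, NESTED_RACE, FULL_LENGTH, total_hold_seconds and amplification_cycles are hypothetical, and the computed time counts only the programmed holds, ignoring ramping between temperatures.

```python
# Each program is a list of (number_of_cycles, [(temperature_C, seconds), ...]).
# Values are transcribed from the conditions stated in the text; names are illustrative.
FIRST_ROUND_RACE = [
    (1, [(94, 120)]),                        # pre-denaturation
    (5, [(94, 30), (72, 60)]),               # 5 cycles, 72 °C step
    (5, [(94, 30), (70, 60)]),               # 5 cycles, 70 °C step
    (25, [(94, 30), (56, 30), (72, 120)]),   # 25 cycles with 56 °C annealing
    (1, [(72, 900)]),                        # final extension
]
NESTED_RACE = [
    (1, [(94, 120)]),
    (30, [(94, 30), (58, 30), (72, 120)]),
    (1, [(72, 900)]),
]
FULL_LENGTH = [
    (1, [(94, 120)]),
    (30, [(94, 30), (56, 30), (72, 180)]),
    (1, [(72, 900)]),
]


def total_hold_seconds(program):
    """Sum of all programmed hold times (ramping time is not included)."""
    return sum(cycles * sum(sec for _, sec in steps) for cycles, steps in program)


def amplification_cycles(program):
    """Number of cycles in multi-step stages (single-step holds are skipped)."""
    return sum(cycles for cycles, steps in program if len(steps) > 1)


if __name__ == "__main__":
    for name, prog in [("first-round RACE", FIRST_ROUND_RACE),
                       ("nested RACE", NESTED_RACE),
                       ("full-length", FULL_LENGTH)]:
        minutes = total_hold_seconds(prog) / 60
        print(f"{name}: {amplification_cycles(prog)} cycles, {minutes:.0f} min of holds")
```

Encoding the stages as (cycles, steps) pairs keeps the decreasing-temperature stages of the first-round program explicit instead of folding them into a single cycle count.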

Phylogenetic tree construction and bioinformatics analysis

A phylogenetic tree was constructed using MEGA version 4.0 by the neighbor-joining method with 1000 bootstrap replicates38,59. Intron/exon structures were analyzed manually based on genomic DNA sequences and the cloned cDNA sequences. Molecular weight (MW) and theoretical isoelectric point (pI) were predicted using DNAMAN. Conserved domains were analyzed by searching the deduced amino acid sequences of SmDCLs against the NCBI Conserved Domain Database (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). Multiple sequence alignment of the deduced SmDCL amino acid sequences was carried out using T-Coffee37.

Quantitative real-time reverse transcription-PCR (qRT-PCR)

Total RNA was isolated from plant tissues using the plant total RNA extraction kit (BioTeke, Beijing, China) and genomic DNA was removed by treatment with RNase-free DNase (Promega, Madison, WI, USA). One μg of total RNA was reverse-transcribed into cDNA using 200 U of Superscript III reverse transcriptase (Invitrogen, Carlsbad, CA, USA) in a 20 μl volume. The cDNA was diluted to 200 μl and then used for qRT-PCR. Gene-specific primers are listed in Supplementary Table S4 online. SmUBQ10 was used as a control as previously described28. PCR was carried out in a 20 μl volume containing 2 μl diluted cDNA, 250 nM forward primer, 250 nM reverse primer and 1 × SYBR Premix Ex Taq II (TaKaRa Bio, Otsu, Japan) using the following conditions: pre-denaturation at 95 °C for 30 s, 40 cycles of amplification at 95 °C for 5 s, 60 °C for 18 s and 72 °C for 15 s. The results from gene-specific amplification were analyzed using the comparative Cq method, which uses an arithmetic formula, 2^(−ΔΔCq), to achieve results for relative quantification60. Cq represents the threshold cycle.
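The arithmetic behind the comparative Cq method can be sketched as follows. This is an illustration rather than the analysis script used in the study: the function name relative_expression and the example Cq values are hypothetical, and the formula assumes that the target and the reference gene (SmUBQ10 here) amplify with approximately 100% efficiency.

```python
def relative_expression(cq_target_sample, cq_ref_sample,
                        cq_target_calibrator, cq_ref_calibrator):
    """Fold change of a target gene by the comparative Cq method, 2^(-ddCq).

    dCq  = Cq(target) - Cq(reference), computed per tissue
    ddCq = dCq(sample tissue) - dCq(calibrator tissue)
    """
    delta_sample = cq_target_sample - cq_ref_sample
    delta_calibrator = cq_target_calibrator - cq_ref_calibrator
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Cq values, chosen only to illustrate the calculation.
    fold = relative_expression(
        cq_target_sample=24.0, cq_ref_sample=18.0,          # dCq = 6
        cq_target_calibrator=26.0, cq_ref_calibrator=18.0,  # dCq = 8
    )
    print(f"relative expression: {fold:.1f}-fold")
```

A lower Cq means earlier detection, so a ΔΔCq of −2 corresponds to a four-fold higher transcript level in the sample tissue than in the calibrator tissue.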

Identification of S. miltiorrhiza miRNAs with perfect or near-perfect complementarity to SmDCLs

Plant miRNAs with the potential to target SmDCLs for cleavage were predicted using psRNATarget with the default parameters40. Known plant miRNAs were downloaded from miRBase (release 19, http://www.mirbase.org/)41. The identified miRNAs were then aligned with the current assembly of the S. miltiorrhiza genome31 using SOAP2, with no more than 2 mismatches allowed42. S. miltiorrhiza genomic DNA sequences to which known plant miRNAs aligned were analyzed for hairpin structures using mfold43. The criteria described by Meyers et al.39 were applied to annotate S. miltiorrhiza miRNAs.
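The mismatch criterion used in the alignment step can be illustrated with a naive scan. The sketch below is not SOAP2 and does not reproduce its algorithm: it slides a miRNA along the forward strand only, counts substitutions (no gaps) and reports windows with at most 2 mismatches. The function names and the toy sequences are hypothetical.

```python
def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def find_hits(mirna, genome, max_mismatches=2):
    """Return (start, mismatches) for each forward-strand window within tolerance.

    The miRNA is converted from RNA (U) to DNA (T) before comparison.
    """
    query = mirna.upper().replace("U", "T")
    target = genome.upper()
    k = len(query)
    hits = []
    for start in range(len(target) - k + 1):
        mismatches = count_mismatches(query, target[start:start + k])
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not real miRNA or genome data.
    toy_mirna = "UCGAUAAACC"
    toy_genome = "GGGTCGATAAACCGGGTCGTTAAACGGGG"
    for start, mismatches in find_hits(toy_mirna, toy_genome):
        print(f"hit at position {start} with {mismatches} mismatch(es)")
```

A full-genome search would also need to scan the reverse-complement strand and use an index instead of this quadratic scan, which is what dedicated short-read aligners provide.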

5′RLM-RACE for analysis of miRNA-directed cleavage of SmDCLs

The modified RNA ligase-mediated rapid amplification of 5′ cDNA ends method (5′RLM-RACE) was performed using the GeneRacer kit (Invitrogen, Carlsbad, CA, USA) as described previously44. PCRs were carried out on mRNA isolated from pooled S. miltiorrhiza tissues containing flowers, leaves, stems and roots. Gene-specific primers used in this experiment are listed in Supplementary Table S5 online.

PCR amplification of SmDCL1 cDNA fragments in S. miltiorrhiza lines 992 and shh

SmDCL1 cDNA fragments surrounding the predicted miR162 target site were PCR-amplified on cDNA from the leaves of S. miltiorrhiza lines 992 and shh using 5′-GTCAGGGAGGAGCTGTGACAATT-3′ as the forward primer and 5′-CGTACATGAAAGCTCTTGAGCGAT-3′ as the reverse primer.
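Basic properties of the two primers given above can be checked with a few lines of code. The sketch below is illustrative only: the primer sequences are taken from the text, whereas the function names and the synthetic template used in the usage example are hypothetical and do not represent the SmDCL1 sequence.

```python
FORWARD = "GTCAGGGAGGAGCTGTGACAATT"    # forward primer from the text, 5'->3'
REVERSE = "CGTACATGAAAGCTCTTGAGCGAT"   # reverse primer from the text, 5'->3'

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def amplicon_length(template, forward, reverse):
    """Length of the product on a sense-strand template, or None if a primer is absent."""
    start = template.find(forward)
    if start < 0:
        return None
    site = reverse_complement(reverse)  # where the reverse primer anneals
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return end + len(site) - start


if __name__ == "__main__":
    for name, primer in [("forward", FORWARD), ("reverse", REVERSE)]:
        print(f"{name}: {len(primer)} nt, GC = {gc_fraction(primer):.1%}")
    # Synthetic template built around the primers, for demonstration only.
    template = "TTTT" + FORWARD + "ACGT" * 5 + reverse_complement(REVERSE) + "TTTT"
    print("amplicon length on synthetic template:",
          amplicon_length(template, FORWARD, REVERSE))
```

Applied to a cloned SmDCL1 cDNA sequence, amplicon_length would give the expected size of the fragment spanning the predicted miR162 target site, provided both primers match the template exactly.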

Additional Information

How to cite this article: Shao, F. et al. Comparative analysis of the Dicer-like gene family reveals loss of miR162 target site in SmDCL1 from Salvia miltiorrhiza. Sci. Rep. 5, 9891; doi: 10.1038/srep09891 (2015).