Abstract
Background
Siraitia grosvenorii (Luohanguo) is an herbaceous perennial plant native to southern China and most prevalent in Guilin city. Its fruit contains a sweet, fleshy, edible pulp that is widely used in traditional Chinese medicine. The major bioactive constituents in the fruit extract are the cucurbitane-type triterpene saponins known as mogrosides. Among them, mogroside V is nearly 300 times sweeter than sucrose. However, little is known about mogroside biosynthesis in S. grosvenorii, especially the late steps of the pathway.
Results
In this study, a cDNA library generated from equal amounts of RNA taken from S. grosvenorii fruit at 50 days after flowering (DAF) and 70 DAF was sequenced using the Illumina/Solexa platform. More than 48,755,516 high-quality reads were generated from the cDNA library. De novo assembly and gap-filling generated 43,891 unigenes with an average sequence length of 668 base pairs. A total of 26,308 (59.9%) unique sequences were annotated, and 11,476 of the unique sequences were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes. cDNA sequences for all of the known enzymes involved in mogroside backbone synthesis were identified from our library. Additionally, a total of eighty-five cytochrome P450 (CYP450) and ninety UDP-glucosyltransferase (UDPG) unigenes were identified, some of which appear to encode enzymes responsible for the conversion of the mogroside backbone into the various mogrosides. Digital gene expression profile (DGE) analysis using Solexa sequencing was performed on three important stages of fruit development, and based on their expression patterns, seven CYP450s and five UDPGs were selected as the candidates most likely to be involved in mogroside biosynthesis.
Conclusion
A combination of RNA-seq and DGE analysis based on next-generation sequencing technology was shown to be a powerful method for identifying candidate genes encoding enzymes responsible for the biosynthesis of novel secondary metabolites in a non-model plant. Seven CYP450s and five UDPGs were selected as potential candidates involved in mogroside biosynthesis. The transcriptome data from this study provide an important resource for understanding the formation of major bioactive constituents in the fruit extract from S. grosvenorii.
Background
Siraitia grosvenorii, which belongs to the Cucurbitaceae family, has long been used in traditional Chinese medicine as a natural sweetener and as a folk medicine for the treatment of lung congestion, colds and sore throats. In recent years, important pharmacological characteristics, such as anti-cancer and anti-hyperglycemic effects and inhibition of oxidative modification of low-density lipoprotein, have been reported [1–4]. Many cucurbitane-type triterpene glycosides have been isolated and characterized from the fruits [5–10]. The mixed mogrosides have been estimated to be about 300 times as sweet as sucrose, so that an 80% extract was nearly 250 times sweeter than sugar [7]. Among them, mogroside V, which accounts for 20% of mogrosides, is extremely sweet. The purified sweet principle, mogroside V, has been approved as a high-intensity sweetening agent in Japan [11], and the non-caloric sweet-taste extract is a generally recognized as safe (GRAS) non-nutritive sweetener, flavor enhancer, and food ingredient in the USA [33], American ginseng[34], Glycyrrhiza uralensis[35], Medicago truncatula[24, 5, 6 and 7.
Evaluation of DGE libraries
A statistical analysis of the frequency of each tag in the different cDNA libraries was performed to compare gene expression at different developmental stages. Statistical comparison was performed using the method described by Audic et al. [22]. The false discovery rate (FDR) was used to determine the P-value threshold in multiple testing. We used FDR < 0.001 as the threshold to judge the significance of differences in gene expression. For pathway enrichment analysis, we mapped all differentially expressed genes to terms in the KEGG database and looked for KEGG terms that were significantly enriched relative to the genome background.
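The comparison step above can be illustrated with a short Python sketch. It assumes the test of Audic and Claverie [22] in its usual closed form, where the probability of observing y tags for a gene in one library, given x tags in the other, depends only on the two library sizes. The two-sided P-value convention and the Benjamini-Hochberg adjustment are assumptions made for illustration, since the text does not state which FDR procedure was applied, and all function names are hypothetical.

```python
import math


def audic_claverie_prob(x, y, n1, n2):
    """P(y | x): chance of y tags in library 2 (n2 total tags) given x tags
    in library 1 (n1 total tags), if the gene is equally expressed in both."""
    ratio = n2 / n1
    log_p = (y * math.log(ratio)
             + math.lgamma(x + y + 1)
             - math.lgamma(x + 1)
             - math.lgamma(y + 1)
             - (x + y + 1) * math.log(1 + ratio))
    return math.exp(log_p)


def audic_claverie_pvalue(x, y, n1, n2):
    """Two-sided P-value as twice the smaller tail (one common convention,
    assumed here). Values below about 1e-15 are limited by float precision."""
    lower = sum(audic_claverie_prob(x, k, n1, n2) for k in range(y + 1))
    upper = max(0.0, 1.0 - lower + audic_claverie_prob(x, y, n1, n2))
    return min(1.0, 2.0 * min(lower, upper))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(tag_counts, n1, n2, fdr_cutoff=0.001):
    """tag_counts maps a gene name to (tags in library 1, tags in library 2)."""
    genes = list(tag_counts)
    pvals = [audic_claverie_pvalue(*tag_counts[g], n1, n2) for g in genes]
    fdrs = benjamini_hochberg(pvals)
    return [g for g, q in zip(genes, fdrs) if q < fdr_cutoff]
```

Calling `significant_genes` with a dictionary of per-gene tag counts and the total clean tag counts of the two libraries returns the genes that pass the FDR < 0.001 threshold used in the text.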
Clustering of CYP450 and UDPG gene expression profiles
Hierarchical clustering of log-transformed expression data was carried out using the Cluster 3.0 and Treeview programs [46]. Correlations between gene clusters were determined using Pearson's correlation. Heat maps were constructed using the University of Toronto BAR Heatmapper tool (http://www.bar.utoronto.ca/ntools/cgi-bin/ntools_heatmapper.cgi).
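As a rough illustration of the clustering described above, the following Python sketch log-transforms expression values, measures similarity with Pearson's correlation, and merges profiles by average linkage. It is not the Cluster 3.0 implementation; the linkage rule, the log base, the pseudocount, and the gene names are assumptions made for the example.

```python
import math


def log_transform(values, pseudocount=1.0):
    """log2 transform; the pseudocount (an assumption) keeps zeros finite."""
    return [math.log2(v + pseudocount) for v in values]


def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if one is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def cluster_genes(profiles):
    """Agglomerative clustering with distance 1 - r and average linkage.

    profiles maps a gene name to its log-transformed expression values.
    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    genes = list(profiles)
    dist = {}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            d = 1.0 - pearson(profiles[a], profiles[b])
            dist[(a, b)] = dist[(b, a)] = d

    clusters = [(g,) for g in genes]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [dist[(a, b)] for a in clusters[i] for b in clusters[j]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        avg, i, j = best
        merges.append((clusters[i], clusters[j], avg))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


if __name__ == "__main__":
    # Hypothetical expression values at three fruit stages.
    example = {
        "geneA": log_transform([1, 10, 100]),
        "geneB": log_transform([2, 20, 200]),
        "geneC": log_transform([100, 10, 1]),
    }
    for left, right, distance in cluster_genes(example):
        print(left, right, round(distance, 3))
```

Profiles that rise and fall together across the sampled stages are merged first, which is the property used to group candidate genes by expression pattern.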
Abbreviations
- DGE: Digital Gene Expression
- DAF: days after flowering
- MVA: mevalonate pathway
- MEP: 2-C-methyl-D-erythritol 4-phosphate pathway
- IPP: isopentenyl diphosphate
- DMAPP: dimethylallyl diphosphate
- NGS: Next generation sequencing technology
- GO: Gene Ontology
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- AACT: acetyl-CoA acetyltransferase
- HMGS: hydroxymethylglutaryl-CoA synthase
- HMGR: 3-hydroxy-3-methylglutaryl-coenzyme A reductase
- MK: mevalonate kinase
- PMK: phosphomevalonate kinase
- MVD: diphosphomevalonate decarboxylase
- DXS: 1-deoxy-D-xylulose-5-phosphate synthase
- DXR: 1-deoxy-D-xylulose-5-phosphate reductoisomerase
- MCT: 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase
- CMK: 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase
- MCS: 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase
- HDS: 4-hydroxy-3-methylbut-2-enyl diphosphate synthase
- IDS: 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (isopentenyl/dimethylallyl diphosphate synthase)
- IPI: isopentenyl-diphosphate delta-isomerase
- GPS: geranyl diphosphate synthase
- FPS: farnesyl diphosphate synthase/farnesyl pyrophosphate synthetase
- SQS: squalene synthetase
- SQE: squalene epoxidase
- CAS: cycloartenol synthase
- CS: cucurbitadienol synthase
- CYP450: cytochrome P450
- UDPG: UDP-glucosyltransferase
References
Takasaki M, Konoshima T, Murata Y, Sugiura M, Nishino H, Tokuda H, Matsumoto K, Kasai R, Yamasaki K: Anticarcinogenic activity of natural sweeteners, cucurbitane glycosides, from Momordica grosvenori. Cancer Letters. 2003, 198: 37-42. 10.1016/S0304-3835(03)00285-4.
Suzuki YA, Murata Y, Inui H, Sugiura M, Nakano Y: Triterpene glycosides of Siraitia grosvenori inhibit rat intestinal maltase and suppress the rise in blood glucose level after a single oral administration of maltose in rats. J Agric Food Chem. 2005, 53 (8): 2941-2946. 10.1021/jf0478105.
Takeo E, Yoshida S, Tada S, Shingu T, Matsuura H, Murata Y, Yoshikawa S, Ishikawa T, Nakamura H, Ohsuzu F, Kohda H: Sweet elements of Siraitia grosvenori inhibit oxidative modification of low-density lipoprotein. J Atheroscler Thromb. 2002, 9 (2): 114-120. 10.5551/jat.9.114.
Lin GP, Jiang T, Hu XB, Qiao XH, Tuo QH: Effect of Siraitia grosvenorii Polysaccharide on Glucose and Lipid of Diabetic Rabbits Induced by Feeding High Fat/High Sucrose Chow. Exp Diabetes Res. 2007, 2007: 67435.
Lee C: Intense sweetener from Lo Han Kuo. Experientia. 1975, 31 (5): 533-534. 10.1007/BF01932442.
Takemoto T, Arihara S, Nakajima T, Okuhira M: Studies on the constituents of Fructus momordicae. II. Structure of sapogenin. Yakugaku Zasshi. 1983, 103 (11): 1155-1166.
Kasai R, Nie RL, Nashi K, Ohtani K, Zhou J, Tao GD, Tanaka O: Sweet cucurbitane glycosides from fruits of Siraitha siamensis (chi-zi luo-han-guo), a Chinese folk medicine. Agric Biol Chem. 1989, 53 (12): 3347-3349.
Matsumoto K, Kasai R, Ohtani K, Tanaka O: Minor cucurbitane-glycosides from fruits of Siraitia grosvenorii (Cucurbitaceae). Chem Pharm Bull. 1990, 38 (7): 2030-2032.
Li DP, El-Aasr M, Ikeda T, Ogata M, Miyashita H, Yoshimitsu H, Nohara T: Two new cucurbitane-type glycosides obtained from roots of Siraitia grosvenori SWINGLE. Chem Pharm Bull (Tokyo). 2009, 57 (8): 870-872. 10.1248/cpb.57.870.
Jia Z, Yang X: A minor, sweet cucurbitane glycoside from Siraitia grosvenorii. Nat Prod Commun. 2009, 4 (6): 769-772.
Jakinovich W, Moon C, Choi YH, Kinghorn AD: Evaluation of plant extracts for sweetness using the Mongolian gerbil. J Nat Prod. 1990, 53 (1): 190-195. 10.1021/np50067a030.
Xia Y, Rivero-Huguet ME, Hughes BH, Marshall WD: Isolation of the sweet components from Siraitia grosvenorii. Food Chemistry. 2008, 107 (3): 1022-1028. 10.1016/j.foodchem.2007.09.014.
Li D, Ikeda T, Huang Y, Liu J, Nohara T, Sakamoto T, Nonaka G-I: Seasonal variation of mogrosides in Lo Han Kuo (Siraitia grosvenori) fruits. J Nat Med. 2007, 61 (3): 307-312. 10.1007/s11418-006-0130-7.
Haralampidis K, Trojanowska M, Osbourn AE: Biosynthesis of triterpenoid saponins in plants. Adv Biochem Eng Biotechnol. 2002, 75: 31-49. 10.1007/3-540-44604-4_2.
Lichtenthaler HK: The 1-deoxy-D-xylulose-5-phosphate pathway of isoprenoid biosynthesis in plants. Annu Rev Plant Physiol Plant Mol Biol. 1999, 50: 47-65. 10.1146/annurev.arplant.50.1.47.
Dewick PM: Medicinal Natural Products: A Biosynthetic Approach. 2009, Wiley
Schuster SC: Next-generation sequencing transforms today's biology. Nat Methods. 2008, 5 (1): 16-18. 10.1038/nmeth1156.
Ansorge WJ: Next-generation DNA sequencing techniques. N Biotechnol. 2009, 25 (4): 195-203. 10.1016/j.nbt.2008.12.009.
Xiang LX, He D, Dong WR, Zhang YW, Shao JZ: Deep sequencing-based transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus reveals insight into the immune-relevant genes in marine fish. BMC Genomics. 2010, 11: 472. 10.1186/1471-2164-11-472.
Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B, et al: TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets. Bioinformatics. 2003, 19 (5): 651-652. 10.1093/bioinformatics/btg034.
Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M: The KEGG resource for deciphering the genome. Nucleic Acids Res. 2004, 32: D277-280. 10.1093/nar/gkh063.
Audic S, Claverie JM: The significance of digital gene expression profiles. Genome Res. 1997, 7 (10): 986-995.
Shibuya M, Adachi S, Ebizuka Y: Cucurbitadienol synthase, the first committed enzyme for cucurbitacin biosynthesis, is a distinct enzyme from cycloartenol synthase for phytosterol biosynthesis. Tetrahedron. 2004, 60 (33): 6995-7003. 10.1016/j.tet.2004.04.088.
Naoumkina MA, Modolo LV, Huhman DV, Urbanczyk-Wochniak E, Tang Y, Sumner LW, Dixon RA: Genomic and coexpression analyses predict multiple genes involved in triterpene saponin biosynthesis in Medicago truncatula. Plant Cell. 2010, 22 (3): 850-866. 10.1105/tpc.109.073270.
Coon MJ: Cytochrome P450: nature's most versatile biological catalyst. Annu Rev Pharmacol Toxicol. 2005, 45: 1-25. 10.1146/annurev.pharmtox.45.120403.100030.
Morant M, Bak S, Møller BL, Werck-Reichhart D: Plant cytochromes P450: tools for pharmacology, plant protection and phytoremediation. Curr Opin Biotechnol. 2003, 14 (2): 151-162. 10.1016/S0958-1669(03)00024-7.
Bowles D, Lim EK, Poppenberger B, Vaistij FE: Glycosyltransferases of lipophilic small molecules. Annu Rev Plant Biol. 2006, 57: 567-597. 10.1146/annurev.arplant.57.032905.105429.
Gachon CM, Langlois-Meurinne M, Saindrenan P: Plant secondary metabolism glycosyltransferases: the emerging functional analysis. Trends Plant Sci. 2005, 10 (11): 542-549. 10.1016/j.tplants.2005.09.007.
Lairson LL, Henrissat B, Davies GJ, Withers SG: Glycosyltransferases: structures, functions, and mechanisms. Annu Rev Biochem. 2008, 77: 521-555. 10.1146/annurev.biochem.76.061005.092322.
Wu Q, Song J, Sun Y, Suo F, Li C, Luo H, Liu Y, Li Y, Zhang X, Yao H, et al: Transcript profiles of Panax quinquefolius from flower, leaf and root bring new insights into genes related to ginsenosides biosynthesis and transcriptional regulation. Genome Res. 2010, 20 (2): 265-272. 10.1101/gr.097261.109.
Kim MK, Lee BS, In JG, Sun H, Yoon JH, Yang DC: Comparative analysis of expressed sequence tags (ESTs) of ginseng leaf. Plant Cell Rep. 2006, 25 (6): 599-606. 10.1007/s00299-005-0095-0.
Jung JD, Park HW, Hahn Y, Hur CG, In DS, Chung HJ, Liu JR, Choi DW: Discovery of genes for ginsenoside biosynthesis by analysis of ginseng expressed sequence tags. Plant Cell Rep. 2003, 22 (3): 224-230. 10.1007/s00299-003-0678-6.
Wang QQ, Liu F, Chen XS, Ma XJ, Zeng HQ, Yang ZM: Transcriptome profiling of early developing cotton fiber by deep-sequencing reveals significantly differential expression of genes in a fuzzless/lintless mutant. Genomics. 2010, 96 (6): 369-376. 10.1016/j.ygeno.2010.08.009.
Sun C, Li Y, Wu Q, Luo H, Sun Y, Song J, Lui EM, Chen S: De novo sequencing and analysis of the American ginseng root transcriptome using a GS FLX Titanium platform to discover putative genes involved in ginsenoside biosynthesis. BMC Genomics. 2010, 11: 262. 10.1186/1471-2164-11-262.
Li Y, Luo HM, Sun C, Song JY, Sun YZ, Wu Q, Wang N, Yao H, Steinmetz A, Chen SL: EST analysis reveals putative genes involved in glycyrrhizin biosynthesis. BMC Genomics. 2010, 11: 268. 10.1186/1471-2164-11-268.
Cheung F, Haas BJ, Goldberg SM, May GD, Xiao Y, Town CD: Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology. BMC Genomics. 2006, 7: 272. 10.1186/1471-2164-7-272.
Luo HM, Li Y, Sun C, Wu Q, Song JY, Sun YZ, Steinmetz A, Chen SL: Comparison of 454-ESTs from Huperzia serrata and Phlegmariurus carinatus reveals putative genes involved in lycopodium alkaloid biosynthesis and developmental regulation. BMC Plant Biology. 2010, 10: 209. 10.1186/1471-2229-10-209.
Seki H, Ohyama K, Sawai S, Mizutani M, Ohnishi T, Sudo H, Akashi T, Aoki T, Saito K, Muranaka T: Licorice beta-amyrin 11-oxidase, a cytochrome P450 with a key role in the biosynthesis of the triterpene sweetener glycyrrhizin. Proc Natl Acad Sci USA. 2008, 105 (37): 14204-14209. 10.1073/pnas.0803876105.
Shibuya M, Hoshino M, Katsube Y, Hayashi H, Kushiro T, Ebizuka Y: Identification of beta-amyrin and sophoradiol 24-hydroxylase by expressed sequence tag mining and functional expression assay. FEBS J. 2006, 273 (5): 948-959. 10.1111/j.1742-4658.2006.05120.x.
Kurosawa Y, Takahara H, Shiraiwa M: UDP-glucuronic acid: soyasapogenol glucuronosyltransferase involved in saponin biosynthesis in germinating soybean seeds. Planta. 2002, 215 (4): 620-629. 10.1007/s00425-002-0781-x.
Achnine L, Huhman DV, Farag MA, Sumner LW, Blount JW, Dixon RA: Genomics-based selection and functional characterization of triterpene glycosyltransferases from the model legume Medicago truncatula. Plant J. 2005, 41 (6): 875-887. 10.1111/j.1365-313X.2005.02344.x.
Meesapyodsuk D, Balsevich J, Reed DW, Covello PS: Saponin biosynthesis in Saponaria vaccaria. cDNAs encoding beta-amyrin synthase and a triterpene carboxylic acid glucosyltransferase. Plant Physiol. 2007, 143 (2): 959-969.
Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, Li Y, Li S, Shan G, Kristiansen K, et al: De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 2010, 20 (2): 265-272. 10.1101/gr.097261.109.
Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B, et al: TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets. Bioinformatics. 2003, 19 (5): 651-652. 10.1093/bioinformatics/btg034.
't Hoen PA, Ariyurek Y, Thygesen HH, Vreugdenhil E, Vossen RH, de Menezes RX, Boer JM, van Ommen GJ, den Dunnen JT: Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Res. 2008, 36 (21): e141. 10.1093/nar/gkn705.
Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA. 1998, 95 (25): 14863-14868. 10.1073/pnas.95.25.14863.
Acknowledgements
We acknowledge the Beijing Genomics Institute at Shenzhen for its assistance in original data processing and related bioinformatics analysis. This work was supported by the National Natural Science Foundation of China (grant Nos. 30860379 and 30960500).
Authors' contributions
QT conceived the study, designed and built the cDNA library, participated in data analysis, and drafted the manuscript. XJM initiated the project, helped to conceive the study, and participated in the design and coordination. DYQ designed the study and data analysis. IW participated in clustering of gene expression profiles. CS participated in manuscript formatting and editing. CMM participated in experiment materials preparation. YFY, WF and HZ participated in RNA extraction and RACE experiments. All authors read and approved the final manuscript.
Electronic supplementary material
12864_2011_3522_MOESM1_ESM.XLS
Additional file 1: Overview of Siraitia grosvenorii transcriptome sequencing and assembly. (A) Size distribution of Illumina sequencing contigs. (B) Size distribution of Illumina sequencing scaffolds and of those obtained after paired-end joining and gap filling. (C) Size distribution of Illumina sequencing unigenes and of those obtained after paired-end joining and gap filling. (XLS 22 KB)
12864_2011_3522_MOESM2_ESM.XLS
Additional file 2: Top BLAST hits from NCBI nr database. BLAST results against the NCBI nr database for all the distinct sequences with an E-value cut-off of 10^-5 are shown. (XLS 12 MB)
12864_2011_3522_MOESM3_ESM.XLS
Additional file 3: Top BLAST hits from NCBI Swissprot database. BLAST results against the NCBI Swissprot database for all the distinct sequences with an E-value cut-off of 10^-5 are shown. (XLS 8 MB)
12864_2011_3522_MOESM4_ESM.TIFF
Additional file 4: Distribution of total tags and distinct tags over different tag abundance categories. (A) Distribution of total clean tags. Numbers in the square brackets indicate the range of copy numbers for a specific category of tags. For example, [2, 5] means all the tags in this category have 2 to 5 copies. Numbers in the parentheses show the total tag copy number and ratio for all the tags in that category. (B) Distribution of distinct clean tags. Numbers in the square brackets indicate the range of copy numbers for a specific category of tags. Numbers in the parentheses show the total types of tags in that category. (TIFF 233 KB)
12864_2011_3522_MOESM5_ESM.XLS
Additional file 5: Differentially expressed genes between 3 DAF and 50 DAF. TPM: transcript copies per million tags. Raw intensity: the total number of tags sequenced for each gene. FDR: false discovery rate. We used FDR < 0.001 and an absolute value of log2Ratio ≥ 1 as the threshold to judge the significance of differences in gene expression. To calculate the log2Ratio and FDR, we used a TPM value of 0.001 instead of 0 for genes not expressed in one sample. (XLS 1 MB)
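The quantities reported in Additional file 5 can be sketched in a few lines of Python. The sketch assumes that the threshold is meant as an absolute log2Ratio of at least 1 (a change of two-fold or more) and that the ratio is taken as the later stage over the earlier stage; the function names and the ratio direction are hypothetical, and the FDR value is taken as an input rather than computed here.

```python
import math


def tpm(tag_count, total_clean_tags):
    """Transcript copies per million tags for one gene in one library."""
    return tag_count * 1_000_000 / total_clean_tags


def log2_ratio(tpm_early, tpm_late, floor=0.001):
    """log2(late / early); a TPM of 0 is replaced by 0.001, as in the text."""
    early = tpm_early if tpm_early > 0 else floor
    late = tpm_late if tpm_late > 0 else floor
    return math.log2(late / early)


def is_differential(tpm_early, tpm_late, fdr,
                    fdr_cutoff=0.001, min_abs_log2=1.0):
    """Apply both thresholds: FDR below the cut-off and |log2Ratio| >= 1."""
    ratio = log2_ratio(tpm_early, tpm_late)
    return fdr < fdr_cutoff and abs(ratio) >= min_abs_log2
```

Replacing zeros with 0.001 keeps the ratio finite for genes detected in only one of the two libraries, at the cost of giving such genes a very large absolute log2Ratio.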
Rights and permissions
Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Tang, Q., Ma, X., Mo, C. et al. An efficient approach to finding Siraitia grosvenorii triterpene biosynthetic genes by RNA-seq and digital gene expression analysis. BMC Genomics 12, 343 (2011). https://doi.org/10.1186/1471-2164-12-343
DOI: https://doi.org/10.1186/1471-2164-12-343