Introduction

Reverse transcription quantitative real-time PCR (RT-qPCR) is a popular technique for monitoring mRNA levels because of its high sensitivity, accuracy, specificity and efficiency1. To interpret the expression profile of a target gene accurately and reliably, normalization of expression data against reference genes is essential in relative quantification by RT-qPCR. Ideally, a reference gene should show stable or minimally variable expression across experimental conditions2. A few reference genes such as 18S rRNA (18S ribosomal RNA), TUBA (α-tubulin), EF1A (elongation factor 1α), ACTB (β-actin), and GAPDH (glyceraldehyde-3-phosphate dehydrogenase)2, which show relatively high levels of expression, are frequently used for RT-qPCR analysis in plants3,4. However, increasing evidence has demonstrated that the expression levels of these traditional reference genes vary considerably among samples and experimental conditions3,5,6. Therefore, reference genes need to be selected and validated for the specific samples and experimental conditions under study.

Santalum album L., commonly known as sandalwood, is a hemiparasitic tropical tree distributed in India, Indonesia, Malaysia, and Australia7. It is famous for the valuable essential oil extracted from its aromatic heartwood and roots, which is used in aromatherapy, perfumes, cosmetics, medicine and sacred unguents8,9. A number of functional genes and their expression levels have been characterized and studied in S. album in recent years7,10,11,12,13. As far as we know, the traditional housekeeping gene ACT (β-actin) is the only reference gene used to date10,12,14,15, and there has been no systematic validation and evaluation of reference genes for RT-qPCR analysis in S. album.

In this study, 13 candidate reference genes, including 12 novel genes selected from a large set of RNA-seq data in three different tissues (stem, leaf, and root) of S. album, as well as the currently used traditional housekeeping gene ACT, were assessed by RT-qPCR. Five statistical algorithms (geNorm16, NormFinder17, BestKeeper18, Delta Ct19 and RefFinder

Table 1 Selected candidate reference genes, primers, Tm and KS-test p values, and amplicon characteristics.

Primer specificity, amplification efficiency and expression profile of candidate reference genes

The specificity of primer pairs for each candidate reference gene was verified by 2% agarose gel electrophoresis with a single expected size product (Supplementary Fig. S1), and further demonstrated by melting curve analysis with a single peak (Supplementary Fig. S2). The cDNA-free template controls (ddH2O as template) showed no obvious melting curve products (data not shown). All these results confirmed the specificity of all primer pairs and the absence of DNA and other contaminating materials during RT-qPCR amplification.

The amplification efficiency for primer pairs of all candidate reference genes ranged from 93.919% (PP2C) to 112.855% (UK), and the R2 values lay between 0.994 (PP2C) and 0.999 (CSA) (Table 1).
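The text does not state how these efficiencies and R2 values were derived. A common approach, assumed here, fits a standard curve of Cq against log10 template dilution and converts the slope with E% = (10^(-1/slope) - 1) × 100. The sketch below illustrates that calculation; the dilution series and Cq values are hypothetical and are not taken from Table 1.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def amplification_efficiency(log10_dilution, cq):
    """Percent efficiency and R^2 from a standard curve (assumed method)."""
    slope, _, r_squared = linear_fit(log10_dilution, cq)
    efficiency_pct = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return efficiency_pct, r_squared


# Hypothetical 10-fold dilution series (log10 of relative template amount).
log10_dilution = [0, -1, -2, -3, -4]
cq_values = [18.0, 21.32, 24.64, 27.96, 31.28]

efficiency, r2 = amplification_efficiency(log10_dilution, cq_values)
print(f"efficiency = {efficiency:.2f}%, R^2 = {r2:.4f}")
```

A slope near -3.32 corresponds to an efficiency near 100%, so values above 100%, such as the one reported for UK, correspond to a shallower slope.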

The expression profiles of the 13 candidate reference genes in all experimental samples were assessed by RT-qPCR using the Cq value for each sample, after testing for normality with the KS-test, in which a p value > 0.05 was considered to indicate a normal distribution. All KS-test p values were greater than the 0.05 cut-off value (Table 1). The average Cq values ranged from 19.8 to 28.96 (Fig. 1), and the majority lay between 23 and 25, which indicates that the expression of all candidate reference genes fell within a suitable reference gene expression level (15 < Cq < 30)21. As shown in Fig. 1, the gene with the highest expression was ACT (with the lowest Cq value), and the gene with the lowest expression was UK (with the highest Cq value). The candidate reference gene names, GenBank accession numbers, primer sequences, Tm values, amplicon lengths, amplification efficiencies, R2 values, and KS-test p values are listed in Table 1.

Figure 1

Distribution of Cq values of 13 candidate reference genes in all experimental samples. The boxplot shows the maximum and minimum values, medians, and 25th/75th percentiles.

Expression stability of candidate reference genes in different tissues and under hormone treatment of Santalum album

According to geNorm analysis (Table 2), the M values of all candidate reference genes tested were below 1.5, indicating that they all had relatively stable expression. Among all four tissues tested (Table 2), FAB1A and PPR were the most stable genes, while Fbp1 was the least stable gene. For salicylic acid (SA) treatment (Table 2), FAB1A and Fbp3 were the most stable reference genes. For jasmonic acid methyl ester (MeJA) treatment (Table 2), PP2C and CSA were the top ranked genes. PP2C and CCS1 ranked as the most stable reference genes for gibberellin (GA) treatment (Table 2). PPR and Fbp2 (Table 2) were the most stable reference genes in all three hormone treatment sample sets. As for the total experimental samples (Table 2), HLMt and PPR were the most stable reference genes.
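As a rough illustration of what the M value measures, the following sketch computes, for each gene, the average standard deviation of its log2 expression ratio against every other candidate across samples. It assumes 100% amplification efficiency (so the log2 ratio equals a Cq difference) and omits the stepwise exclusion of the least stable gene that geNorm performs. The gene names and Cq values are hypothetical.

```python
import statistics


def genorm_m(cq_by_gene):
    """Single-pass geNorm-style M value per gene.

    cq_by_gene maps a gene name to its Cq values, one per sample,
    with samples in the same order for every gene.
    """
    m_values = {}
    for gene_j, cq_j in cq_by_gene.items():
        pairwise_sds = []
        for gene_k, cq_k in cq_by_gene.items():
            if gene_k == gene_j:
                continue
            # log2(expression_j / expression_k) per sample, assuming E = 2.
            log2_ratios = [ck - cj for cj, ck in zip(cq_j, cq_k)]
            pairwise_sds.append(statistics.stdev(log2_ratios))
        m_values[gene_j] = statistics.mean(pairwise_sds)
    return m_values


# Hypothetical Cq values for three candidate genes in four samples.
cq_table = {
    "gene_a": [20.0, 21.0, 20.0, 21.0],
    "gene_b": [22.0, 23.0, 22.0, 23.0],  # co-varies perfectly with gene_a
    "gene_c": [25.0, 22.0, 27.0, 23.0],  # varies independently
}

m = genorm_m(cq_table)
for gene, value in sorted(m.items(), key=lambda item: item[1]):
    print(f"{gene}: M = {value:.3f}")
```

A lower M indicates more stable expression relative to the other candidates; here the independently varying gene_c receives the highest M.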

Table 2 Expression stability of 13 candidate reference genes calculated by geNorm, NormFinder, BestKeeper, Delta Ct and RefFinder.

The geNorm program was also used to determine the optimal number of reference genes for normalizing RT-qPCR data by calculating the pairwise variations Vn/Vn+1. As shown in Fig. 2, the value of V2/3 was always below the cut-off value of 0.15 in the different tissue samples and in samples from all of the hormone treatments, indicating that the two most stable reference genes were sufficient to normalize expression data in these experiments. In the total experimental samples, the three most stable reference genes were ideal for normalizing RT-qPCR data since the value of V3/4 (0.127) was below the cut-off value of 0.15.
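The pairwise variation can be sketched as the standard deviation, across samples, of the log2 ratio between the normalization factor built from the n most stable genes and the one built from n + 1 genes. The sketch below assumes 100% efficiency, so that the log2 of a geometric-mean normalization factor is the negative mean Cq up to a constant. The Cq values are hypothetical, and `optimal_gene_number` is an illustrative helper, not part of geNorm.

```python
import statistics


def pairwise_variation(ranked_cq, n):
    """V(n/n+1) for genes ordered from most to least stable.

    ranked_cq is a list of per-gene Cq lists (same sample order).
    """
    n_samples = len(ranked_cq[0])

    def log2_nf(k):
        # log2 of the geometric-mean normalization factor of the top k genes.
        return [-statistics.mean([gene[s] for gene in ranked_cq[:k]])
                for s in range(n_samples)]

    nf_n, nf_next = log2_nf(n), log2_nf(n + 1)
    return statistics.stdev([a - b for a, b in zip(nf_n, nf_next)])


def optimal_gene_number(ranked_cq, cutoff=0.15):
    """Smallest n (>= 2) whose V(n/n+1) is below the cut-off, else None."""
    for n in range(2, len(ranked_cq)):
        if pairwise_variation(ranked_cq, n) < cutoff:
            return n
    return None


# Hypothetical Cq values, genes already ranked from most to least stable.
ranked = [
    [20.0, 21.0, 20.0, 21.0],
    [22.0, 23.0, 22.0, 23.0],
    [24.0, 25.0, 24.3, 25.1],
    [25.0, 22.0, 27.0, 23.0],
]

v2_3 = pairwise_variation(ranked, 2)
v3_4 = pairwise_variation(ranked, 3)
print(f"V2/3 = {v2_3:.3f}, V3/4 = {v3_4:.3f}")
print("optimal number of reference genes:", optimal_gene_number(ranked))
```

In this toy data set adding the third gene barely changes the normalization factor, so two genes suffice, whereas adding the unstable fourth gene changes it substantially.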

Figure 2

Pairwise variation (V) analysis of 13 selected reference genes using geNorm software. The pairwise variations Vn/Vn+1 were calculated by geNorm in different tissues and in hormone treatment samples.

The results calculated with NormFinder (Table 2) show that PP2C followed by FAB1A were the most stable genes in all tested tissues, while Fbp1 was again considered to be a weakly stable gene in this sample set. For SA treatment, PPR and Fbp1 were the most stable reference genes. CSA and Fbp3 were the top ranked reference genes in MeJA treatment samples. Fbp2 and UK were the most stable reference genes for GA treatment. Fbp2 and PPR were the most highly ranked reference genes in all three hormone treatment samples. When assessing the total experimental samples, HLMt and PPR were the most stably expressed genes.

As shown in Table 2, when evaluated by the BestKeeper program, ACT followed by PPR were the most stable genes and UK was considered to be the least stable gene in the four tissue samples tested. ODD and Fbp1 for the SA treatment, Fbp3 and CSA for the MeJA treatment, CCS1 and PP2C for the GA treatment, as well as UK and Fbp1 for all three hormone treatments were the most stable reference genes. As for the total experimental samples, CCS1 and FAB1A were the top stably expressed reference genes.

According to the ranking orders generated by Delta Ct (Table 2), PP2C and FAB1A were the most stable genes and Fbp1 was the least stable gene across the four tissues tested. As for hormone treatment, ODD and Fbp1 for SA treatment, CSA and Fbp3 for MeJA treatment, Fbp2 and PP2C for GA treatment, and Fbp1 followed by Fbp2 for all three hormone treatments were the top ranking reference genes. HLMt and PPR were the most stable reference genes for the total experimental samples.

Finally, RefFinder was used to comprehensively validate the stability of candidate reference genes. According to the results determined by RefFinder (Table 2) and geNorm (Fig. 2), the combination of FAB1A and PP2C for all four tissues tested, ODD and Fbp1 for SA treatment, CSA and Fbp3 for MeJA treatment, PP2C and Fbp2 for GA treatment, as well as Fbp1 and Fbp2 for all three hormone treatments combined were the most suitable reference genes. As for all of the experimental samples, the most suitable reference genes were the combination of HLMt, PPR and FAB1A.
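RefFinder is generally described as combining the rankings from the individual algorithms by taking, for each gene, the geometric mean of its ranks. The following minimal sketch illustrates that idea under this assumption; the ranks are hypothetical, are not taken from Table 2, and the tool's exact weighting may differ.

```python
import math


def comprehensive_ranking(ranks_by_method):
    """Order genes by the geometric mean of their ranks (1 = most stable).

    ranks_by_method maps a method name to a {gene: rank} dictionary.
    Returns a list of (gene, geometric_mean_rank), most stable first.
    """
    methods = list(ranks_by_method)
    genes = list(ranks_by_method[methods[0]])
    scores = {}
    for gene in genes:
        ranks = [ranks_by_method[method][gene] for method in methods]
        scores[gene] = math.exp(sum(math.log(r) for r in ranks) / len(ranks))
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical ranks from four algorithms for three genes.
example_ranks = {
    "geNorm":     {"gene_a": 1, "gene_b": 2, "gene_c": 3},
    "NormFinder": {"gene_a": 2, "gene_b": 1, "gene_c": 3},
    "BestKeeper": {"gene_a": 1, "gene_b": 3, "gene_c": 2},
    "Delta Ct":   {"gene_a": 1, "gene_b": 2, "gene_c": 3},
}

ranking = comprehensive_ranking(example_ranks)
for gene, score in ranking:
    print(f"{gene}: geometric mean rank = {score:.2f}")
```

The geometric mean dampens the influence of a single discordant algorithm, which is one reason a gene ranked first by only one method need not top the comprehensive ranking.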

Moreover, we also verified the stability of candidate reference genes in specific tissues and different tissue combinations using RefFinder. According to the comprehensive ranking recommended by RefFinder (Fig. 3), PPR and Fbp3 were the most stable reference genes in leaves, Fbp2 and Fbp3 were the most stable genes in roots, and Fbp3, CCS1 and CSA were the most stable genes in callus. In stems as well as the combination of leaf, stem and root (LSR), PP2C and PPR were the most stable reference genes. Fbp3 was also among the most stable reference genes in LSR.

Figure 3

Comprehensive expression stability of 13 selected reference genes recommended by RefFinder in specific tissues and different tissue combinations.

Validation of identified reference genes in different tissues and under MeJA treatment

To validate the identified reference genes, the transcript profile of a key gene (SaSSy) was normalized with the selected reference genes and evaluated in the three tissues and under MeJA treatment.

As shown in Fig. 4A, the expression level of the SaSSy gene was similar in the three tested tissues when the two most stable reference genes (FAB1A and PP2C) were used to normalize RT-qPCR data. The combination of FAB1A + PP2C provides more accurate expression values for each tissue than a single reference gene. Each of the reference genes (FAB1A, PP2C, ACT and Fbp1) and the gene combination (FAB1A + PP2C) used for normalization gave a similar trend of SaSSy expression (leaf < root < stem), which was comparable with the RNA-seq result. However, when the least stable reference gene Fbp1 was used to normalize RT-qPCR data, the expression level of SaSSy was clearly over-estimated in the tested tissues. Statistical analysis showed no significant difference in roots when data were normalized by Fbp1, which was inconsistent with the results obtained with the more stable reference genes or their combination. These over-estimated results, especially in roots, did not match the RNA-seq results. Furthermore, the SaSSy expression level was considerably reduced in all tissues when the traditional housekeeping gene (ACT) was used for normalization.

Figure 4

Relative expression levels of the SaSSy gene normalized by a validated reference gene alone or in combination in different tissues (A) and under MeJA treatment (B) of Santalum album. Bars indicate standard deviation calculated from three biological replicates. Asterisks indicate significance at P < 0.05 (*) or P < 0.01 (**) using Duncan’s multiple range test.

Under MeJA treatment, the expression of SaSSy at 3 h was about 2.0 times higher than in the untreated control (0 h) when normalized by the most stable reference genes (CSA, Fbp3) or their combination (CSA + Fbp3), whereas it was more than 2.8 times higher when the least stable gene FAB1A was used. SaSSy expression at 6 h was about 1.4 times higher than at 0 h using the best reference genes (CSA, Fbp3, CSA + Fbp3). However, the least stable reference gene FAB1A produced a clearly lower and statistically non-significant change of 1.15 times (Fig. 4B). This demonstrates the obvious effect of using different reference genes for normalization.
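The sensitivity of fold-change estimates to the choice of reference gene can be illustrated with a 2^(-ΔΔCq) calculation. The text does not specify the quantification formula used, so the sketch below assumes 100% amplification efficiency and normalizes against the mean Cq of the reference genes (equivalent to the geometric mean of their relative quantities). All Cq values are hypothetical and are chosen only to show how an unstable reference shifts the result.

```python
def relative_expression(target_cq_treated, target_cq_control,
                        ref_cqs_treated, ref_cqs_control):
    """Fold change by 2^(-ddCq), normalized to one or more reference genes.

    ref_cqs_* are lists of reference-gene Cq values for the same sample.
    """
    ref_treated = sum(ref_cqs_treated) / len(ref_cqs_treated)
    ref_control = sum(ref_cqs_control) / len(ref_cqs_control)
    delta_treated = target_cq_treated - ref_treated
    delta_control = target_cq_control - ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Cq values: the target gains one cycle after treatment.
target_control, target_treated = 25.0, 24.0

# Two stable references whose Cq does not change with treatment.
fold_stable = relative_expression(
    target_treated, target_control,
    ref_cqs_treated=[20.0, 22.0], ref_cqs_control=[20.0, 22.0],
)

# An unstable reference whose own Cq rises by 0.5 cycles after treatment.
fold_unstable = relative_expression(
    target_treated, target_control,
    ref_cqs_treated=[21.5], ref_cqs_control=[21.0],
)

print(f"stable references:  {fold_stable:.2f}-fold")
print(f"unstable reference: {fold_unstable:.2f}-fold")
```

Because the reference Cq enters the exponent directly, a half-cycle drift in the reference gene alone is enough to inflate the apparent fold change by about 40%.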