Introduction

Atractylodes rhizome (Atractylodis rhizoma, “Byakujutsu” in Japanese) and Atractylodes lancea rhizome (Atractylodis lanceae rhizoma, “Sojutsu” in Japanese) are often used as crude drugs in traditional Chinese medicine (TCM) and traditional Japanese medicine (Kampo). These plants, belonging to the genus Atractylodes and the family Compositae, are perennial herbs distributed in East Asia [4c). The buckets of sucrose (3–5 ppm), including the anomeric proton at 5.18 ppm, were spread on the far right side (PC1 positive value). The buckets defined as the sesquiterpene-2 group were found in the third quadrant, where the characteristic bucket from the protons of the dimethyl group at δ 1.16 ppm (s) was also found. Secondary metabolite components from the essential oil of Atractylodes lancea rhizome contributed to the grouping of samples from A. lancea. To examine this model further, the score and loading plots of PC1 and PC3 were studied, which showed that sesquiterpene-1 and -2 were not clustered on the loading plot. The PC1 and PC3 data are shown in the Electronic supplementary material (ESM) (Figures S1, S2, and S3, vide infra). Unfortunately, these PCA models could not differentiate the plant species because of the confounding resonances of sucrose (3–5 ppm). As a second step, PCA was performed using the 1H NMR spectra of compounds extracted with a low-polarity solvent to prevent detection of sucrose and allow the delineation of the plant species.

Fig. 4

Principal component analysis (PCA) of the 1H NMR spectra of CD3OD extracts of different Atractylodes samples. a 3D score plot of PC1, PC2, and PC3; b score plot of PC1 and PC2; c loading plot for PC1 and PC2

Principal component analysis using 1H NMR spectra of CDCl3 extracts (CDCl3 model)

PCA of the 1H NMR spectra of the CDCl3 extracts of Atractylodes samples, which are unaffected by primary metabolites, was performed using bucket integral values every 0.04 ppm in the range of 0.00–10.00 ppm. The results showed three significant principal components cumulatively accounting for 82% of the total variance (PC1 = 50%, PC2 = 21%, PC3 = 11%), which are presented in the 3D score plot in Fig. 5a. This plot shows three major, distinct clusters corresponding to the different species studied. The samples of A. koreana (purple) were clustered in one group, unlike in the CD3OD model. The score plot of PC1/PC2 was analyzed to understand the details of the clustering (Fig. 5b). Plots of A. koreana were found in the second quadrant. The PC1 and PC2 loading plots were studied further to clarify potential metabolic markers contributing to the discrimination of the different species (Fig. 5c). In the fourth quadrant, the buckets at 0.74 and 0.78 ppm were confirmed to correspond to the proton at C14 (0.76 ppm, s) in atractylon, which is a specific compound found in Atractylodes rhizome. In contrast, the third quadrant contained buckets derived from protons at 6.11, 6.33, 6.37, 6.42, 6.79, and 7.38 ppm, which are considered to be inherent to atractylodin from A. koreana (shown in purple). The buckets of the sesquiterpene-2 group derived from Atractylodes lancea rhizome were also located in the fourth quadrant. In addition, the bucket of the dimethyl group protons at 1.22 ppm (s) was found to deviate from the center. Thus, the species could be distinguished on the basis of the PCA plots by detecting known, base-like compounds by NMR and performing multivariate analysis (PCA).
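As a rough illustration of how the explained variance and the score and loading coordinates discussed above can be obtained from a bucket table, the following sketch runs PCA by singular value decomposition in Python with NumPy. The function names, the random stand-in data, and the use of mean-centering alone are assumptions made for illustration; they are not the software or settings used in this study.

```python
import numpy as np

def pca_svd(table, n_components=3):
    """PCA of a samples x buckets table via SVD (illustrative sketch)."""
    centered = table - table.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)            # fraction of total variance per PC
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].T             # one row per bucket
    return scores, loadings, explained[:n_components]

def quadrant(pc1, pc2):
    """Quadrant (1-4) of a point on a PC1/PC2 plot, counted counterclockwise."""
    if pc1 >= 0:
        return 1 if pc2 >= 0 else 4
    return 2 if pc2 >= 0 else 3

# Hypothetical data: 15 samples x 250 buckets of random numbers.
rng = np.random.default_rng(0)
table = rng.random((15, 250))
scores, loadings, explained = pca_svd(table)
print("cumulative variance of PC1-PC3:", explained.sum())
print("quadrant of first sample:", quadrant(scores[0, 0], scores[0, 1]))
```

The sign of each principal component is arbitrary, so a quadrant is meaningful only when the score and loading plots come from the same fit.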

Fig. 5

Principal component analysis (PCA) of the 1H NMR spectra of CDCl3 extracts of different Atractylodes samples. a 3D score plot of PC1, PC2, and PC3; b score plot of PC1 and PC2; c loading plot for PC1 and PC2

Principal component analysis excluding sugar-based buckets in 1H NMR spectra of CD3OD extracts (CD3OD sugar KO model)

Using deuterated chloroform made it possible to distinguish the species without the influence of highly water-soluble primary metabolites, especially sugars (sucrose in this instance). Therefore, PCA was also performed on the 1H NMR spectra of the CD3OD extracts with the sugar region (3.30 to 5.40 ppm) excluded; the PCA derived from this reduced NMR dataset was in general agreement with the PCA using CDCl3 extracts. The results showed three significant principal components cumulatively accounting for 83% of the total variance. The 3D score plot showed major distinct clusters corresponding to the five different species (Fig. 6a; the contributions of PC1, PC2, and PC3 were 46%, 19%, and 18%, respectively). In particular, samples of A. koreana (purple) formed a cluster similar to that from the CDCl3 extracts. The PC1/PC2 loading plot was used to verify group formation on the score plot (Fig. 6c). Many sesquiterpene-2-group buckets, such as that of the dimethyl group resonances of hinesol and β-eudesmol at 1.16 ppm, appeared in the fourth quadrant, and their clustering supported the grouping of A. lancea on the score plot. In contrast, the sesquiterpene-1 group (green) bucket was positioned in the second quadrant of the loading plot, which was in agreement with the distribution of the A. japonica group on the score plot. The A. koreana samples were distinguished as one group by the presence of the bucket corresponding to the resonance of atractylodin.

Fig. 6

Principal component analysis (PCA) of the 1H NMR spectra of CD3OD extracts of different Atractylodes samples after removal of the sugar-based region (δ 3.30–5.40). a 3D score plot of PC1, PC2, and PC3; b score plot of PC1 and PC2; c loading plot for PC1 and PC2

Thus, by modifying the calculation method, it was possible to fully utilize the performance of 1H NMR metabolic profiling without changing the extraction solvent. In this method, highly water-soluble primary metabolites, such as saccharides, are excluded from the calculation to obtain characteristic metabolic fingerprints from the Atractylodes plants. On the other hand, this knockout method also excludes signals corresponding to glycosides. In this case, however, the KO model showed good agreement with the CDCl3 model for the discrimination of Atractylodes species, presumably because sucrose was the main component in the knocked-out region.
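The knockout step described above amounts to dropping columns from the bucket table before PCA. A minimal sketch in Python with NumPy follows; the function name, the overlap rule for buckets that straddle a boundary, and the rescaling of the remaining buckets to a total of 100 are all assumptions made for illustration and may differ from what the analysis software does.

```python
import numpy as np

def knock_out_region(centers, table, lo=3.30, hi=5.40, width=0.04,
                     renormalize=True):
    """Drop every bucket that overlaps [lo, hi] ppm from a bucket table."""
    eps = 1e-9  # tolerance so buckets that only touch a boundary are kept
    overlaps = ((centers + width / 2 > lo + eps)
                & (centers - width / 2 < hi - eps))
    reduced = table[:, ~overlaps]
    if renormalize:
        # Assumption: rescale each spectrum so the remaining buckets sum to 100.
        reduced = reduced * (100.0 / reduced.sum(axis=1, keepdims=True))
    return centers[~overlaps], reduced

# Hypothetical table: 2 spectra x 250 buckets of 0.04 ppm covering 0-10 ppm.
centers = 0.02 + 0.04 * np.arange(250)
table = np.ones((2, 250))
kept_centers, reduced = knock_out_region(centers, table)
print(len(kept_centers), "buckets kept")
```

Because whole buckets are removed, any glycoside signal that falls inside the same window is lost together with the sugar signals, as noted above.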

Comparison between three models using calculation parameters

The merit of NMR metabolic profiling is that the classification of plant species can be performed without specifying the index components. We successfully established a suitable method for 1H NMR metabolic profiling of Atractylodes plants by selecting a bucket according to the desired parameters and performing statistical calculations with the same solvent.

The multiple correlation coefficient (R2) and predictive ability parameter (Q2) of each of the three PCA models, namely CD3OD extraction, CDCl3 extraction, and CD3OD extraction with knockout of the sugar region (CD3OD sugar KO), are shown in Table 4. Two-way orthogonal partial least squares discriminant analysis (O2PLS-DA) was performed on the three models with the classes of Atractylodes species added as objective variables; the R2 and Q2 values of the resulting models were checked, and a permutation test was performed. R2 and Q2 values indicate the model's linearity and predictive ability, respectively. A discrimination model with an R2 value of > 0.65 and a Q2 value of > 0.5 is regarded as adequate for discrimination [31].
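For orientation, R2 and Q2 are commonly computed as one minus a residual sum of squares divided by the total sum of squares, with fitted values for R2 and cross-validated predictions for Q2. The sketch below shows this in Python with NumPy; the function names and the toy numbers are hypothetical, and the exact definitions used by the software in this study may differ in detail.

```python
import numpy as np

def r2_q2_from_predictions(y, y_fit, y_cv):
    """R2 from fitted values and Q2 from cross-validated predictions."""
    y = np.asarray(y, dtype=float)
    tss = np.sum((y - y.mean()) ** 2)                       # total sum of squares
    r2 = 1 - np.sum((y - np.asarray(y_fit)) ** 2) / tss     # goodness of fit
    q2 = 1 - np.sum((y - np.asarray(y_cv)) ** 2) / tss      # uses held-out errors
    return r2, q2

def is_adequate(r2, q2, r2_min=0.65, q2_min=0.5):
    """Apply the adequacy thresholds quoted in the text."""
    return bool(r2 > r2_min and q2 > q2_min)

# Hypothetical class labels and predictions for four samples.
y = [0, 0, 1, 1]
y_fit = [0.1, 0.0, 0.9, 1.0]   # predictions from the model fitted to all samples
y_cv = [0.2, 0.1, 0.7, 0.8]    # predictions made with each sample held out
r2, q2 = r2_q2_from_predictions(y, y_fit, y_cv)
print(round(r2, 2), round(q2, 2), is_adequate(r2, q2))
```

Q2 cannot exceed R2 by much in practice, and a large gap between the two is one sign that a model fits noise.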

Table 4 Estimation index results for all the discrimination models

A permutation test is used to check a predictive model for overfitting [32, 33]. In this test, provisional discrimination models were constructed from data matrices in which the objective and explanatory variables were randomly recombined many times, and R2 and Q2 were calculated for each provisional model. The correlation coefficients between the original and permuted data matrices were plotted on the x axis, and the R2 and Q2 values on the y axis. The y intercepts of the regression lines in the plot are used as the estimated index for overfitting: generally, a model is not considered overfitted when the intercepts are R2 < 0.4 and Q2 < 0.05 [32, 33]. Table 4 shows the results of each estimated index for all the discrimination models.
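The permutation procedure can be sketched as follows in Python with NumPy. This is an illustration under several assumptions and not the implementation used in this study: a ridge regression stands in for the O2PLS-DA model, seven-fold cross-validation is assumed for Q2, the absolute correlation between the original and permuted labels is used as the x value, and the two-class data are simulated.

```python
import numpy as np

def ridge_predict(x_train, y_train, x_test, ridge=1.0):
    """Ridge regression in dual form; a stand-in for the discriminant model."""
    mu_x, mu_y = x_train.mean(axis=0), y_train.mean()
    xc = x_train - mu_x
    alpha = np.linalg.solve(xc @ xc.T + ridge * np.eye(len(y_train)),
                            y_train - mu_y)
    return (x_test - mu_x) @ xc.T @ alpha + mu_y

def cross_validated_r2_q2(x, y, n_folds=7):
    """R2 from the fit to all samples, Q2 from k-fold cross-validation."""
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1 - np.sum((y - ridge_predict(x, y, x)) ** 2) / tss
    folds = np.arange(len(y)) % n_folds
    y_cv = np.empty(len(y))
    for f in range(n_folds):
        held_out = folds == f
        y_cv[held_out] = ridge_predict(x[~held_out], y[~held_out], x[held_out])
    q2 = 1 - np.sum((y - y_cv) ** 2) / tss
    return r2, q2

def permutation_intercepts(x, y, n_perm=100, seed=0):
    """Regress R2 and Q2 on label correlation; return the two y intercepts."""
    rng = np.random.default_rng(seed)
    r2, q2 = cross_validated_r2_q2(x, y)
    corr, r2s, q2s = [1.0], [r2], [q2]      # the unpermuted model sits at x = 1
    for _ in range(n_perm):
        y_perm = rng.permutation(y)         # break the link between x and y
        r2, q2 = cross_validated_r2_q2(x, y_perm)
        corr.append(abs(np.corrcoef(y, y_perm)[0, 1]))
        r2s.append(r2)
        q2s.append(q2)
    # np.polyfit returns [slope, intercept] for a first-degree fit.
    return np.polyfit(corr, r2s, 1)[1], np.polyfit(corr, q2s, 1)[1]

# Simulated data: 20 samples, 50 buckets, 10 of which separate two classes.
rng = np.random.default_rng(1)
y = np.repeat([0.0, 1.0], 10)
x = rng.normal(size=(20, 50))
x[:, :10] += 4.0 * y[:, None]
r2_int, q2_int = permutation_intercepts(x, y)
print("R2 intercept:", r2_int, "Q2 intercept:", q2_int)
```

The intercepts estimate the R2 and Q2 that a model would reach with labels unrelated to the spectra, which is why low values are read as evidence against overfitting.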

The R2Y value was essentially the same in all the models, whereas the Q2Y value indicated a good fit for the CD3OD sugar KO model. This suggested that the CD3OD sugar KO model could be used as the discrimination model. The permutation test showed that the CD3OD sugar KO model was suitable for predicting A. japonica and A. macrocephala.

Conclusion

In conclusion, we performed 1H NMR metabolic profiling of DNA-authenticated, archived rhizomes of the Atractylodes genus for genetic and chemical quality evaluation. The nucleotide sequence of the ITS region of the nuclear rDNA was confirmed for five species: A. japonica, A. macrocephala, A. lancea, A. chinensis, and A. koreana. An unbiased approach using multivariate statistical analysis of 1H NMR spectra of CD3OD extracts was adopted to reveal compositional differences in the primary and secondary metabolites among Atractylodes species; however, these plant species could not be discriminated under this condition. Therefore, we prepared analytical samples by CDCl3 extraction, with which each plant species clustered on the PCA score plot and species-specific compounds were detected. PCA of the 1H NMR spectra of the CD3OD extracts after removal of the sugar peaks gave the same results as the PCA using CDCl3 extracts. This biased chemometric model successfully discriminated these plant species. The present study revealed that 1H NMR-based metabolic profiling and genetic assessment are useful to identify members of the Atractylodes genus, which are categorized as different drugs in the Japanese Pharmacopoeia.

Experimental section

General experimental procedures

Polymerase chain reactions (PCRs) were performed with Ex Taq DNA polymerase (Takara, Kyoto, Japan). Sequencing reactions were carried out with BigDye Terminator v3.1 Cycle Sequencing Kits (Applied Biosystems, CA, USA), and the amplicons were electrophoresed on an ABI 3130 Genetic Analyzer (Applied Biosystems, CA, USA). All ground Atractylodes samples were extracted with an Accelerated Solvent Extraction system (ASE 350) from Dionex Corporation (Sunnyvale, CA, USA). The extracts were dried using a Thermo-Fisher Savant SC250EXP SpeedVac™ equipped with an RVT4104 refrigerated vapor trap. Freeze-drying was performed on a Labconco Freezone 4.5 (Kansas City, MO, USA). A precise Mettler Toledo XS105 dual range analytical balance was employed to prepare extracts for UHPLC and quantitative 1H NMR (qHNMR) analyses. Samples for NMR analyses were prepared with a Pressure-Lock gas syringe (VICI Precision Sampling Inc., Baton Rouge, LA, USA) and calibrated glass pipets (cat. no: 2-000-200, Drummond Scientific, Broomall, PA, USA). Standard NMR tubes of 3 mm × 7 in. were purchased from Shigemi Co., Ltd. (no. PS005, Hachioji-city, Tokyo, Japan). 1H NMR spectra were recorded on an Agilent Technologies 400-MR (400 MHz).

Reagents

Atractylodes standards, β-eudesmol, and atractylenolide III were obtained from Fujifilm Wako Pure Chemical Co. (Osaka, Japan). For NMR acquisition, CD3OD-d4 (99.8% D) and CDCl3 (99.8% D) were purchased from Merck KGaA (Darmstadt, Germany). The signals in the 1H NMR spectra of Atractylodes extracts were assigned to individual metabolites on the basis of thorough analyses of the 2D NMR spectra and spiking experiments. The PCR-grade tubes, tips, and most biological reagents used for DNA authentication were acquired from Qiagen (Valencia, CA, USA) and/or Thermo-Fisher Scientific and Beckman Coulter (Indianapolis, IN, USA). LO3 agarose for gel electrophoresis was purchased from Takara (Kyoto, Japan).

Plant material

Fifteen specimens of five Atractylodes species were analyzed: A. japonica, A. macrocephala, A. lancea, A. chinensis, and A. koreana. Details of the plant materials are shown in Table 1. Samples were botanically/macroscopically verified prior to inclusion. All the voucher specimens were deposited in the Medicinal Plant Garden, School of Pharmacy, Kitasato University, Kanagawa, Japan.

DNA extraction, PCR amplification, and sequencing

Total DNA was extracted from 100–200 mg of leaf tissue using hexadecyltrimethylammonium bromide (CTAB) solution following the method of Doyle [26] with minor modifications. The primer pair ITS5 (5′-GGA AGT AAA AGT CGT AAC AAG G-3′) and ITS4 (5′-TCC TCC GCT TAT TGA TAT GC-3′) [27] was used to amplify the ITS region of nrDNA. PCR amplification was performed in a 50-μL reaction volume containing 1 × reaction buffer for Ex Taq DNA polymerase, 0.2 mM of each dNTP, 0.2 μM of each primer, 0.5 units of Ex Taq DNA polymerase (Takara, Kyoto, Japan), and approximately 10 ng of template DNA. PCR was performed under the following cycling conditions: (95 °C, 3 min) × 1 cycle, (95 °C, 1 min; 52 °C, 1 min; 72 °C, 1 min 30 s) × 30 cycles, and (72 °C, 8 min) × 1 cycle. Since it is known that there are additive nucleotides in the ITS sequence of A. japonica [12], single strand conformation polymorphism (SSCP) analysis with the first PCR products was carried out to isolate individual alleles, following the method of Watano et al. [28]. Segregated bands were excised and purified before use as template DNA for the second PCR. The PCR products were purified with Amicon® Ultra Centrifugal Filter Units (Merck, Germany). Sequencing reactions using purified PCR products were carried out with BigDye Terminator v3.1 Cycle Sequencing Kits (Applied Biosystems, CA, USA) under the following cycling conditions: (96 °C, 1 min) × 1 cycle, (96 °C, 10 s; 50 °C, 5 s; 60 °C, 3 min 30 s) × 40 cycles, and (60 °C, 5 min) × 1 cycle. A specific sequence primer, AJITSF3 (5′-CCG CGA ACA TGT AAT GAC AAC CGG GC-3′) or AJITSR3 (5′-AAG CGT CGT CGC GAG GCG AC-3′) was used to avoid non-specific annealing. The reaction products were analyzed using the ABI3130 Genetic Analyzer (Applied Biosystems, CA, USA).
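As a small worked example, the cycling conditions above can be written as data and their nominal hold times summed. The sketch below is in Python; the representation and function name are hypothetical, and ramping time between temperatures is ignored, so actual run times on an instrument will be longer.

```python
def program_seconds(stages):
    """Sum hold times for stages given as (hold times in seconds, cycles)."""
    return sum(sum(holds) * cycles for holds, cycles in stages)

# PCR: 95 C 3 min; 30 x (95 C 1 min, 52 C 1 min, 72 C 1 min 30 s); 72 C 8 min.
pcr = [((180,), 1), ((60, 60, 90), 30), ((480,), 1)]
# Sequencing: 96 C 1 min; 40 x (96 C 10 s, 50 C 5 s, 60 C 3 min 30 s); 60 C 5 min.
sequencing = [((60,), 1), ((10, 5, 210), 40), ((300,), 1)]

pcr_minutes = program_seconds(pcr) / 60
sequencing_minutes = program_seconds(sequencing) / 60
print(pcr_minutes, sequencing_minutes)
```

With these values the hold times come to 116 min for the PCR and 156 min for the sequencing reaction.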

Phylogenetic analysis

Sequence data were edited and aligned using BioEdit [29]. Phylogenetic analysis by three different methods (NJ, MP, and ML) was performed using PAUP 4.0b10 [30]. Bootstrap analysis of 1000 replicates was conducted for the NJ and MP trees.

Sample extraction and preparation

Each Atractylodes plant sample was extracted using 0.1 mg of material in 1.0 mL of CD3OD-d4 (99.8% D) or CDCl3 (99.8% D) by ultrasonication at room temperature for 1 h. The mixture was centrifuged at 3,000 rpm (Kubota 3740, Japan) for 5 min. The supernatants were filtered through an Ekicrodisc® 30 mm syringe membrane filter (0.45-μm pore size) and transferred into 3-mm standard NMR tubes.

1H NMR acquisition and 1H NMR multivariate data analysis

The 1D 1H NMR spectra were acquired at 298 K using a 45° excitation pulse experiment (Bruker pulprog: zg). The probe was frequency-tuned and impedance-matched before each acquisition. For each sample, 64 scans (ns) and 4 dummy scans (ds) were recorded with the following parameters: spectral width of 16 ppm, relaxation delay (D1) of 3.0 s, and receiver gain (RG) set to 256. The total duration of each 1H NMR acquisition was 15 min. Off-line data processing was performed using MNova Lite (Mestrelab Research, S.L.) and ALICE2 software (JEOL). The 1H NMR spectra were automatically Fourier-transformed using ALICE2 software (JEOL). The spectra were referenced to CH3OH at 3.30 ppm and CHCl3 at 7.26 ppm. Spectral intensities were reduced to integrated regions, referred to as buckets, of equal width (0.04 ppm) within the region of δ 10.00 to 0.00 ppm using ALICE2 for metabolome version 6 software (JEOL). In the case of CD3OD, the region between δ 5.0 and 4.6, corresponding to residual water signals, was removed. The total integral value of each spectrum was set to 100 to provide the bucket tables. In the resulting bucket tables, all rows were scaled to the total intensity, and Pareto scaling was applied to the columns before PCA and O2PLS-DA using SIMCA software. In a separate experiment, the regions between 4.00 and 3.00 ppm, corresponding to the sugar region, and the sucrose anomeric resonance at 5.169 ppm (doublet, J = 3.75 Hz) were removed from the bucket table.
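The bucketing and scaling steps described above can be sketched in Python with NumPy as follows. This is a minimal illustration and not the ALICE2 or SIMCA implementation: the function names are hypothetical, summing data points per bucket stands in for integration, and normalizing to 100 after removing the water region is an assumption about the order of operations.

```python
import numpy as np

def bucket_row(ppm, intensity, lo=0.0, hi=10.0, width=0.04,
               exclude=((4.6, 5.0),)):
    """Reduce one spectrum to equal-width buckets that sum to 100."""
    n_buckets = int(round((hi - lo) / width))
    edges = np.linspace(lo, hi, n_buckets + 1)
    centers = (edges[:-1] + edges[1:]) / 2
    # Summing the points in each bucket is a simple stand-in for integration.
    values, _ = np.histogram(ppm, bins=edges, weights=intensity)
    keep = np.ones(n_buckets, dtype=bool)
    for start, stop in exclude:            # e.g. the residual water region
        keep &= ~((centers > start) & (centers < stop))
    values = values[keep]
    return centers[keep], 100.0 * values / values.sum()

def pareto_scale(table):
    """Mean-center each column and divide by the square root of its SD."""
    centered = table - table.mean(axis=0)
    sd = table.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0                      # leave constant columns unscaled
    return centered / np.sqrt(sd)

# Hypothetical spectra: flat baselines with one peak of varying height.
ppm = np.linspace(10.0, 0.0, 20000, endpoint=False)
rows = []
for height in (5.0, 10.0, 20.0):
    intensity = 1.0 + height * np.exp(-((ppm - 1.16) / 0.01) ** 2)
    centers, values = bucket_row(ppm, intensity)
    rows.append(values)
table = np.vstack(rows)
scaled = pareto_scale(table)
print(table.shape, scaled.shape)
```

Pareto scaling reduces the dominance of intense signals less aggressively than unit-variance scaling, which keeps the loadings closer to the shape of the original spectra.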