Background

Esophageal cancer is the sixth leading cause of cancer-related deaths and the eighth most common cancer worldwide, with a higher prevalence in specific geographical locations and certain ethnicities [1, 2]. Esophageal squamous cell carcinoma (ESCC) accounts for almost 80% of all esophageal cancer cases worldwide, with particularly high incidence rates in Eastern Asia and several regions of Africa [3]. The average 5-year survival rate for ESCC varies between 10 and 41%.

To evaluate the diagnostic potential of the 18-miRNA panel, we employed a two-pronged strategy. First, within each dataset, we performed multivariate logistic regression with 2-fold cross-validation (repeated 100 times) to demonstrate the diagnostic performance of the signature (average AUC = 0.98, 0.99, 0.98 for TCGA, GSE55856, and GSE43732, respectively; Fig. 2). Second, we trained a multivariate logistic regression model on the GSE55856 dataset and then applied the same statistical model to all three datasets to further validate the diagnostic performance of the 18-miRNA signature (AUC = 0.99, 1.00, 0.99, respectively; Fig. S3).
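The first strategy above (logistic regression evaluated by repeated 2-fold cross-validation) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the gradient-descent fit, the stratified splitting, the function names, and the two-feature synthetic data are all assumptions made for the example.

```python
import math
import random

def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def repeated_two_fold_auc(X, y, repeats=100, seed=0):
    """Average test-fold AUC over `repeats` random 2-fold splits.

    Assumption: splits are stratified so both folds contain cases and controls.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, l in enumerate(y) if l == 1]
    neg_idx = [i for i, l in enumerate(y) if l == 0]
    aucs = []
    for _ in range(repeats):
        rng.shuffle(pos_idx)
        rng.shuffle(neg_idx)
        fold_a = pos_idx[::2] + neg_idx[::2]
        fold_b = pos_idx[1::2] + neg_idx[1::2]
        # Train on one half, test on the other, then swap roles.
        for train, test in ((fold_a, fold_b), (fold_b, fold_a)):
            w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
            scores = [b + sum(wj * xj for wj, xj in zip(w, X[i])) for i in test]
            aucs.append(auc(scores, [y[i] for i in test]))
    return sum(aucs) / len(aucs)

# Synthetic stand-in for expression data (hypothetical, two features only).
data_rng = random.Random(1)
X = [[data_rng.gauss(2.0, 1.0), data_rng.gauss(1.0, 1.0)] for _ in range(40)]
X += [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(40)]
y = [1] * 40 + [0] * 40
mean_auc = repeated_two_fold_auc(X, y, repeats=10)
print(f"mean cross-validated AUC: {mean_auc:.3f}")
```

Averaging over many random splits reduces the dependence of the AUC estimate on any single partition, which matters when each fold holds only half of the samples.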

Fig. 2

The diagnostic performance of the 18-miRNA signature for distinguishing cancer and normal tissues. Heatmaps for TCGA (A), GSE55856 (B) and GSE43732 (C), respectively. Heatmaps illustrate expression of the 18 candidate miRNAs in the three miRNA expression datasets. The upper panels show the risk probabilities derived from multivariate regression analysis with 2-fold cross-validation (repeated 100 times), and the right panels show the expression fold changes of the 18 candidate miRNAs. The ROC curves demonstrate that the 18-miRNA signature accurately distinguished cancer tissues from normal tissues in all three datasets (average AUC = 0.986, 0.993, 0.989 for TCGA (A), GSE55856 (B), and GSE43732 (C), respectively) and was superior to any single panel member. ROC curves are shown with 95% CI. The 95% CI of sensitivity and specificity for each panel member is also shown at the best threshold (calculated by Youden's index)

Next, to determine the functional significance of these candidate miRNAs, we constructed a miRNA–mRNA regulatory network based on experimentally validated miRNA–target interactions obtained from the miRTarBase database (V8). In total, we identified 393 targets of the 18 miRNAs that were differentially expressed between tumor and normal samples in the TCGA dataset, using the following criteria: |log2 fold-change| > 2 and BH-adjusted p < 0.01 (Fig. S4A, Table S2). As expected, these miRNA target genes were significantly enriched in cancer-related signaling pathways, such as epithelial-to-mesenchymal transition and angiogenesis pathways (Fig. S4B, Table S3). To verify that these 18 miRNAs are dysregulated in ESCC, we analyzed a cohort of 32 ESCC and 32 matched adjacent normal tissue specimens and confirmed the upregulation of all 18 miRNAs in ESCC (p < 0.05, paired Student's t-tests; Fig. S5), highlighting their diagnostic significance and biological relevance in esophageal cancer.
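The differential-expression filter described above (|log2 fold-change| > 2 and BH-adjusted p < 0.01) can be illustrated with a short sketch. The function names and the four example genes are hypothetical; only the two thresholds come from the text.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_de_genes(genes, log2_fc, pvalues, fc_cutoff=2.0, alpha=0.01):
    """Keep genes passing both the fold-change and the adjusted p-value cutoffs."""
    adj = bh_adjust(pvalues)
    return [g for g, fc, q in zip(genes, log2_fc, adj)
            if abs(fc) > fc_cutoff and q < alpha]

# Hypothetical input: gene labels, log2 fold-changes, and raw p-values.
genes = ["G1", "G2", "G3", "G4"]
log2_fc = [3.1, -2.5, 0.4, 2.2]
pvalues = [0.0001, 0.002, 0.00005, 0.2]
selected = select_de_genes(genes, log2_fc, pvalues)
print(selected)
```

In this toy input, G3 fails the fold-change cutoff and G4 fails the adjusted p-value cutoff, so only G1 and G2 are retained.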

Training and validation of an 8-miRNA circulating signature in serum from retrospective cohorts of ESCC patients

Considering that our aim was to develop a non-invasive liquid biopsy assay, we next examined the translational potential of the tissue-based 18-miRNA panel in a serum-based biomarker prioritization cohort (50 ESCC, 50 healthy controls). Among the 18 miRNAs, the expression levels of four (miR-182, miR-183, miR-18a and miR-505) were below the detection limit in serum specimens (average PCR cycle threshold > 35; Table S4) [25]. Of the remaining 14 miRNAs, eight (miR-103, miR-106b, miR-151, miR-17, miR-181a, miR-21, miR-25 and miR-93) were significantly upregulated in serum from ESCC patients compared to healthy controls (p < 0.05, Student's t-tests; Fig. S6). From a clinical standpoint, measurement of an upregulated marker in blood is more practical; therefore, we focused on these eight upregulated miRNA candidates and interrogated the diagnostic performance of the 8-miRNA panel in a training cohort of patients (280 ESCC, 128 healthy controls), which allowed us to construct a multivariate logistic regression model (Table S4) and derive a risk-scoring formula. For all retrospective serum cohorts, we used this scoring formula, together with the cutoff value (0.582) determined by Youden's index in the serum training cohort, to dichotomize samples into high- and low-risk groups. Using the risk-scoring formula and this cutoff, we evaluated the diagnostic performance of the 8-miRNA panel in the training cohort by means of the AUC with its 95% confidence interval, sensitivity, and specificity. In the serum training cohort, this miRNA signature achieved an AUC of 0.83 (95% CI, 0.79–0.87), a sensitivity of 78%, and a specificity of 75% (Fig. 3A and S7A).
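Choosing a cutoff by Youden's index means scanning candidate thresholds and keeping the one that maximizes J = sensitivity + specificity − 1. The sketch below shows the idea on hypothetical risk scores; it reads the reported 0.582 as the resulting cutoff, and the function name and data are assumptions for illustration.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    A sample is called high-risk when its score is >= cutoff.
    Ties in J are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= cutoff)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical risk scores (1 = ESCC, 0 = healthy control).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sensitivity, specificity = youden_cutoff(scores, labels)
print(cutoff, sensitivity, specificity)
```

Once fixed on the training cohort, the same cutoff is applied unchanged to the validation cohorts, which is what keeps the reported validation sensitivity and specificity free of threshold tuning.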

Fig. 3

Establishment, validation, and diagnostic performance evaluation of an 8-miRNA signature. ROC curves were used to demonstrate the robust diagnostic value of the 8-miRNA signature in (A) the serum training cohort (AUC = 0.83), (B) the validation cohort 1 (AUC = 0.80), (C) stage I–IV patient samples of validation cohort 2 (AUC = 0.89), and (D) only stage I samples of validation cohort 2 (AUC = 0.82). CI was calculated by 2000 stratified bootstrap replicates

To further confirm the diagnostic performance of this 8-miRNA signature, we assessed its performance in two additional independent patient cohorts, where we were able to collect serum specimens – the serum validation cohort 1 (106 ESCC patients and 20 healthy controls) and serum validation cohort 2 (123 ESCC patients and 42 healthy controls). Consistent with the serum training cohort, our circulating miRNA signature achieved a robust performance in both serum validation cohort 1 (Fig. 3B, S7B, AUC = 0.80, 95% CI: 0.69–0.91, sensitivity: 89%, specificity: 60%) and serum validation cohort 2 (Fig. 3C, S7C, S8, Table S5, AUC = 0.89, 95% CI: 0.83–0.94, sensitivity: 87%, specificity: 85%).
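The confidence intervals reported for each AUC are based on stratified bootstrap resampling (2000 replicates, per the Fig. 3 legend). A minimal percentile-interval sketch follows; the function names and the synthetic risk scores are assumptions, and the authors' software may compute the interval differently in detail.

```python
import random

def auc_from_groups(pos, neg):
    """AUC as the fraction of case/control pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def stratified_bootstrap_ci(pos, neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the AUC from stratified bootstrap replicates."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample cases and controls separately so group sizes stay fixed.
        boot_pos = rng.choices(pos, k=len(pos))
        boot_neg = rng.choices(neg, k=len(neg))
        stats.append(auc_from_groups(boot_pos, boot_neg))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high

# Synthetic risk scores (hypothetical): 30 cases and 30 controls.
data_rng = random.Random(42)
case_scores = [data_rng.gauss(0.70, 0.15) for _ in range(30)]
control_scores = [data_rng.gauss(0.45, 0.15) for _ in range(30)]
point_auc = auc_from_groups(case_scores, control_scores)
ci_low, ci_high = stratified_bootstrap_ci(case_scores, control_scores)
print(f"AUC = {point_auc:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```

Stratifying the resampling matters most for small control groups, such as the 20 healthy controls in validation cohort 1, where an unstratified draw could leave very few controls in a replicate.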

Next, using the serum validation cohort 2, we compared the diagnostic performance of our 8-miRNA signature against that of a classic tumor marker in ESCC patients – the squamous cell carcinoma-related antigen (SCC-Ag). While the SCC-Ag levels exhibited modest diagnostic efficiency (Fig. 3C, AUC = 0.72, 95% CI: 0.60–0.84, sensitivity: 0.91, specificity: 0.69), our 8-miRNA panel was significantly superior in distinguishing ESCC patients across all disease stages (Fig. 3C, p = 0.003, DeLong’s test). Furthermore, when we restricted the analysis to stage I ESCC patients, our circulating miRNA signature maintained its diagnostic performance in discriminating these patients (n = 20) from healthy controls (n = 42; AUC = 0.82, 95% CI: 0.70–0.94, sensitivity: 0.76, specificity: 0.91). Likewise, our biomarker panel remained superior to SCC-Ag in the stage I patients (p = 0.025, DeLong’s test; SCC-Ag: AUC = 0.63, 95% CI: 0.50–0.78, sensitivity: 0.75, specificity: 0.69; Fig. 3D, Table S5), highlighting its potential as a promising early diagnostic assay.

To investigate whether our 8-miRNA panel is specific to ESCC rather than to other cancer types, we evaluated its diagnostic performance in other major malignancies, including colorectal, prostate, lung and breast cancer, using public serum miRNA datasets. The scoring formula of the 8-miRNA panel was applied to these datasets, and the performance of the panel in discriminating cancer patients from healthy controls was evaluated for each cancer type. Compared to ESCC, the AUC values of the 8-miRNA panel were substantially lower in the other cancer types (combined ESCC validation cohorts vs. other cancer types, all P < 0.05, DeLong’s tests, Fig. S9), suggesting that our 8-miRNA panel is specific to ESCC. Collectively, these data support the diagnostic efficacy of the 8-miRNA signature, as well as its promising potential for the detection of early-stage ESCC.

Validation of the diagnostic performance of the circulating miRNA signature in two, independent, prospective cohorts of ESCC patients

To demonstrate the clinical application of our circulating miRNA signature in true clinical settings, we next examined its performance in two randomized, prospectively enrolled patient cohorts. We performed qRT-PCR assays to assess the expression of the 8-miRNA signature in 186 serum specimens (Beijing-1 cohort; 84 ESCC patients and 102 healthy controls) and used this cohort as our training set. We performed multivariate logistic regression analysis and derived a risk-scoring formula: logit(P) = (0.00810 × miR17) − (0.183 × miR21) − (0.974 × miR25) + (0.973 × miR93) − (0.347 × miR103) − (0.298 × miR106b) − (0.194 × miR151) + (0.226 × miR181a) − 3.196. Our 8-miRNA signature performed robustly in its ability to distinguish ESCC patients from healthy controls (Fig. 4A, S10A, S11A, AUC = 0.92, 95% CI: 0.87–0.96, sensitivity: 89%, specificity: 84%). Subsequently, we assessed the performance of this miRNA signature in an independent validation cohort (Beijing-2 cohort; 89 ESCC patients and 99 healthy controls). Once again, our signature robustly distinguished ESCC patients from healthy controls (Fig. 4B, S10B, S11B, S12, AUC = 0.93, 95% CI: 0.88–0.97, sensitivity: 93%, specificity: 89%; Table S6). In both training and validation cohorts, our 8-miRNA signature performed substantially better than individual miRNAs in identifying ESCC patients (Fig. S11A and S11B).
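To make the risk-scoring formula concrete, the sketch below evaluates logit(P) for one sample and converts it to a probability with the logistic function. The coefficients and intercept are those reported above; the function name, the dictionary layout, and the input values are hypothetical, and the expression scale the coefficients expect (the normalization of the qRT-PCR values) is not stated in this section, so the inputs here are placeholders only.

```python
import math

# Coefficients and intercept as reported in the risk-scoring formula.
COEFFS = {
    "miR17": 0.00810,
    "miR21": -0.183,
    "miR25": -0.974,
    "miR93": 0.973,
    "miR103": -0.347,
    "miR106b": -0.298,
    "miR151": -0.194,
    "miR181a": 0.226,
}
INTERCEPT = -3.196

def risk_probability(expression):
    """Map the eight miRNA values of one sample to a probability P in (0, 1)."""
    logit = INTERCEPT + sum(COEFFS[name] * expression[name] for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-logit))

# Placeholder inputs (hypothetical): all-zero values isolate the intercept.
baseline = {name: 0.0 for name in COEFFS}
p_baseline = risk_probability(baseline)

# Raising a miRNA with a positive coefficient raises the predicted probability.
shifted = dict(baseline, miR93=2.0)
p_shifted = risk_probability(shifted)
print(f"{p_baseline:.4f} {p_shifted:.4f}")
```

A sample would then be classified by comparing P with a cutoff derived from the training cohort; the cutoff used for the prospective cohorts is not given in this section.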

Fig. 4

Evaluation of the circulating miRNA signature for detection of ESCC in randomized prospective cohorts. ROC curves were generated to assess the diagnostic performance of the 8-miRNA signature in both (A) Beijing-1 (AUC = 0.92), and (B) Beijing-2 (AUC = 0.93) randomized prospective cohorts (ESCC patients across stages). Compared to our 8-miRNA signature, the CA72-4, CYFRA21-1, SCC-Ag, and CEA markers all showed significantly poorer performance (all P < 0.01, DeLong’s tests) in both cohorts. CI was calculated by 2000 stratified bootstrap replicates. Compared to conventional SCC-Ag and CEA markers, our 8-miRNA signature also demonstrated superior performance in detection of stage I ESCC patients in both (C) the Beijing-1 cohort (AUC = 0.97, all P < 0.05, DeLong’s tests) and (D) the Beijing-2 cohort (AUC = 0.89, all P < 0.05, DeLong’s tests)

In both cohorts, compared to the conventional tumor markers including SCC-Ag, CEA, and CYFRA21-1, our 8-miRNA panel consistently demonstrated superior diagnostic performance for the identification of ESCC patients across all stages (Table S6, Fig. 4A, B, all p < 0.01, DeLong’s tests). Notably, when we focused on stage I ESCC patients, our 8-miRNA signature discriminated these patients from healthy controls in both the Beijing-1 cohort (AUC = 0.97, 95% CI: 0.93–1.00, sensitivity: 0.92, specificity: 0.92) and the Beijing-2 cohort (AUC = 0.89, 95% CI: 0.77–1.00, sensitivity: 92%, specificity: 90%), and in each instance its performance was substantially superior to that of SCC-Ag and CEA, which are routinely analyzed in clinical settings (Table S7, Fig. 4C and D, all p < 0.05, DeLong’s tests). We performed univariate and multivariate analyses to confirm that our circulating miRNA signature was the only significant predictor for detecting ESCC patients from all stages (Table S8), as well as stage I patients specifically (Table S9).

The 8-miRNA signature robustly identifies patients with high-risk premalignant lesions and is cost-effective vs. currently used diagnostic approaches in the clinic

Next, we investigated the earliest possible lesions that could be detected with our non-invasive circulating miRNA panel. Since the diagnostic risk scores were significantly elevated in stage I–IV ESCC patients (all p < 0.001, one-sided Student’s t-tests), we examined the diagnostic performance of the 8-miRNA panel for identifying patients with high-grade intraepithelial neoplasia. Intriguingly, the panel was able to identify patients with high-grade intraepithelial neoplasia (n = 13, p < 0.01, one-sided Student’s t-test; Fig. 5). However, the risk scores did not change significantly in patients with low-grade intraepithelial neoplasia (n = 8) or those with esophagitis (n = 6) compared to healthy controls (Fig. 5). These results suggest a potential use of our circulating miRNA signature for early detection of high-risk premalignant lesions.

Fig. 5

The miRNA-classifier effectively discriminates stage I ESCC and premalignant lesions. Boxplots comparing risk scores between ESCC of different stages, premalignant lesions (esophagitis, low-grade intraepithelial neoplasia [LGIN], and high-grade intraepithelial neoplasia [HGIN]) and healthy controls. ** P < 0.01, *** P < 0.001

To determine whether screening using our miRNA signature would be cost-effective, we performed a cost-effectiveness analysis (see Supplementary Material for details). We estimated mass screening using our circulating miRNA signature to be cost-effective relative to current practice [ICER = CNY 15,800.4/QALY] (Tables S10, S11). In summary, our circulating miRNA signature demonstrated promising diagnostic performance in our multinational, multicenter cohort study, and is likely to provide a cost-efficient, highly robust option for non-invasive early detection of ESCC.
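For readers unfamiliar with the metric, the incremental cost-effectiveness ratio (ICER) is the additional cost of a strategy divided by the additional quality-adjusted life years (QALYs) it yields relative to a comparator. The sketch below shows only this standard definition; the function name and the input figures are hypothetical and are not taken from the study's model, which is described in the Supplementary Material.

```python
def icer(cost_new, cost_ref, qaly_new, qaly_ref):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when both strategies yield equal QALYs")
    return (cost_new - cost_ref) / delta_qaly

# Hypothetical per-person costs (CNY) and QALYs for two screening strategies.
example_icer = icer(cost_new=1200.0, cost_ref=1000.0, qaly_new=10.5, qaly_ref=10.0)
print(example_icer)
```

A strategy is usually judged cost-effective when its ICER falls below a willingness-to-pay threshold chosen for the health system in question.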