Background

Immune checkpoint inhibitors (ICIs) have ushered in a new era of cancer treatment and provided unprecedented clinical benefits for patients [1]. However, only a relatively small proportion of patients respond to them [2], which highlights the need for biomarker research to optimize patient selection and to guide combination strategies that tackle immune resistance.

Traditional biomarker research has mostly focused on whole exome sequencing (WES) or RNA sequencing (RNA-Seq) of intact tumor tissue (bulk data) [3,4,5,6,7,8], which only reflects the average genetic profile across a large population of different cells. Pre-existing ICI biomarkers derived from these studies showed limited predictive value. The development of single-cell RNA sequencing (scRNA-Seq) enables us to dissect gene expression at single-cell resolution and identify novel biomarkers with better performance [9].

Cancer stem cells (CSCs) are self-renewal cells that promote tumor initiation, progression, and metastasis [

Results

Cancer stemness is associated with ICI resistance

A previously published ICI SKCM cohort with scRNA-Seq data was first used to evaluate the association between cancer stemness and ICI outcomes [14]. After removing patients without malignant cell data, we included 24 patients from this cohort: 11 non-responders (NR) and 13 treatment-naïve patients (TN). Ideally, cancer stemness would be compared between responders (R) and non-responders; however, responder data were not available in this cohort. Given that treatment-naïve patients likely include both potential responders and non-responders, stemness was compared between NR and TN as previously described [14]. As shown in Fig. 1A, cancer cells with high stemness were enriched in the NR subgroup. Further analysis showed that tumors from the NR subgroup had a significantly higher level of stemness (P < 0.001, Fig. 1B), indicating that cancer stemness is negatively associated with ICI outcomes. A second ICI cohort with a different cancer type (BCC) was used to validate this finding [15]. In the BCC cohort, the tumor stemness of 4 non-responders was compared with that of 6 responders. The gap in stemness level between the NR and R subgroups was more prominent in the BCC cohort (P < 0.001, Fig. 1C and D).

Fig. 1

Identification and validation of a negative association between cancer cell stemness and ICI outcomes. A, C t-Distributed Stochastic Neighbor Embedding (tSNE) plots of malignant cells from SKCM or BCC. Top tSNE plots depict the distribution of CytoTRACE scores among malignant cells. Dark green indicates lower scores (low stemness), while dark red indicates higher scores (high stemness). Bottom tSNE plots label the malignant cells by response phenotype. B, D Raincloud plots of CytoTRACE scores by response phenotype (NR vs. TN) in the SKCM cohort or by response phenotype (NR vs. R) in the BCC cohort. The center line of each box plot is the median, and the bounds of the box are the 25% and 75% quantiles (Wilcoxon test; *** P < 0.001). Abbreviations: NR, non-responders; R, responders; TN, treatment-naïve patients.

Development of Stem.Sig through pan-cancer scRNA analysis

As cancer stemness is significantly associated with ICI resistance, we hypothesized that a signature reflecting the stemness level of the tumor (Stem.Sig) may help predict ICI efficacy. Therefore, 34 scRNA-Seq datasets were used to develop Stem.Sig (Fig. 2A; Additional file 1: Table S6). For each pan-cancer scRNA-Seq dataset, we performed Spearman correlation analysis between gene expression level and CytoTRACE score in malignant cells. Genes that were positively correlated with CytoTRACE scores (Spearman R > 0 and FDR < 1e−05) were designated Gx. Genes that were differentially up-regulated in malignant cells were designated Gy. To obtain up-regulated tumor-specific genes that were positively associated with stemness, Gx and Gy were intersected to give Gn for each dataset [14]. For example, G1 consisted of the genes in the intersection of Gx and Gy in the first scRNA-Seq dataset. The geometric mean of Spearman R was then calculated for each gene across G1–G34. Finally, genes with a geometric mean of Spearman R > 0.4 (moderate to strong correlation) were pooled as Stem.Sig [87].
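The selection procedure described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, gene names, and numbers are hypothetical, and the handling of genes that are absent from some Gn sets is an assumption (here the geometric mean is taken only over the datasets in which a gene appears), since the text does not specify it.

```python
from statistics import geometric_mean


def intersect_gx_gy(spearman, fdr, upregulated, fdr_cutoff=1e-5):
    """Return {gene: R} for genes in Gx (R > 0, FDR < cutoff) that are also in Gy."""
    return {
        gene: r
        for gene, r in spearman.items()
        if r > 0 and fdr[gene] < fdr_cutoff and gene in upregulated
    }


def pool_signature(gn_list, r_cutoff=0.4):
    """Keep genes whose geometric-mean Spearman R across datasets exceeds r_cutoff.

    Assumption: the mean is taken over the datasets in which the gene
    appears in Gn; the source does not say how absent genes are handled.
    """
    values = {}
    for gn in gn_list:
        for gene, r in gn.items():
            values.setdefault(gene, []).append(r)
    means = {gene: geometric_mean(rs) for gene, rs in values.items()}
    return {gene: m for gene, m in means.items() if m > r_cutoff}


# Toy example: two hypothetical datasets with made-up gene names and values.
g1 = intersect_gx_gy(
    spearman={"GENE_A": 0.64, "GENE_B": 0.30, "GENE_C": -0.20},
    fdr={"GENE_A": 1e-8, "GENE_B": 1e-9, "GENE_C": 1e-12},
    upregulated={"GENE_A", "GENE_B", "GENE_C"},
)
g2 = intersect_gx_gy(
    spearman={"GENE_A": 0.36, "GENE_B": 0.20, "GENE_D": 0.90},
    fdr={"GENE_A": 1e-7, "GENE_B": 1e-6, "GENE_D": 0.01},
    upregulated={"GENE_A", "GENE_B", "GENE_D"},
)
signature = pool_signature([g1, g2])
print(signature)
```

In this toy input, GENE_C is dropped for its negative correlation, GENE_D for failing the FDR cutoff, and GENE_B because its geometric mean falls below 0.4, so only GENE_A is retained.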

Fig. 2

Development and description of the stemness signature. A Circos plot depicting the development of Stem.Sig. B Pathway enrichment analysis of genes in Stem.Sig. The bar plot shows the top 20 enriched Reactome pathways. The cnetplot presents the network of specific genes from these pathways. Colored points refer to the corresponding pathways. Abbreviations: CFTR, cystic fibrosis transmembrane conductance regulator; GG-NER, global genomic nucleotide excision repair; HIF, hypoxia-inducible factor; PCP, planar cell polarity; CE, convergent extension

We investigated the biological functions that were over-represented in Stem.Sig (Fig. 2B). The enriched pathways mainly comprise processes involving hypoxia, glycolysis, ubiquitination, EPH-ephrin signaling, WNT signaling, and nucleotide excision repair (NER). Specific genes of these pathways are shown in the cnetplot of Fig. 2B. Some genes have been reported to be associated with unfavorable outcomes of immunotherapy, such as EPHA3, EPHA7, ENO1, ACTG1, DKK2, NPM1, and BCL10 [6, 2: Fig.S1 A and B). It is plausible that the coexistence of these two factors (HSLT) results in a TIME with the least infiltration of cytotoxic lymphocytes. Conversely, LSHT could lead to the most abundant CTLs in the TIME. The anti-tumor immunity of the other two groups (HSHT and LSLT) is less clear-cut, since each combines an immune-suppressive factor (HS or LT) with an immune-promoting factor (LS or HT). Further subgroup analysis found a higher level of cytotoxic lymphocytes in LSLT than in HSHT (p < 0.001, Additional file 2: Fig.S1 C). In conclusion, the order of anti-tumor immunity from highest to lowest is: LSHT > LSLT > HSHT > HSLT (all p < 0.001, Additional file 2: Fig.S1 C). Therefore, tumors with low Stem.Sig presented with significantly better anti-tumor immunity than those with high Stem.Sig regardless of TMB level.

Immunotherapy outcome prediction by Stem.Sig

To investigate the predictive value of Stem.Sig, we collected bulk RNA-Seq data and clinical information from 10 ICI cohorts. Pre-treatment samples from these cohorts were curated and analyzed. Patients received anti-PD(L)-1, anti-CTLA-4, or anti-PD(L)-1 plus anti-CTLA-4. The 10 cohorts were split into three datasets: a training set (n = 620), a validation set (n = 154), and a testing set (n = 149). The flow chart of the analysis is shown in Fig. 4A. First, we trained models with seven different machine learning algorithms, applying 10-times repeated 5-fold cross-validation to optimize the parameters of each, which yielded seven models. We then evaluated and compared the AUC of these models in the validation set. The Naïve Bayes model achieved the highest AUC (0.71) and was selected as the Stem.Sig model (Fig. 4B). For further assessment, we applied the Stem.Sig model to the independent testing set to predict ICI response and observed the same AUC of 0.71 (Fig. 4C).
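The model-selection step (compare the candidate models by AUC in the validation set and keep the best one) can be sketched as follows. This is an illustration only, assuming each trained model outputs a response probability per patient; the labels, scores, and the name `other_model` are invented, and the AUC is computed with the standard rank-based definition rather than any particular package.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random responder (label 1) scores higher
    than a random non-responder (label 0); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_best_model(validation_scores, labels):
    """Return the name of the model with the highest validation AUC, plus all AUCs."""
    aucs = {name: roc_auc(labels, s) for name, s in validation_scores.items()}
    return max(aucs, key=aucs.get), aucs


# Hypothetical validation data: 1 = responder, 0 = non-responder.
labels = [1, 1, 1, 0, 0, 0]
validation_scores = {
    "naive_bayes": [0.9, 0.8, 0.4, 0.5, 0.3, 0.2],
    "other_model": [0.6, 0.4, 0.3, 0.7, 0.5, 0.2],
}
best_name, aucs = select_best_model(validation_scores, labels)
print(best_name, aucs)
```

Selecting on a validation set and then reporting the AUC on a separate testing set, as described in the text, keeps the final estimate from being biased by the selection itself.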

Fig. 4

Prediction of ICI outcomes using Stem.Sig. A Flow chart of training, validating, and testing the Stem.Sig model constructed by machine learning. In the training set, we applied 10-times repeated 5-fold cross-validation to tune the parameters of the different machine learning algorithms. In the validation set, the Naïve Bayes algorithm, which had the best AUC, was kept as the final Stem.Sig model (parameters: fL = 0; adjust = 0.75; useKernel = TRUE). B ROC curves comparing the performance of different machine learning algorithms in the validation set. C ROC plot depicting the performance of the final Stem.Sig model in the validation and testing sets. D Kaplan-Meier curves comparing OS between high-risk and low-risk patients in the validation and testing sets. Patients predicted as “NR” and “R” by the final Stem.Sig model were defined as “high-risk” and “low-risk”, respectively. HRs were calculated by Cox proportional hazards regression analysis. Abbreviations: TPR, true positive rate; FPR, false positive rate; AUC, area under the curve; HR, hazard ratio; CI, confidence interval

To evaluate whether the Stem.Sig model can predict overall survival (OS), we divided ICI-treated patients into low-risk and high-risk subgroups based on the predicted “R” and “NR” labels, respectively. The Kaplan-Meier analysis of OS is shown in Fig. 4D. The low-risk group achieved significantly longer OS in the training, validation, and testing sets (all log-rank p < 0.01). In the validation set, high-risk patients predicted by the Stem.Sig model had a median OS of only 13.3 months, compared with 31.2 months for low-risk patients (HR: 1.87; 95%CI: 1.21–2.90). In the testing set, a similar median OS of 13.4 months was observed in high-risk patients, while low-risk patients had not reached the median OS (HR: 3.08; 95%CI: 1.64–5.81).
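To make the notion of a median OS that is "not reached" concrete, the sketch below implements a basic Kaplan-Meier estimator in Python. It is a simplified illustration with invented follow-up data, not the analysis used in the study. The median is taken here as the first time at which the estimated survival falls to 0.5 or below (one common convention), and exact fractions are used to avoid floating-point artifacts at the 0.5 boundary.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= Fraction(at_risk - deaths, at_risk)
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(times, events):
    """First time with survival <= 0.5, or None if the median is not reached."""
    for t, s in kaplan_meier(times, events):
        if s <= Fraction(1, 2):
            return t
    return None


# Hypothetical follow-up in months (made-up numbers, not study data).
high_risk = median_survival([5, 8, 12, 13, 20, 30], [1, 1, 1, 1, 0, 1])
low_risk = median_survival([10, 15, 22, 30, 34, 40], [1, 0, 0, 1, 0, 0])
print(high_risk, low_risk)
```

In the second toy group the survival estimate never drops to 0.5 within follow-up, so the function returns `None`, which corresponds to a median OS that has not been reached.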

We performed subgroup analysis for the five individual cohorts that make up the testing set. For ICI response prediction, the AUC ranged from 0.62 to 0.81 among these cohorts (Additional file 2: Fig.S2A). Van Allen 2015 SKCM achieved a favorable AUC of 0.81 (95%CI: 0.66−0.95), followed by Synder 2017 UC (AUC: 0.80; 95%CI: 0.61−0.99). Zhao 2019 GBM had the lowest AUC of 0.62 (95%CI: 0.33−0.91). In the survival analysis, Kim 2018 GC was excluded because OS data were lacking. For the other four cohorts, the HR for high-risk patients predicted by the Stem.Sig model ranged from 1.73 to 4.05 (Additional file 2: Fig.S2B). After adjusting for available confounding factors, significant survival benefits were still found in Van Allen 2015 SKCM (adjusted p = 0.02) and Synder 2017 UC (adjusted p = 0.02), while the other two cohorts showed only numerical survival differences, possibly because of their limited sample sizes.

We further compared the performance of Stem.Sig with previously established predictive gene signatures. Among pan-cancer signatures (INFG.Sig [76], T.cell.inflamed.Sig [76], PDL1.Sig [77], LRRC15.CAF.Sig [78], NLRP3.Sig [79], and Cytotoxic.Sig [80]), Stem.Sig showed the best performance in the testing set with an AUC of 0.71, followed by INFG.Sig with an AUC of 0.66 (Fig. 5A). Most of these pan-cancer signatures performed well in only one or two cohorts. For example, the AUC of INFG.Sig reached 0.85 in Kim 2018 GC and 0.67 in Van Allen 2015 SKCM, but decreased to 0.53–0.54 in the other three cohorts (Additional file 1: Table S7). In contrast, Stem.Sig achieved sufficient to very good performance in all cohorts, covering four cancer types (SKCM, GBM, UC, and GC), which further stresses its potential as a pan-cancer predictive model of ICI response (Fig. 5B). Compared with melanoma-specific signatures (CRMA.Sig [81], IMPRES.Sig [7], IPRES.Sig [82], TcellExc.Sig [14], ImmuneCells.Sig [83], IMS.Sig [84], and TRS.Sig [85]), Stem.Sig remained in the top 3 with an AUC of 0.76 for predicting ICI response in melanoma patients. IMPRES.Sig and CRMA.Sig showed slightly better AUCs of 0.81 and 0.77, respectively.

Fig. 5

Comparing AUC of Stem.Sig with other predictive gene signatures. A Circos plot depicting the performance of pan-cancer signatures in the testing set. The vertical axis indicates AUC values. The testing set comprises five different cohorts: Hugo 2016 SKCM, Van Allen 2015 SKCM, Kim 2018 GC, Zhao 2019 GBM, and Synder 2017 UC. B Heatmap comparing the predictive value of Stem.Sig and other pan-cancer signatures. Signature rows are ordered by their AUC in the testing set. From top to bottom, Stem.Sig ranked first while Cytotoxic.Sig ranked last. C Bar plot depicting the AUC values of Stem.Sig and other melanoma-specific signatures in the SKCM cohort (Hugo 2016 + Van Allen 2015).

Exploration of potential therapeutic targets from Stem.Sig using CRISPR screen data

We systematically collected immune response data of knockout genes from seven CRISPR cohorts, which were further divided into 17 datasets according to the model cells and treatment conditions used. In total, 22,505 genes were recorded in these CRISPR datasets. We ranked genes by their mean z scores. Top-ranked genes were immune-resistant genes, whose knockout may promote anti-tumor immunity. Bottom-ranked genes were immune-sensitive genes, whose knockout may suppress anti-tumor immunity. The gene ranking process is shown in Fig. 6A. Among all 22,505 genes, the top-ranked 1%, 2%, and 3% comprised 225, 450, and 675 genes, respectively. Next, we calculated the percentage of top-ranked genes present in Stem.Sig and in previous immune-resistant signatures, including TcellExc.Sig, ImmuneCells.Sig, IMS.Sig, LRRC15.CAF.Sig, and CRMA.Sig (IPRES.Sig was excluded because it comprises 73 genetic pathways instead of individual genes) [14, 78, 81, 83, 84]. Stem.Sig, TcellExc.Sig, IMS.Sig, and ImmuneCells.Sig were the only four gene sets that had genes ranked in the top 3%. As expected, Stem.Sig had a higher percentage of top-ranked genes than the other signatures (Fig. 6B). Immune-resistant genes (the 3% top-ranked genes) were significantly over-represented in Stem.Sig (P = 0.03; Fisher’s exact test). Twenty Stem.Sig genes were ranked in the top 3%: EMC3, BECN1, VPS35, PCBP2, VPS29, PSMF1, GCLC, KXD1, SPRR1B, PTMA, YBX1, CYP27B1, NACA, PPP1CA, TCEB2, PIGC, NR0B2, PEX13, SERF2, and ZBTB43. The immune-resistant features of these stemness-associated genes were validated by multiple independent CRISPR datasets (Fig. 6C), and the genes may serve as potential therapeutic targets in synergy with ICI.
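The ranking and over-representation steps can be sketched in Python as below. Several points are assumptions rather than details from the text: missing values are skipped when averaging, "top-ranked" is taken to mean the lowest mean z score (following the Fig. 6 caption, where negative z scores indicate a better immune response after knockout), and the enrichment is computed as a one-sided Fisher's exact test via the hypergeometric tail. Gene names and z scores are made up.

```python
from math import comb


def mean_z(z_by_dataset):
    """Mean z score per gene, skipping datasets where the gene is missing (None)."""
    means = {}
    for gene, zs in z_by_dataset.items():
        present = [z for z in zs if z is not None]
        if present:
            means[gene] = sum(present) / len(present)
    return means


def top_ranked(means, percent):
    """Genes in the top `percent` of the ranking.

    Assumption: the ranking is ascending in mean z, so the most negative
    values (better immune response after knockout) come first.
    """
    ranked = sorted(means, key=means.get)
    return ranked[: len(ranked) * percent // 100]


def enrichment_p(n_total, n_top, n_signature, n_overlap):
    """One-sided Fisher's exact test: P(overlap >= n_overlap) when n_signature
    genes are drawn from n_total genes, of which n_top are top-ranked."""
    upper = min(n_top, n_signature)
    tail = sum(
        comb(n_top, k) * comb(n_total - n_top, n_signature - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_total, n_signature)


# Sizes of the top 1%, 2%, and 3% of the 22,505 genes reported in the text.
cutoffs = [22505 * p // 100 for p in (1, 2, 3)]

# Toy data: four hypothetical genes across three datasets.
toy = {
    "G1": [-2.0, None, -1.0],
    "G2": [0.5, 0.5, None],
    "G3": [1.0, 2.0, 3.0],
    "G4": [None, -0.5, -0.5],
}
top_half = top_ranked(mean_z(toy), 50)
p_value = enrichment_p(n_total=10, n_top=3, n_signature=4, n_overlap=2)
print(cutoffs, top_half, p_value)
```

The integer arithmetic reproduces the counts of 225, 450, and 675 genes quoted above; the enrichment call uses small made-up numbers because the size of Stem.Sig is not stated in this section.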

Fig. 6

Exploration of potential treatment targets from Stem.Sig using CRISPR screening data. A Ranking of genes based on their knockout effects on anti-tumor immunity across 17 CRISPR datasets. Negative (positive) z scores indicated better (worse) immune response after knockout of a specific gene. Genes were ranked according to their mean z scores. Top-ranking genes were associated with immune resistance. Blank squares in the heatmap referred to missing values of gene data from the corresponding cohort. B Radar plot comparing the percentage of top-ranked genes for Stem.Sig and other predictive signatures. C Heatmap depicting z scores of 20 Stem.Sig genes in the 3% top-ranked genes across different CRISPR datasets

Discussion

Although the mechanisms linking cancer stemness and anti-tumor immunity have been widely explored [10, 12, 96, 97], direct clinical evidence on the association between stemness and ICI response has not been reported. Here we used CytoTRACE to evaluate the stemness level of individual malignant cells and uncovered an inverse correlation between stemness and ICI outcomes, supported by results from two ICI scRNA-Seq cohorts of SKCM and BCC [14, 15]. CSCs have been found in virtually all solid tumors [10]. Motivated by these observations, we hypothesized that the negative association between stemness and ICI efficacy exists across various cancers. Therefore, a large-scale comprehensive analysis was performed to identify genes over-expressed in malignant cells that significantly correlated with increased stemness. These genes formed a pan-cancer stemness signature, Stem.Sig. We carefully validated the predictive value of Stem.Sig. Remarkably, Stem.Sig predicted ICI response better than previous predictive signatures across multiple independent ICI cohorts with bulk RNA-Seq data [52,53,54,55,56,57,58,59,60,61]. This study is the first to demonstrate a robust link between stemness and ICI outcomes through a comprehensive analysis of large-scale data. Most importantly, we constructed a gene expression signature, Stem.Sig, that predicts response to immunotherapy across multiple cancer types.

We found that Stem.Sig genes were enriched in the following biological functions: hypoxia, glycolysis, ubiquitination, nucleotide excision repair, EPH-ephrin signaling, and WNT signaling. WNT signaling is the key pathway that drives self-renewal of CSCs and maintains cancer stemness [98]. Hypoxia causes an increase in transcription factors (e.g., OCT4, SOX2, c-myc, and Nanog) that contribute to the sustenance of CSCs [99]. Anaerobic glycolysis is a distinct metabolic hallmark of stem cells [100]. The ubiquitination-mediated transcriptional regulatory network is essential for maintaining the stemness and pluripotency of stem cells [101]. Nucleotide excision repair (NER) is a major DNA repair pathway that preserves the genome integrity of cancer stem cells so that they can withstand stressful conditions [93]. The activity of EPH-ephrin signaling, which involves the largest family of receptor tyrosine kinases, is enhanced in CSCs [102]. In our previous study, nonsynonymous somatic mutations of EPHA3 and EPHA7 were found to be associated with improved ICI efficacy [6]. It is plausible that elevated EPH-ephrin signaling may contribute to the immunosuppressive features of CSCs. Furthermore, we evaluated the correlation between Stem.Sig and twelve previously identified stemness signatures [12]. As expected, Stem.Sig was positively associated with these stemness signatures across different cancer types (Additional file 2: Fig. S3). Our results are in line with previous studies and suggest that Stem.Sig encompasses genes that robustly and specifically correlate with cancer stemness.

TCGA pan-cancer transcriptomic analysis revealed consistently down-regulated expression of immune-related genes and reduced infiltration of immune cells in tumors with a high Stem.Sig level across different cancer types. Interestingly, a negative association between B cells and Stem.Sig was also observed. B cells can favorably affect ICI response via tertiary lymphoid structures (TLS), and hence we analyzed the relationship between TLS and Stem.Sig [103]. TLS scores were inversely associated with Stem.Sig (Additional file 2: Fig. S4). Further analysis also revealed an up-regulation of some immune-relevant biological functions, including metabolism, DNA repair, and MYC signaling. Acquisition of a hypermetabolic phenotype is an evolving mechanism that mediates immune evasion [94]. Enhanced DNA-repair capacity prepares malignant cells for hostile environments [93]. Increased MYC signaling suppresses the immune response by elevating expression of PD-L1 and CD47 [95]. Tumors with high Stem.Sig thus presented with substantially immunosuppressive features, which corroborates the predictive value of Stem.Sig.

We also observed a positive correlation between Stem.Sig and both TMB and ITH, consistent with the results of Miranda et al. [12]. It is noteworthy that high TMB is associated with high stemness. Although TMB is a well-recognized ICI biomarker, a substantial number of patients with high TMB still fail to respond to ICI [104]. Our stratified analysis revealed a significant negative correlation between Stem.Sig and anti-tumor immunity in both low-TMB and high-TMB tumors. Cancer stemness may therefore be a reasonable explanation for the immune resistance of high-TMB tumors, which further stresses the importance of Stem.Sig as a predictive ICI biomarker.

Stem.Sig is a novel biomarker capable of predicting ICI response and distinguishing patients with survival benefits. We further compared Stem.Sig with other state-of-the-art signatures, including six pan-cancer signatures [76,77,78,79,80] and seven melanoma-specific signatures [7, 14, 81,82,83,84,85]. Stem.Sig outperformed the pan-cancer signatures, generalized better, and achieved favorable overall performance in different cohorts across multiple cancer types. Compared with melanoma-specific signatures, Stem.Sig ranked in the top 3 and achieved a competitive AUC of 0.76.

Biomarker research serves not only to improve patient selection but also to inform combination strategies that can overcome immune resistance. Given the robust link between Stem.Sig and ICI outcomes, we used CRISPR datasets to explore potential drug targets from Stem.Sig. We ranked genes by their relevance to immune response and identified the most immune-resistant Stem.Sig genes. For example, BECN1 is among the top-ranked Stem.Sig genes that render the TIME resistant to ICI. BECN1 plays a central role in autophagy, which is essential for self-renewal of CSCs and the maintenance of cancer stemness [105]. Targeting BECN1 can induce expression of CCL5, promote infiltration of NK cells, and thus improve the antitumor immune response [106]. Knockout of top-ranked Stem.Sig genes, such as EMC3, BECN1, VPS35, and PCBP2, improved immune response in melanoma, renal carcinoma, breast carcinoma, and colon adenocarcinoma across multiple CRISPR datasets. These stemness-associated genes could be potential therapeutic targets for various cancer types. Further research on these top-ranked Stem.Sig genes would help to develop combined immunotherapy strategies.

Our study has some limitations. First, GSE115978 contained only treatment-naïve patients and non-responders [14], so cancer stemness was compared between these two groups. Considering that the average response rate of melanoma is 30–40%, a considerable proportion of treatment-naïve patients would probably not respond to ICI. Theoretically, the difference between TN and NR is smaller than that between R and NR, since TN is a mixture of NR and R. Nevertheless, a significant difference in stemness level was still observed between NR and TN in this study, which suggests an even greater gap between NR and R. This was confirmed by analysis of another scRNA-Seq ICI cohort, GSE123813 [15]. Second, some clinical annotation data (e.g., sex, age, tumor stage, TMB, and ITH) were unavailable in some RNA-Seq ICI studies for multivariate Cox regression analysis of overall survival. Third, the 10 RNA-Seq ICI cohorts used in our study cover only five cancer types (GC, SKCM, RCC, UC, and GBM). The consistent negative association between Stem.Sig and anti-tumor immunity across 30 cancer types can compensate for this to some degree. Still, the predictive value of Stem.Sig in a pan-cancer setting needs to be verified by future prospective ICI trials.

Conclusions

We provide the first solid clinical evidence that cancer stemness is associated with immunotherapy resistance. Using pan-cancer analysis of single-cell transcriptomic data, we developed a gene expression signature, Stem.Sig, which outperforms other well-established signatures in predicting ICI outcomes across multiple cohorts. Further exploration of Stem.Sig also revealed some potential therapeutic targets. Our study demonstrates a promising solution for patient selection in immunotherapy and sheds light on tackling ICI resistance by targeting cancer stemness to boost anti-tumor immunity.