Introduction

Lung cancer poses a significant worldwide health issue, wherein non-small cell lung cancer (NSCLC) comprises approximately 80-85% of the reported instances [1]. According to global cancer statistics, over 2 million new cases of lung cancer are diagnosed each year [2, 3]. Among the various subtypes of NSCLC, lung adenocarcinoma (LUAD) represents approximately 55-60% of cases [4].

In Fig. 4A, the OS of LUAD patients exhibited a significant correlation with 18 ERSRGs. Subsequently, LASSO analysis was employed to detect and screen 15 DEGs associated with ER stress (Fig. 4B). Furthermore, a multiple factor stepwise regression analysis was performed, resulting in the selection of 6 genes for constructing the prognostic model (Fig. 4C). The risk scoring model was established using the following formula: Riskscore = (-0.894871321) × EIF2AK3 + (0.268950582) × BAK1 + (-0.127961954) × NUPR1 + (0.255795681) × VCP + (0.393370303) × MBTPS2 + (-0.292131004) × RHBDD2. Based on the median risk score, patients from the TCGA-LUAD cohort were stratified into a high-risk subgroup (n = 220) and a low-risk subgroup (n = 220) to facilitate further investigations. Subsequent K-M analysis and the risk survival status plot revealed that the high-risk subgroup exhibited a worse prognosis, whereas the low-risk subgroup demonstrated prolonged survival (Fig. 4D and F). The prognostic model was assessed by calculating the area under the curve (AUC) for 1-year, 3-year, and 5-year survival, yielding values of 0.68, 0.69, and 0.70, respectively (Fig. 4E).
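The risk score formula and the median-based stratification can be illustrated with a minimal Python sketch. The coefficients are those reported in the formula; the patient identifiers and expression values are hypothetical and serve only as an illustration. The rule that a score strictly above the cohort median is labelled high-risk is an assumption, since the handling of ties at the median is not specified here.

```python
from statistics import median

# Coefficients as reported in the risk score formula.
COEFFICIENTS = {
    "EIF2AK3": -0.894871321,
    "BAK1": 0.268950582,
    "NUPR1": -0.127961954,
    "VCP": 0.255795681,
    "MBTPS2": 0.393370303,
    "RHBDD2": -0.292131004,
}


def risk_score(expression):
    """Return the weighted sum of the six gene expression values."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(scores):
    """Split patients at the cohort median (assumption: score > median is 'high')."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


def profile(**values):
    """Build an expression profile, defaulting unlisted genes to 0.0."""
    return {gene: values.get(gene, 0.0) for gene in COEFFICIENTS}


# Hypothetical expression profiles (illustrative only, not study data).
patients = {
    "P1": profile(EIF2AK3=1.0, BAK1=1.0, NUPR1=1.0, VCP=1.0, MBTPS2=1.0, RHBDD2=1.0),
    "P2": profile(),
    "P3": profile(BAK1=2.0, VCP=2.0, MBTPS2=2.0),
    "P4": profile(EIF2AK3=2.0),
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = stratify(scores)

for pid in sorted(scores):
    print(pid, round(scores[pid], 4), groups[pid])
```

Genes with negative coefficients (EIF2AK3, NUPR1, RHBDD2) lower the score as their expression rises, whereas BAK1, VCP, and MBTPS2 raise it, which is why the hypothetical profile P3 falls in the high-risk group and P4 in the low-risk group.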

Fig. 4

Identification of ERSRGs-signature. Univariate analysis (A), LASSO analysis (B) and stepwise Cox algorithm (C) were used to identify a prognostic ER stress-related signature. (D) Kaplan-Meier survival curves of the high- and low-risk subgroups. (E) For this ERSRGs-signature, the area under the ROC curve is 0.69 (1 year), 0.68 (3 years), 0.70 (5 years). (F) Riskscore plot showing the relationship among status, survival time and ERSRGs expression

Assessment and external validation for ERSRGs-signature

The risk distribution curve, survival status, and expression heatmap of the external validation sets (GSE37745 and GSE31210) demonstrated that patients with low-risk scores exhibited significantly longer survival times compared to those with high-risk scores, thus validating the findings from the training set (Fig. 5A and B). To further consolidate the prognostic model, the clinical information and genetic characteristics from TCGA were integrated, and a comprehensive multi-factor Cox regression model was developed, resulting in the construction of a nomogram (Fig. 5C). Calibration plots were employed to assess the predictive accuracy of the nomogram, revealing excellent agreement between the predicted and observed OS rates at 1, 3, and 5 years (Fig. 5D). Moreover, the nomogram model was subjected to decision curve analysis (DCA) to evaluate its clinical utility and potential benefits (Fig. 5E-G). Collectively, the risk score, when combined with the ERSRGs-signature, pathological stage, and N-stage, emerged as an independent and robust prognostic indicator, providing enhanced prognostic value for patients with LUAD.
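For readers unfamiliar with decision curve analysis, the quantity plotted on a decision curve is the net benefit at a given threshold probability. The sketch below uses the standard binary-outcome definition, net benefit = TP/n - (FP/n) × pt/(1 - pt). It assumes a binary event status at a fixed time horizon and ignores censoring, so it is a simplification and not necessarily the procedure used for Fig. 5E-G. The function names and the example data are hypothetical.

```python
def net_benefit(events, predicted_risk, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    events: 1 if the event occurred by the time horizon, else 0.
    Assumption: no censoring before the horizon (binary outcome).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(events)
    flagged = [e for e, p in zip(events, predicted_risk) if p >= threshold]
    true_pos = sum(flagged)
    false_pos = len(flagged) - true_pos
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(events, threshold):
    """Reference strategy: every patient is treated regardless of risk."""
    prevalence = sum(events) / len(events)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical event indicators and model-predicted risks (not study data).
events = [1, 1, 0, 0, 1, 0, 0, 0]
risk = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.3]

print(net_benefit(events, risk, 0.5))
print(net_benefit_treat_all(events, 0.5))
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero.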

Fig. 5

Assessment and external validation for ERSRGs-signature. (A) Riskscore plot of 6 ERSRGs-signature in external testing set, with riskscore and survival status in GSE37745 and GSE31210. (B) The Kaplan-Meier survival curves of high-risk and low-risk subgroups in external testing set. (C) Nomogram equipped with the riskscore and clinical parameters (age, gender, T, N and pathological stage) in TCGA. (D) The calibration curves displayed the accuracy of nomogram. (E-G) Decision curve analysis of nomogram (1-, 3-, 5- years)

Exploring immune infiltration patterns and single-cell analysis of ERSRGs-signature in LUAD

To unravel the potential functions and pathways associated with the prognostic signature, we conducted GSEA, GO, and KEGG enrichment analyses. The results revealed that the genes linked to the prognostic signature were predominantly enriched in pathways related to immune infiltration (Fig. 6A-C). Hence, we proceeded to explore the heterogeneity of the immune microenvironment between the ERSRGs-signature subgroups. Initially, we assessed the correlation between gene expression and immune infiltration in LUAD and observed significant variations in the expression of different genes across immune cells (Fig. 6D). Subsequently, we employed the TIMER and EPIC algorithms to investigate immune infiltration patterns between the low- and high-risk subgroups. The low-risk subgroup exhibited significantly higher infiltration levels of B cells, CD4 T cells, CD8 T cells, and macrophages compared to the high-risk subgroup (Fig. 6E and F). To validate the stability and robustness of these findings, we utilized additional algorithms, namely MCP-counter and ESTIMATE, which yielded consistent results (Fig. 6G and H). Furthermore, we observed substantial differences in the expression of immune checkpoints between the two subgroups (Fig. 6I).

Subsequently, we conducted single-cell sequencing analysis of the ERSRGs-signature. Cluster analysis was performed, and Fig. 7A depicts the clusters using t-distributed stochastic neighbor embedding (tSNE), where each color represents a distinct cell cluster. Each cell is represented by a point in the scatter plot, and the numbers in the figure correspond to the cluster numbers. A total of 25 distinct cell populations were identified. Figure 7B presents the annotation of clusters based on marker analysis, revealing significant differences in gene expression among different immune cells. After tSNE dimensionality reduction, the mRNA distribution of BAK1, EIF2AK3, MBTPS2, NUPR1, RHBDD2, and VCP is shown in Fig. 7C-H. Finally, we analyzed the differential expression of ERSRGs in the various immune cell clusters. Among them, BAK1 exhibited the lowest expression in immune cells, while VCP demonstrated the highest expression (Fig. 7I).

Overall, the riskscore demonstrated an inverse correlation with the level of immune infiltration, providing novel insights into the relationship between ERSRGs and the immune status of LUAD.

Fig. 6

Immune infiltration analysis of ERSRGs-signature in LUAD. (A) The GSEA enrichment analysis between high riskscore subgroup and low riskscore subgroup. Analysis of GO (B) and KEGG (C) in differentially expressed genes. (D) The correlation between ERSRGs-expression and immune infiltrates. The TIMER (E), EPIC (F), MCP-Counter (G) and ESTIMATE (H) algorithm between high and low risk subgroups. (I) The expression of immune checkpoints was compared between the low vs. high riskscore subgroups. *P < 0.05, **P < 0.01

Fig. 7

Single cell sequencing analysis of ERSRGs-signature. (A) tSNE clustering colored by groups. (B) The annotation of clusters based on marker analysis. mRNA distribution of BAK1 (C), EIF2AK3 (D), MBTPS2 (E), NUPR1 (F), RHBDD2 (G) and VCP (H) after tSNE dimensionality reduction. (I) Differential expression of ERSRGs in the different cell clusters

Validation of the expression levels of ERSRGs in LUAD

To further investigate the association between the prognostic ERSRGs-signature and LUAD, in vitro experiments were conducted using qPCR analysis on peritumoral and tumor tissues. The findings revealed a significant upregulation of BAK1 and EIF2AK3 expression in LUAD tissues, whereas NUPR1, RHBDD2, and VCP exhibited the opposite trend (Fig. 8A-F). Moreover, the Human Protein Atlas (HPA) database analysis showed higher expression levels of BAK1 and EIF2AK3 in LUAD tissues compared to normal tissues (Fig. 8G). However, NUPR1 data were unavailable in the HPA database. Therefore, to explore the protein expression of NUPR1 in LUAD patients, IHC analysis was performed at Nantong Cancer Hospital. Interestingly, the protein expression of NUPR1, as determined by IHC, exhibited a pattern opposite to that of its mRNA expression (Fig. 9A).

Fig. 8

Validation of the expression levels of ERSRGs in LUAD. The mRNA expression of BAK1 (A), EIF2AK3 (B), MBTPS2 (C), NUPR1 (D), RHBDD2 (E) and VCP (F) in LUAD patients from Nantong Cancer Hospital. N = 8, *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. (G) The protein expression of BAK1, EIF2AK3, MBTPS2, RHBDD2 and VCP in HPA

Validation of NUPR1 under experiment

In this study, we validated NUPR1 experimentally. Initially, IHC analysis was conducted on pathological specimens obtained from 6 LUAD patients. The results revealed marked accumulation of NUPR1 in cancerous tissue compared to adjacent non-cancerous tissue (Fig. 9A and B). Subsequently, both RNA and protein expression levels of NUPR1 were examined in normal lung epithelial cells and four distinct LUAD cell lines. Surprisingly, NUPR1 RNA exhibited its highest expression in normal cells (Fig. 9C), in line with our bioinformatics analysis. In contrast, NUPR1 protein displayed higher expression levels in LUAD cells (Fig. 9D and E, Fig. S2 and S3). We postulate that post-translational modifications may underlie this discrepancy. To gain deeper insight into the functional role of NUPR1 in LUAD progression, we obtained NUPR1 inhibitors and performed cell proliferation and Transwell experiments. Upon NUPR1 inhibition, both cell proliferation and invasive capacity were markedly attenuated (Fig. 9F and G). These results support a contributory role of NUPR1 protein in the progression of LUAD.

Fig. 9

Expression analysis of NUPR1 at transcription and translation levels. Representative images (A) and quantification (B) of NUPR1 in intratumoral and peritumoral fractions through immunohistochemistry staining (N = 6). mRNA (C) and protein expression (D and E) of NUPR1 in cell lines (N = 3). (F) Cell viability assessed through CCK8 assays between saline and trifluoperazine subgroups (N = 6). (G) Representative images and results of cell counting from the Transwell invasion assay (N = 3). *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001

Correlation between risk score and IC50 values for therapeutic agents

The impact of risk scores on the IC50 values of a set of 30 distinct drug molecules was systematically assessed to discern their therapeutic efficacy. Except for BI-2536 and WIKI4, all other drugs exhibited higher resistance in the high-risk group (Fig. 10 and Fig. S1). This observation underscores the potential utility of our prognostic model in guiding the use of therapeutic agents.

Fig. 10

(A-P) Therapeutic drugs showed significant IC50 differences in high- and low-risk groups

Discussion

LUAD represents the most prevalent subtype of lung cancer, a grave malignancy arising from the accumulation of various genetic mutations. These mutations lead to uncontrolled proliferation of lung cells and the subsequent formation of tumors. Upon recognition by the immune system, these transformed cancer cells elicit an immune response aimed at their elimination [21]. Nonetheless, immune escape not only expedites tumor progression but also impairs the efficacy of cancer immunotherapy [22, 23]. The ER pathway serves as a critical regulator of ER homeostasis. Disruption of ER function triggers a phenomenon referred to as “ER stress” [24]. In the context of tumorigenesis, the rapid proliferation rate of cancer cells necessitates heightened activity of ER protein folding, assembly, and transport, thereby inducing physiological stress within the ER [25]. The ER stress response is believed to confer cellular protection and is implicated in tumor growth and adaptation to challenging environments [26]. Sustained ER stress represents a novel characteristic of cancer, resulting from various metabolic and carcinogenic abnormalities that disrupt protein-folding homeostasis in aggressive immune cells. Constitutive activation of the ER stress response enables malignant cells to adapt to carcinogenesis and environmental stressors by coordinating multiple immune regulatory mechanisms and promoting malignant progression concurrently [27]. Nonetheless, the precise relationship between ER stress and the immune microenvironment remains inadequately investigated.

In our study, we initially screened 106 genes associated with ER stress to identify differential expression patterns between cancer and para-cancer samples. K-Medoids clustering was employed for this purpose. The differential genes in the two resulting clusters were primarily enriched in processes related to the adaptive immune system, humoral immune response, and regulation of humoral immune response. Notably, patients belonging to cluster 1 exhibited a significantly longer survival time compared to those in cluster 2. This discrepancy in prognosis suggests a potential correlation with immune response. Through a series of statistical analyses, including univariate regression, LASSO, and stepwise Cox regression, we identified 6 key ERSRGs. Subsequently, we constructed a novel prognostic risk signature based on the expression of these six genes (referred to as the ERSRGs-signature). This signature allowed us to classify patients with LUAD into distinct risk subgroups based on the median risk score. Importantly, a higher risk score was associated with worse prognosis for the patients.

The prognostic features of interest encompass 6 ERSRGs, specifically EIF2AK3, MBTPS2, RHBDD2, VCP, NUPR1, and BAK1. Among these, EIF2AK3, NUPR1, and RHBDD2 demonstrated protective characteristics, while MBTPS2, VCP, and BAK1 were strongly associated with poor prognosis. To assess their expression levels, qPCR analyses were conducted on cancer and para-cancer samples from 8 patients diagnosed with LUAD. The results revealed significant differential expression of EIF2AK3, RHBDD2, VCP, NUPR1, and BAK1, with NUPR1 and RHBDD2 exhibiting the most pronounced differences. EIF2AK3 has been identified as an immune-related prognostic gene in breast cancer, exerting a role in tumor cell apoptosis and facilitating sustained protective antitumor immunity [28]. MBTPS2, a membrane-embedded zinc metalloprotease, activates signaling proteins involved in transcriptional control of sterol and the ER stress response [29], thus promoting the progression of prostate cancer [30] and colorectal cancer [30]. The RHBDD2 (Rhomboid domain containing 2) gene is found to be overexpressed in advanced stages of colorectal cancer (CRC) and potentially modulates the UPR pathway, thereby favoring cell migration, adhesion, and proliferation [31]. VCP (valosin-containing protein) is crucial for maintaining mitochondrial function, and in prostate cancer cells, it employs self-aggregation to inhibit mitochondrial activity, thereby evading cell death during nutrient deprivation and promoting malignancy [32]. In a cohort study, Tao et al. demonstrated that NUPR1 serves as a protective factor in the survival prognosis of LUAD [33], while Li et al. suggested NUPR1 to be a potential risk gene [34]. NUPR1, a nuclear protein, plays a critical role in redox reactions [35], and macrophages have been implicated as the most relevant immune cells associated with NUPR1 expression in bladder cancer [36]. Furthermore, the mechanism through which BAK1 promotes cisplatin resistance in NSCLC is believed to involve the inhibition of cell apoptosis [37]. In summary, all 6 identified genes contribute to tumor development and progression by modulating pathways associated with tumor metabolism, with NUPR1 considered particularly significant.

Nuclear Protein 1 (NUPR1) is a small, highly basic transcriptional regulator involved in the regulation of diverse cellular processes, such as DNA repair, ER stress, and oxidative stress response. The cellular localization of NUPR1 appears to be associated with pathological conditions. Prominent cytoplasmic staining has been observed in large papillary tumors, tumors exhibiting lymph node metastasis, and NSCLC [38]. Our IHC analysis corroborated these findings. Intriguingly, however, our real-world cohort study revealed that, in contrast to its mRNA expression, NUPR1 protein accumulates in cancerous tissues, contributing to the malignant progression of cancer, which necessitates further investigation. Garcia Montero et al. reported that under various stress conditions, NUPR1 mRNA expression was rapidly, strongly, and transiently stimulated [39]. Cancer cells endure and adapt to various types of stressful environments over prolonged periods [40], leading us to speculate that NUPR1 mRNA may be consumed more in cancerous tissues compared to adjacent tissues. Interestingly, the protein expression of NUPR1 has also been shown to positively correlate with cell density [41]. Considering that cancer arises from unregulated and excessive cell division and proliferation, resulting in higher cell density [42], we hypothesize that NUPR1 protein expression is relatively elevated in cancerous tissue, which has a higher cell density than adjacent tissue.

To verify the broad applicability of the risk signature, we conducted validation using the external datasets GSE31210 and GSE37745. The signature exhibited robust predictive performance not only in the internal dataset but also in the validation sets. Evidence from ROC curves and K-M analysis demonstrated the predictive value of the ERSRGs-signature for the prognosis of LUAD patients. Importantly, even after stratifying by clinical features, this signature remained significantly prognostic in LUAD patients. Therefore, we propose that ER stress-related features possess excellent predictive performance for OS and could serve as independent prognostic indicators for LUAD. To facilitate clinical application, we constructed a nomogram model and verified its accuracy using calibration plots.

Previous research has highlighted the role of ER stress in promoting immune escape and facilitating metastasis [43, 44]. Subsequent GSEA, GO, and KEGG analyses of the two subgroups revealed enrichment in immune-related pathways. Notably, tumor purity has been identified as negatively correlated with immune response, suggesting its potential as an indicator of the immune response level in the tumor microenvironment [45]. To explore this further, we employed four different immune scoring algorithms, and all results consistently indicated that individuals classified as low-risk exhibited higher infiltration levels of B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and endothelial cells. The density of CD8+ T cells and mature dendritic cells has been closely associated with the survival rate of lung cancers, with higher CD8+ T cell density correlating with better 5-year survival rates [46], consistent with our findings. Additionally, we observed decreased expression of immune checkpoint genes in the high-risk group, which may be attributed to immune cell dysregulation. Therefore, our new prognostic model has the potential not only to assess the survival prognosis of LUAD but also to shed light on the immune microenvironment.

Several limitations should be acknowledged in this study. Firstly, the model primarily relies on data from the TCGA database and the Nantong cohort, thus its generalizability to other datasets may be limited. Therefore, a prospective multicenter cohort study is necessary to validate the findings and ensure their applicability to diverse populations. Secondly, in order to comprehensively elucidate the underlying reasons for the discordance between NUPR1 mRNA and protein expression levels, further evidence from additional experiments and investigations is required.

Overall, this study presents a prognostic model based on six genes associated with ER stress. The model exhibits utility in predicting the survival outcomes of patients with LUAD and offers insights into tumor immune infiltration to some extent. Furthermore, the identification of key genes provides novel insights into the molecular mechanisms underlying LUAD.