Introduction

As one of the most common primary malignancies, liver cancer has become one of the top five causes of cancer-related death worldwide according to the World Health Organization [1]. Approximately 90% of liver cancer patients die from hepatocellular carcinoma (HCC), which is the most common pathological type [2]. Surgery remains the most effective treatment for HCC. However, because of the insidious onset and rapid progression of HCC, patients frequently miss the opportunity for surgery owing to delayed medical consultation [3]. Moreover, HCC patients often face a high prevalence of chemotherapy resistance, distant metastasis, and recurrence, which results in an unfavorable prognosis [3, 4]. Therefore, it is vital to investigate the mechanisms underlying HCC occurrence and development in depth, so that new and promising targets for the diagnosis and treatment of HCC can be found.

The endoplasmic reticulum (ER) is involved in lipid and carbohydrate metabolism and calcium storage [5, 6]. Moreover, as the largest organelle in eukaryotic cells, the ER is mainly responsible for the synthesis, transport, and folding of proteins [5, 6]. Endoplasmic reticulum stress (ERS) refers to the disruption of protein folding in the ER under pathological or physiological stimuli, such as oncogene activation, oxidative stress, hypoxia, and infection [7, 8]. ERS triggers the three main pathways of the unfolded protein response (UPR), mediated by PRKR-like ER kinase (PERK), activating transcription factor 6 (ATF6), and inositol-requiring enzyme 1 (IRE1), which relieve the load of unfolded proteins and maintain cell homeostasis and function [9]. UPR pathways are activated in most cancer types because protein synthesis increases dramatically during the rapid proliferation of tumor cells [10, 11]. As the initiating factor of the UPR, ERS plays a crucial role in the therapeutic response and prognosis of cancer. At the beginning of chemotherapy, drugs cause nutrient deprivation and hypoxia in tumor cells, which lead to ERS followed by the UPR [12, 13]. Once the UPR is activated, tumor cells release pro-survival components, including cytokines and growth factors, which induce cancer cell growth and proliferation and suppress the anti-tumor immune response [14, 15]. When HCC-bearing mice were treated with an IRE1α inhibitor, reductions in tumor burden and collagen accumulation were reported, which indicates that regulating ERS and the UPR may be an effective way to counter drug resistance in HCC.

In our study, machine learning techniques such as the Random Forest (RF) and Support Vector Machine (SVM) algorithms were applied to screen for key genes associated with HCC. These genes were then integrated by an artificial neural network to construct an ERS-related HCC diagnostic model. On the training set and three validation sets, the diagnostic model exhibited satisfactory predictive performance. We also conducted a comprehensive analysis of the expression levels, immune infiltration, methylation, and mutation status of ERS-related genes (ERSRGs). Our research offers a novel perspective on the molecular mechanisms of HCC and identifies potential targets for developing new diagnostic and therapeutic strategies for HCC.

Methods

Data sources used for analysis

We first integrated gene expression matrices from GSE25097, GSE62232, and GSE65372, analyzed the gene expression differences between normal and liver cancer tissues, and conducted functional enrichment analysis. By intersecting the differentially expressed genes with genes related to endoplasmic reticulum stress, and applying two machine learning methods, six candidate biomarkers were identified: SRPX, THBS4, CTH, PPP1R16A, CLGN, and THBS1. Based on these genes, an artificial neural network (ANN) algorithm was used to construct a diagnostic model. The diagnostic performance of these candidate genes was then validated in three independent validation sets (GSE121248, GSE45267, and GSE84005). Moreover, molecular docking was employed to screen potential target drugs, and the immune cell infiltration rate, methylation level, and mutation rate of the marker genes were assessed. PPP1R16A exhibited a high copy mutation rate and was significantly correlated with the level of immune cell infiltration. To further establish PPP1R16A as a core gene in the endoplasmic reticulum stress model, single-cell sequencing and cell communication analyses were conducted to study its expression and distribution patterns in the tumor microenvironment. Finally, the biological function of the PPP1R16A gene was validated through in vitro experiments. The overall design of this study is illustrated in Fig. 1.

Fig. 1 The overall flow of this study

Data collection and preprocessing

Transcriptome data and clinical information of HCC patients and normal tissue donors were obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). Three datasets were used: GSE25097 (249 normal tissues and 268 tumor tissues), GSE62232 (10 normal tissues and 81 tumor tissues), and GSE65372 (15 normal tissues and 39 tumor tissues). These datasets were merged, and repeated tissues were removed, using the “sva” R package. A total of 662 samples were obtained, and the expression matrix of 14,738 genes was used as the training set. The validation sets consisted of three datasets: GSE121248 (37 normal tissues and 70 tumor tissues), GSE45267 (39 normal tissues and 48 tumor tissues), and GSE84005 (38 normal tissues and 38 tumor tissues). All data were standardized and log-transformed using the “limma” R package for subsequent analysis [16].
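As a rough illustration of the log-transformation and standardization step, the following Python sketch applies a log2(x + 1) transform and then z-scores each gene across samples. The pseudocount of 1, the per-gene z-score, the function names, and the toy matrix are all assumptions made for illustration; the study itself used the “limma” R package, whose exact settings are not stated here.

```python
import math
from statistics import mean, pstdev


def log2_transform(matrix, pseudocount=1.0):
    """Apply log2(x + pseudocount) to every value (rows = genes, columns = samples)."""
    return [[math.log2(value + pseudocount) for value in row] for row in matrix]


def standardize_genes(matrix):
    """Z-score each gene (row) across samples; constant genes are set to 0."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = pstdev(row)
        scaled.append([(value - mu) / sd if sd > 0 else 0.0 for value in row])
    return scaled


# Hypothetical toy expression matrix: 2 genes x 4 samples (not data from the study).
raw = [
    [0.0, 3.0, 7.0, 15.0],
    [1.0, 1.0, 1.0, 1.0],
]
logged = log2_transform(raw)
scaled = standardize_genes(logged)
```

After this step every non-constant gene has mean 0 and unit variance across samples, which puts genes with very different expression ranges on a comparable scale before model fitting.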

Identification of differentially expressed genes and ERSRGs

The “limma” R package was used to detect differentially expressed genes (DEGs) between HCC and normal tissues in the training set, with |log2 fold change (FC)| > 1.5 and adjusted p < 0.05 as the cutoff values. The volcano plot and heat map showing differential gene expression between HCC and normal tissues were drawn using the “ggplot2” and “heatmap” R packages. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analyses of the DEGs were conducted using the “clusterProfiler”, “enrichplot”, “limma”, “ggplot2”, and “org.Hs.eg.db” R packages. The hallmark gene set “h.all.v7.4.symbols.gmt” was downloaded from MSigDB (https://www.gsea-msigdb.org/) [17] and used for gene set variation analysis (GSVA), with p < 0.05 and false discovery rate (FDR) < 0.25 as thresholds. In addition, 15 hallmark gene sets comprising 312 ERSRGs were downloaded from the MSigDB database.
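The DEG cutoff can be illustrated with a minimal Python sketch. It assumes that the adjusted p-value is a Benjamini-Hochberg adjusted value (the text only says "adjusted p"), and the gene names, fold changes, and p-values are invented toy inputs; in the study these statistics came from “limma”.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, fc_cutoff=1.5, alpha=0.05):
    """Keep genes with |log2FC| > fc_cutoff and adjusted p < alpha."""
    adjusted = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, fc, q in zip(genes, log2fc, adjusted)
        if abs(fc) > fc_cutoff and q < alpha
    ]


# Hypothetical toy statistics (assumed values, not results from the study).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2fc = [2.1, -1.8, 0.4, 1.6]
pvalues = [0.001, 0.004, 0.03, 0.2]

adjusted = benjamini_hochberg(pvalues)
degs = select_degs(genes, log2fc, pvalues)
```

In this toy input, GENE_C passes the significance threshold but not the fold-change threshold, and GENE_D passes the fold-change threshold but not the significance threshold, so both filters are needed to define a DEG.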

Construction and validation of an artificial neural network (ANN) prediction model using artificial intelligence algorithms

With the “randomForest” R package, we established the RF model using fivefold cross-validation to tune the number of variables tried at each split and the number of trees. The residual error was minimal when the number of trees was 125. We ranked genes by Gini coefficient score, and genes with a score > 20 were finally selected [18, 24, 25].

However, the biological mechanisms underlying ERSRGs remain unclear, and their impact on HCC warrants further exploration.

In this study, we conducted a comprehensive analysis of transcriptomic data from liver cancer and normal tissues. We identified the DEGs between the two tissue types and characterized their functional and molecular pathway enrichment. Based on expression profiling of the training set, six endoplasmic reticulum stress-related genes (ERSRGs) were identified using the RF and SVM algorithms. An ANN prediction model was then constructed and demonstrated effective predictive performance. The model was further validated on three independent test sets, which confirmed its predictive capability. We also studied in depth the association of these genes with tumorigenesis and immunomodulation.

The current ERSRG signature comprises six candidate genes (SRPX, THBS4, CTH, PPP1R16A, CLGN, and THBS1). Previous studies have elucidated significant roles for some of these genes in various tumors. Cystathionine γ-lyase, encoded by the CTH gene, plays a crucial role in the cysteine sulfur metabolism pathway, catalyzing the generation of hydrogen sulfide (H2S), L-cysteine, α-ketobutyrate, and ammonia [26]. Several studies have indicated that aberrant activation of the CTH/H2S signaling pathway is closely linked to the occurrence and progression of HCC [27]. X-rays activate p38 mitogen-activated protein kinase, which in turn activates the CTH/H2S signaling pathway, inducing epithelial-mesenchymal transition and promoting invasion of liver cancer cells [28]. Recent research has also reported that FOXC1, by regulating CTH, inhibits cysteine metabolism, increases reactive oxygen species levels, and promotes tumorigenesis. Overexpression of CTH significantly inhibits the proliferation, invasion, and metastasis of liver cancer cells induced by FOXC1 [29]. CTH, when normally regulated, therefore presents a potential therapeutic target that counteracts FOXC1. Furthermore, Sushi repeat-containing protein X-linked (SRPX), identified through mRNA expression network analysis, has been proposed as a potential therapeutic target in HCC and has been shown to suppress cancer cell stemness [30]. SRPX also regulates the migration and invasion of ovarian cancer through the Ras homolog family member A signaling pathway [31]. Thrombospondin-1 (THBS1), known for inhibiting angiogenesis, has been studied for its potential as a therapeutic target [32]. THBS1 promotes the progression and development of various cancers by regulating angiogenesis and tumor vascular perfusion [33]. Additionally, THBS1 modulates innate and adaptive immune cells through CD47 signaling, thereby restricting anti-tumor immunity [34]. Overexpression of THBS4 promotes the proliferation and migration of liver cancer cells, participates in the regulation of epithelial-mesenchymal transition, and interacts with members of the integrin family to modulate the FAK/PI3K/AKT pathway [35]. miR-142 is highly correlated with THBS4 overexpression in HCC tissue samples and regulates THBS4 expression in HCC cells [36]. PPP1R16A encodes the membrane-associated subunit of protein phosphatase 1, which is located on the plasma membrane as a CAR-binding protein [37]. The area under the ROC curve for PPP1R16A in global and initial-stage tumors was 0.82 and 0.76, respectively, showing good sensitivity and specificity for the diagnosis of endometrial carcinoma [38]. However, the role of PPP1R16A in HCC diagnosis and prognosis remains largely unknown and requires further exploration. A recent study reported that upregulation of CLGN in HCC is significantly related to poor prognosis, especially in advanced stages, and might be regulated by miR-194-3p, thus providing a potential therapeutic target and prognostic predictor in HCC [39].

In the cellular communication analysis, we found that the expression levels of these ERSRGs were closely associated with immune cell infiltration and the activity of immune-related pathways. Single-cell sequencing revealed that high expression of PPP1R16A in the liver parenchyma may be a trigger for high-copy mutations. Given that MIF acts as a macrophage stimulator, we speculate from the cell communication results that PPP1R16A-expressing cells may promote macrophage aggregation through the MIF pathway, thereby inducing M2 polarization in liver cancer. A recent study suggests that a novel ERSRG signature could be an independent prognostic factor for HCC [40]. ERS regulates immune activity by acting on myeloid cells, mainly macrophages, which is related to tumor immune evasion and chemoresistance. ER-stressed HCC cells release exosomes that upregulate the expression of PD-L1 in macrophages and consequently suppress T-cell function. A higher density of infiltrating macrophages in the liver has been associated with enhanced tumor aggressiveness and unfavorable prognosis among patients with HCC [11].

In recent years, immune checkpoint inhibitors (ICIs), a new therapeutic approach targeting T-cell regulatory pathways, have attracted much attention [41, 42] and show great promise in anti-tumor therapy. In the ssGSEA results, THBS1 and SRPX showed significant positive correlations with immune cell infiltration, neutrophils, helper T cells, TILs, CCR, and inflammatory response-related pathways. In contrast, THBS4, PPP1R16A, and CLGN exhibited significant negative correlations with immune cell infiltration, neutrophils, helper T cells, TILs, CCR, immune checkpoints, T cell co-inhibition, and other immune-related pathways. These findings suggest that ERSRGs are closely related to the immune status of liver cancer and offer a new research direction for combined targeting of ERSRGs and ICIs in its treatment. Such combination therapy has the potential to enhance anti-tumor immune responses and improve the prognosis of liver cancer.

The cellular communication analysis also showed that fibroblasts, which are essential factors promoting tumor metastasis, may reshape the tumor microenvironment by enhancing the collagen pathway and promoting collagen deposition, thereby affecting the function of PPP1R16A-expressing cells. These results further indicate that PPP1R16A may influence the prognosis of HCC by regulating the tumor immune microenvironment. Additionally, our experimental results suggest that knocking out PPP1R16A can inhibit the proliferation, invasion, and migration of HCC cells, indicating that PPP1R16A may be a crucial tumor-promoting factor. Cancer-associated fibroblasts produce collagen and remodel the extracellular matrix, which is an important mechanism of tumor metastasis. Targeting specific signaling molecules responsible for crosstalk between cancer-associated fibroblasts and tumor cells is considered a promising approach to modulating HCC metastasis [43, 44]. Our study indicates that PPP1R16A may be one such potential target.

However, several limitations of this study need to be addressed. First, because the bioinformatics analysis was based on public cancer databases, the diagnostic and predictive performance of the ERSRG markers must be further validated in large-scale, prospective clinical trials, and their potential clinical applications assessed. This will help ensure the reliability and reproducibility of the results and provide a more solid foundation for the clinical application of ERSRGs in liver cancer patients. Second, although some cell-based experimental validation supported the preliminary findings, further in vivo and in vitro experiments are needed to thoroughly investigate the functions of ERSRGs in HCC. Such experiments will contribute to a more comprehensive understanding of the exact mechanisms by which these genes act in the development and progression of HCC, and of the potential efficacy and mechanisms of targeting ERSRGs in combination with ICIs in liver cancer. Future research should therefore adopt broader experimental designs to reveal the role of ERSRGs in HCC biology more systematically.

Conclusions

In this study, we integrated six ERSRGs, identified using the RF and SVM algorithms, into an ANN prediction model. We further investigated the biological mechanisms, immune regulation, and genomic mutations associated with these six ERSRGs in the diagnosis of liver cancer. The comprehensive analysis of ERSRGs provides a powerful tool for prognosis prediction and personalized treatment of liver cancer patients. The ERSRG-based feature model holds promise as a prediction and therapeutic decision support system in liver cancer research.