
Colorectal cancer (CRC) is a prevalent and lethal type of cancer, ranking as the third most common malignancy and the second leading cause of cancer-related death worldwide [1]. Despite the recent advancements in immunotherapy, its efficacy in treating CRC remains limited, with only a small subset of patients with mismatch repair deficiency (dMMR) or high levels of microsatellite instability (MSI-H) experiencing positive outcomes from immune checkpoint blockade (ICB) [2,3,4,5]. To address this challenge, it is crucial to gain a deeper understanding of the complex interactions between different cells and molecules in the CRC tumor microenvironment (TME) and identify new targets for immunotherapy.

Mast cells (MCs) are a type of immune cell that plays a crucial role in the body’s response to allergens and in defending against pathogens [6, 7]. Upon activation, MCs release a variety of mediators, including proteases, cytokines, histamine, and lipid mediators, which have been implicated in the development of various diseases, including allergies, asthma, autoimmune disorders, and infections [6, 8, 9]. However, despite extensive research, the role of MCs in cancer, including CRC, remains controversial [9,10,11]. While some studies suggest a pro-tumorigenic role for MCs in CRC [12,13,14,15], others report an inhibitory effect [16,17,18]. Further investigation is necessary to gain a comprehensive understanding of the role of MCs in CRC.

The rise of single-cell sequencing techniques has revolutionized biological research by providing a detailed understanding of molecular and functional heterogeneity at the level of individual cells [19, 20]. In the context of CRC, single-cell sequencing has been widely applied to study the molecular and functional profiles of immune cells in the TME, such as T cells and myeloid cells [21, 22]. However, despite the growing body of research in this area, the molecular and functional heterogeneity of MCs within the TME of CRC remains unexplored.

This study aims to fill this gap by leveraging single-cell sequencing technologies to investigate the heterogeneity of MCs in the TME of CRC. Our findings represent the first identification of MC activation in CRC, revealing potential mechanisms behind this activation and the protective role of MCs in prognosis. This study provides valuable insights into the complex interactions between cancer and the immune system, and has implications for the development of novel therapeutic strategies for CRC.



The study utilized various public datasets, including one Spatial Transcriptomics (ST) dataset, three single-cell RNA-sequencing (scRNA-seq) datasets, and 11 bulk RNA-sequencing (bulk RNA-seq) datasets (Table 1). The bulk RNA-seq datasets comprise high-throughput sequencing data from the TCGA-CRC and ten Gene Expression Omnibus (GEO) microarray datasets (GSE20842, GSE20916, GSE39582, GSE41258, GSE44076, GSE44861, GSE68468, GSE83889, GSE87211, and GSE106582). The transcriptome data and clinical information from the TCGA-CRC were obtained from UCSC Xena, while those from the TCGA pan-cancer cohort were obtained from the National Cancer Institute Cancer Research Data Commons and UCSC Xena. Transcriptome and clinical information from the GEO datasets were acquired from the GEO database.

Table 1 Sources of the ST, scRNA-seq, and bulk RNA-seq datasets

Cell culture

The mouse CRC cell line MC38 was maintained in our lab (**an, China). The mouse mast cell line P815 was obtained from pricella (Wuhan, China). Both cell lines were cultured in DMEM (Gibco, Thermo Fisher Scientific, Cambridge, MA, USA) containing 10% fetal bovine serum (Oricell; Guangzhou, China), 100 µg/ml streptomycin, and 100 U/ml penicillin (HyClone; Logan, Utah, USA), and were incubated in a humidified incubator at 37 °C with 5% CO2.

Total RNA extraction and qRT-PCR

Trizol reagent (Invitrogen, Waltham, MA, United States) was used to isolate and extract total RNA from the P815 cell line. The obtained RNA was then reverse transcribed into cDNA using the PrimeScript RT Reagent Kit (TaKaRa, Tokyo, Japan). qRT-PCR was then performed using the SYBR Premix Ex Taq II Kit (TaKaRa, Tokyo, Japan) to measure the expression levels of KIT. GAPDH was used as the internal standard. The relative mRNA expression was calculated using the 2^(−ΔΔCt) method. The primer sequences were as follows. KIT forward: 5′-GGCCTCACGAGTTCTATTTACG-3′; reverse: 5′-GGGGAGAGATTTCCCATCACAC-3′; GAPDH forward: 5′-GGTGAAGGTCGGTGTGAACG-3′; reverse: 5′-CTCGCTCCTGGAAGATGGTG-3′.
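As a minimal illustration of the relative quantification step, the following Python sketch computes a fold change with the 2^(−ΔΔCt) method. The function name and the Ct values are hypothetical and serve only to show the arithmetic; they are not measurements from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (treated vs. control) by the 2^(-ddCt) method."""
    # Normalize the target gene (e.g. KIT) to the reference gene (e.g. GAPDH)
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated condition with the control condition
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values (illustrative only, not experimental data)
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)
```

A lower Ct in the treated condition (after normalization to the reference gene) yields a fold change above 1, that is, higher relative expression.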

Western blot assay

For Western blot assays, we analyzed both human CRC tissue samples and mouse cell (P815) lysates. Human samples were obtained from ****g Hospital, and proteins were isolated using RIPA lysis buffer supplemented with a protease inhibitor cocktail. The primary antibodies used for human samples were mouse anti-human TPSAB1 (Thermo Scientific, Cat#ab2378, 1:50) and β-actin (#3700S, Cell Signaling Technology, 1:1000). In contrast, mouse cell lysates were prepared and their protein concentrations determined using a BCA kit. The primary antibodies for mouse cells were anti-KIT (#PA6364, abmart, Shanghai, China, 1:500) and anti-β-actin (#3700S, Cell Signaling Technology, 1:1000). Both human and mouse samples underwent 10% SDS-PAGE and were transferred onto either nitrocellulose or nylon membranes. After incubation at 4 °C overnight with primary antibodies, membranes were treated for one hour at 37 °C with HRP-conjugated secondary antibodies specific to either mouse IgG or rabbit IgG. Visualization and quantification of protein bands were performed using enhanced chemiluminescence and ImageJ software, respectively.

CCK-8 assay

P815 cells were pre-incubated in DMEM supplemented with different concentrations of KITLG (0 ng/ml, 50 ng/ml, or 200 ng/ml). The medium supernatant was collected for subsequent culture of MC38 cells. In a 96-well plate, 2 × 10^3 MC38 cells were seeded in each well with 100 µl of medium. At 0, 24, 48, 72, and 96 h, following the removal of the original medium, a mixture of CCK-8 solution (TransDetect Cell Counting Kit, Transgene, Bei**g, China) and fresh medium (without FBS) in a 1:9 ratio was added to each well. Subsequently, the cells were cultured at 37 °C for 3 h. Following the incubation period, the absorbance of each well was measured at 450 nm using a microplate reader (Bio-Rad, CA, USA) to determine the level of cell viability or proliferation.

In vitro migration and invasion assays

To evaluate the migration and invasion abilities of the cells, 24-well Transwells with 8 μm pore size (Corning, Inc., NY, USA) were utilized. In the top chamber, a total of 5 × 10^4 MC38 cells in 200 µl of fresh medium (without FBS) were seeded. In the lower chamber, 600 µl of medium supernatant supplemented with 20% FBS was added. For the invasion assay, the top chamber was coated with 200 mg/ml Matrigel (Corning, Inc., NY, USA) before adding 5 × 10^4 cells. After a 48-hour incubation period, the cells that invaded through the Transwell membrane were stained and quantified. The migration assay was conducted without the use of Matrigel, following the same steps as the invasion assay.

Immunofluorescence staining

Human tissue specimens were obtained from ****g Hospital with the approval of the Institutional Review Board. Paired CRC specimens were secured within 30 min after tumor resection and preserved in paraformaldehyde for 48 h. Standard procedures were employed for dehydration and paraffin embedding. The specimens were treated with 3% H2O2 for 25 min to quench endogenous peroxidase activity. To block nonspecific binding, the tissue sections were pre-incubated with 10% normal goat serum for 30 min. Subsequently, they were incubated overnight at 4 °C with primary antibodies in a humidified chamber. The primary antibodies used to validate mast cells included mouse anti-human CMA1 (AbCam, Cat# ab2377, 1:1000), mouse anti-human TPSAB1 (Thermo Scientific, Cat#ab2378, 1:8000), rabbit anti-human CPA3 (Sigma, Cat#HPA008689, 1:200), rabbit anti-human KIT (AbCam, Cat#ab283653, 1:200), and rabbit anti-human KITLG (AbCam, Cat#ab52603, 1:200). Following thorough washing, the sections were mounted with an anti-fade reagent and covered with coverslips. Fluorescence images were captured using a NIKON ECLIPSE C1 microscope and further analysis was performed using CaseViewer software.


In this study, we employed CIBERSORT [23], a computational tool, to analyze the cell type composition in bulk gene expression data. CIBERSORT estimates the relative abundance of different cell types within a sample based on the expression of specific gene markers, using a reference set of gene signatures. We utilized CIBERSORT to determine the proportion of resting and activated MCs in normal and CRC samples.


Cell composition deconvolution was conducted utilizing CIBERSORTx [24]. Our initial step was to generate a signature gene expression matrix using the CRC scRNA-seq dataset (GSE178341). We extracted the raw count matrix and cell type classifications from a subset of the Seurat object, which comprised 1,000 cells from each of two MC subsets (activated MCs and resting MCs). This raw count matrix was introduced into CIBERSORTx and subsequently normalized. The signature matrix was established with CIBERSORTx, utilizing all genes. We evaluated the proportions of activated MCs and resting MCs in each CRC sample using CIBERSORTx, based on the bulk RNA-seq data (TCGA-CRC and GSE39582). To correct for cross-platform variation in the deconvolution of the RNA-seq data, we performed batch correction using S-mode, with 100 permutations for significance analysis.
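The idea behind deconvolving a bulk sample into two MC subsets can be illustrated with a small Python sketch. Note that this sketch swaps in plain non-negative least squares for two cell types; it is not the ν-support vector regression used by CIBERSORT/CIBERSORTx, and it omits batch correction and permutation testing. The function names, the toy signature profiles, and the mixture are hypothetical.

```python
def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def two_type_fractions(sig_a, sig_b, mixture):
    """Estimate relative fractions of two cell types in a bulk profile.

    Solves min ||mixture - x*sig_a - y*sig_b||^2 subject to x, y >= 0,
    then rescales (x, y) to sum to 1. Assumes non-zero signature profiles.
    """
    aa, bb, ab = dot(sig_a, sig_a), dot(sig_b, sig_b), dot(sig_a, sig_b)
    am, bm = dot(sig_a, mixture), dot(sig_b, mixture)
    det = aa * bb - ab * ab
    candidates = []
    if det > 1e-12:
        # Unconstrained least-squares solution from the 2x2 normal equations
        x = (am * bb - bm * ab) / det
        y = (bm * aa - am * ab) / det
        if x >= 0 and y >= 0:
            candidates.append((x, y))
    if not candidates:
        # Otherwise the optimum lies on a boundary: fit each type alone
        candidates = [(max(am / aa, 0.0), 0.0), (0.0, max(bm / bb, 0.0))]

    def squared_error(coef):
        return sum((m - coef[0] * a - coef[1] * b) ** 2
                   for a, b, m in zip(sig_a, sig_b, mixture))

    x, y = min(candidates, key=squared_error)
    total = x + y
    return (x / total, y / total) if total > 0 else (0.0, 0.0)


# Hypothetical signature profiles over four genes (illustrative only)
activated_profile = [10.0, 0.0, 2.0, 1.0]
resting_profile = [0.0, 8.0, 2.0, 1.0]
bulk_sample = [7.0, 2.4, 2.0, 1.0]  # constructed as 0.7 * activated + 0.3 * resting
print(two_type_fractions(activated_profile, resting_profile, bulk_sample))
```

With real data, the signature matrix would have thousands of genes, and the estimates would be sensitive to normalization and platform differences, which is what the S-mode batch correction described above is meant to address.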

Single-cell sequencing data processing

The three single-cell datasets used in this study underwent initial quality control by the original authors, and subsequent independent analyses were performed on each dataset. The expression of all cells was normalized using the “LogNormalize” function with a scale factor of 10,000. The top 2000 highly variable genes were selected based on their mean and dispersion, and regression of percent mitochondrial content was performed during scaling of these highly variable genes using the “” option. We zero-centered and scaled each gene to unit variance before principal component analysis (PCA) to minimize potential batch effects. Linear dimensionality reduction was then performed using PCA. The “FindClusters” function was utilized for preliminary clustering and annotation, employing 50 principal components with a resolution of 0.8. The UMAP method was used for nonlinear dimensionality reduction and visualization of cell clustering. A second round of clustering was then performed to further characterize subpopulations of MCs.
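The normalization step can be sketched in a few lines of Python. This assumes the standard definition of log-normalization (counts divided by the cell total, multiplied by the scale factor, then log1p-transformed); the function name and the toy counts are hypothetical, and the actual analysis was run in Seurat.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Log-normalize the raw counts of one cell.

    Each count is divided by the cell's total count, multiplied by the
    scale factor, and transformed with the natural log of (1 + x).
    """
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale_factor) for c in counts]


# Hypothetical raw counts for one cell across three genes
print(log_normalize([1, 1, 2]))
```

Because each cell is scaled by its own total, cells with different sequencing depths become comparable before variable-gene selection and scaling.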

Expression difference analysis

To identify marker genes for each cluster or subset, we utilized the “FindAllMarkers” function in Seurat. Genes were considered differentially expressed genes (DEGs) of MCs relative to the other major cell types if they met the following criteria: log fold-change of average expression > 1, pct.1 (percentage of expressing cells among MCs) > 0.7, pct.2 (percentage of expressing cells among other cells) < 0.3, and P value < 0.01. For MCs, genes were considered upregulated DEGs in CRC if they met the following criteria: log fold-change of average expression > 0.25, pct.1 (percentage of expressing cells among MCs in CRC) > 0.25, and P value < 0.05. In analyzing expression differences between activated and resting MCs, we excluded 1,514 genes related to mitochondria, heat-shock proteins, ribosomes, and dissociation to eliminate noise and expression artifacts (Table S1).
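The marker screening criteria can be expressed as a simple filter followed by an intersection across datasets. The Python sketch below assumes marker tables with columns named as in recent Seurat output (`avg_log2FC`, `pct.1`, `pct.2`, `p_val`); the function names and all numeric values are hypothetical and only illustrate the logic.

```python
def passes_marker_criteria(row, min_log2fc=1.0, min_pct1=0.7,
                           max_pct2=0.3, max_p=0.01):
    """Apply the MC marker thresholds to one row of a marker table."""
    return (row["avg_log2FC"] > min_log2fc
            and row["pct.1"] > min_pct1
            and row["pct.2"] < max_pct2
            and row["p_val"] < max_p)


def shared_markers(tables):
    """Genes that pass the criteria in every marker table."""
    gene_sets = [{row["gene"] for row in table if passes_marker_criteria(row)}
                 for table in tables]
    return set.intersection(*gene_sets) if gene_sets else set()


# Hypothetical marker tables from two datasets (values are illustrative)
dataset_a = [
    {"gene": "TPSAB1", "avg_log2FC": 5.2, "pct.1": 0.95, "pct.2": 0.02, "p_val": 1e-50},
    {"gene": "CPA3", "avg_log2FC": 4.8, "pct.1": 0.90, "pct.2": 0.01, "p_val": 1e-40},
    {"gene": "VIM", "avg_log2FC": 1.4, "pct.1": 0.90, "pct.2": 0.80, "p_val": 1e-10},
]
dataset_b = [
    {"gene": "TPSAB1", "avg_log2FC": 4.9, "pct.1": 0.93, "pct.2": 0.03, "p_val": 1e-45},
    {"gene": "CPA3", "avg_log2FC": 4.1, "pct.1": 0.60, "pct.2": 0.01, "p_val": 1e-30},
]
print(shared_markers([dataset_a, dataset_b]))
```

Requiring a gene to pass in every dataset trades sensitivity for robustness: a gene that narrowly misses one threshold in a single dataset is dropped.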

Trajectory analysis

To delineate the developmental trajectory of various MC subsets—including activated MCs, resting MCs, and proliferating MCs in the GSE178341 dataset—we employed the “monocle” package (version 2.28.0) [25]. The DDRTree method implemented with the “reduceDimension” function of Monocle 2 was used for dimensionality reduction and construction of pseudo-temporal order.

Cell communication analysis

To investigate the interactions between MCs and other major cells, we utilized the Python-based software CellphoneDB [26]. Putative ligands and receptors were determined based on their expression on each cell. To accurately determine the extent of cell interactions, we performed a random sampling of 1,000 cells per population from the resting MCs, activated MCs, and major cell types in the GSE178341 dataset.

Defining phenotype scores

To characterize the differences between various MC subsets, we obtained phenotype scores using the “AddModuleScore” function in the “Seurat” package. These scores were calculated based on the average expression of genes related to a particular phenotype.

In this study, the MC signature was defined by the average expression levels of five MC signature genes (TPSAB1, TPSB2, CPA3, HPGDS, and MS4A2). Additionally, the MC activation signature [27] and angiogenesis signature [28] were utilized to assess the characteristics of different MC subsets (Table S2).
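A minimal Python sketch of the MC signature as defined above (the mean expression of the five signature genes in a sample) is given below. The function name and the expression values are hypothetical. This is a plain average; it does not reproduce the control-gene correction that Seurat's “AddModuleScore” applies for the single-cell phenotype scores.

```python
MC_SIGNATURE = ("TPSAB1", "TPSB2", "CPA3", "HPGDS", "MS4A2")


def signature_score(expression, genes=MC_SIGNATURE):
    """Mean expression of the signature genes present in one sample."""
    values = [expression[g] for g in genes if g in expression]
    if not values:
        raise ValueError("none of the signature genes were found in the sample")
    return sum(values) / len(values)


# Hypothetical normalized expression values for one sample
sample = {"TPSAB1": 5.0, "TPSB2": 4.0, "CPA3": 3.0,
          "HPGDS": 2.0, "MS4A2": 1.0, "EPCAM": 9.0}
print(signature_score(sample))
```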

Spatial transcriptomic analysis

Standardization of the spatial transcriptomic data from the CRC sample was performed using the “SCTransform” function in the “Seurat” package. Dimensionality reduction and clustering were conducted using “RunPCA” and “RunUMAP” (with 15 principal components and a resolution of 0.8). Following the merging of similar clusters, we identified normal, stromal, and tumor regions in the CRC sample. To evaluate the spatial distribution of the MC activation signature, we utilized the “AddModuleScore” function in the “Seurat” package.

Functional and pathway enrichment analysis

In this study, Metascape [29], a platform for gene function annotation analysis, was utilized to perform enrichment analysis of the DEGs of MCs between normal and CRC tissue, using Gene Ontology Biological Process (GO BP) gene sets.

Furthermore, the “GSVA” package (version 1.44.2) was used to perform gene set variation analysis (GSVA) between activated and resting MCs, utilizing Hallmark gene sets.

Prognosis analysis of MC Signature and MC signature genes

To assess the prognostic role of the MC signature in each cancer, we utilized univariate Cox regression and the Kaplan-Meier method. We analyzed four types of prognosis data: overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI). In the univariate Cox regression analysis, we used the continuous expression data of the MC signature. Furthermore, we performed Kaplan-Meier curve analysis using dichotomized MC signature expression levels (high vs. low), with the cutoff determined by the “surv_cutpoint” function of the R package “survminer”. We presented the results as a heatmap, including the log-rank p value and the hazard ratio (HR) with 95% confidence interval (95% CI). To perform overall survival analysis of the MC signature in two CRC cohorts, we utilized the “survival” package. Additionally, we conducted disease-free survival analysis of MC signature genes on the TCGA pan-cancer cohort using GEPIA2 [30].
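For readers unfamiliar with the Kaplan-Meier estimator, the following Python sketch shows how the survival probability is updated at each event time. It is an illustration only: the analyses in this study were performed with the R packages named above, and the function name and toy follow-up data here are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  : follow-up time of each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set
        n_at_risk -= removed
    return curve


# Hypothetical follow-up data: the second patient is censored at time 2
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Comparing two such curves (for example, high vs. low MC signature groups) is what the log-rank test formalizes.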

Statistical analysis

The statistical analysis in this study was conducted using R (version 4.2.2), GraphPad Prism (version 9), and Python (version 3.7). We utilized the Mann-Whitney U test to compare differences between two groups, and the Spearman method for correlation analysis. For survival analysis, we employed univariate Cox and log-rank methods, and a P-value of less than 0.05 was considered statistically significant.
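Both the Mann-Whitney U test and the Spearman correlation are rank-based, which the following Python sketch makes explicit. It computes only the test statistics (no p-values), uses average ranks for ties, and is not the software used in this study; the function names and the example values are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_1, group_2):
    """U statistic of the first group (rank sum minus its minimum value)."""
    ranks = average_ranks(list(group_1) + list(group_2))
    n1 = len(group_1)
    rank_sum_1 = sum(ranks[:n1])
    return rank_sum_1 - n1 * (n1 + 1) / 2


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values (illustrative only)
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
print(spearman([1, 2, 3, 4], [1, 4, 9, 16]))
```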


Identification of MC signature genes and decreased MC density in CRC

In this study, we analyzed three large CRC single-cell datasets (GSE164522, GSE178341, and 5-cohorts) comprising 341 samples and a total of 953,493 cells, identifying 8,875 MCs (Fig. 1a). We annotated major clusters based on defining marker genes and identified various cell types, including T cells (CD3D, CD3E), natural killer (NK) cells (KLRF1, GNLY), B cells (CD19, MS4A1), plasma cells (MZB1, IGHA1), MCs (TPSAB1, TPSB2, CPA3), myeloid cells (LYZ, CD68), endothelial cells (PECAM1, VWF), epithelial cells (EPCAM, AGR2), and fibroblasts (DCN, COL1A2) (Fig. 1a and b, Table S3).

Fig. 1

Identification of MC signature genes and decreased MC density in CRC.

a. UMAP plots displaying the major cell types in the GSE164522 (n = 52 samples), GSE178341 (n = 100 samples), and 5-cohorts (n = 189 samples) datasets. b. Dot plots of marker genes for each major cell type in the 5-cohorts dataset. c. Venn diagrams (center) of differentially expressed genes (DEGs) in MCs in the GSE164522, GSE178341, and 5-cohorts datasets, with the intersection showing the five MC signature genes (right). The criteria for screening DEGs are depicted within the dotted lines (left). d. Heatmap showing the expression of MC signature genes in each cancer (Normal vs. Tumor). Histogram shows the number of genes with statistical significance (upper). Red represents an increase in tumor expression, green represents a decrease in tumor expression, and only p-values < 0.05 are displayed. e. Heatmap showing the expression of the MC signature genes in 10 bulk RNA-seq cohorts of CRC (Normal vs. Tumor). Blue represents a decrease in tumor expression; dataset source (top), sample size (bottom). f. The expression of TPSAB1 in human CRC tissue and paired NC tissue by Western blotting. g. Immunofluorescence staining of human CRC tissue and paired NC tissue. TPSAB1 (pink), CMA1 (green), CPA3 (green), DAPI (blue). Bar, 200 μm. CRC, colorectal cancer; DEGs, differentially expressed genes; FC, fold change; FDR, False Discovery Rate; MC, mast cell; NC, normal colorectum or adjacent colorectum

To identify highly expressed genes that could serve as MC markers, we established strict criteria for differential gene screening: a log2 fold change > 1, a proportion of expressing cells among MCs (PCT1) > 0.7, a proportion of expressing cells among other cell types (PCT2) < 0.3, and a P-value < 0.01. Five genes (TPSAB1, TPSB2, CPA3, HPGDS, and MS4A2) were identified as common DEGs across all three single-cell datasets and defined as MC signature genes (Fig. 1c).

The MC signature genes were used as MC markers to assess the density of MCs in bulk RNA-seq samples of both tumors and normal tissues. A result was considered credible only if all five MC signature genes reached statistical significance (P < 0.05). Results showed that MC signature genes were significantly increased only in KICH, KIRC, and THCA but significantly decreased in BLCA, CESC, COAD, ESCA, LUAD, LUSC, READ, STAD, and UCEC (abbreviations of cancers are listed in Table S4), suggesting a reduction in MC density in the majority of tumors (Fig. 1d). Subsequent analyses in additional CRC cohorts verified a significant decrease in MC density in CRC (Fig. 1e). This was further corroborated by our Western blot (Fig. 1f) and immunofluorescence (Fig. 1g) results derived from paired CRC and NC samples, which also pointed to a significant decrease in MC density within CRC.

In summary, we identified five reliable MC signature genes as markers, and our findings consistently demonstrated a reduction in MC density in CRC.

Activation of MCs in CRC

Activated MCs refer to MCs that have been stimulated by external factors, and they release biologically active substances, including histamine, cytokines, proteases, and lipid mediators such as leukotrienes and prostaglandins [6, 9]. Our research on the activation of MCs in CRC began with an unexpected finding. Using the CIBERSORT algorithm, we compared immune cell ratios between CRC and normal tissues in TCGA-CRC data and observed a significant increase in the proportion of activated MCs and a decrease in the proportion of resting MCs in CRC (Fig. 2a). This finding was further confirmed by data from nine additional CRC cohorts.

Fig. 2

Activation of MCs in CRC.

a. CIBERSORT-based analysis-generated heatmap showing the difference in the proportion of activated and resting MCs between NC and CRC in 10 bulk RNA-seq cohorts of CRC. Red indicates a higher proportion in CRC, while blue indicates a lower proportion in CRC. b. DEGs between NC MCs and CRC MCs from the GSE164522, GSE178341, and 5-cohort datasets (top). Venn diagram of upregulated DEGs in CRC MCs (bottom). Screening criteria are indicated by the dotted lines. c. Gene Ontology Biological Process (GO BP) enrichment analysis of upregulated DEGs in CRC MCs. d. Heatmap showing the expression of cytokine and growth factor, protease and histamine, lipid mediator, and various receptor-related genes across different major cell types (5-cohorts). e. Heatmap showing the expression of cytokine and growth factor, protease and histamine, lipid mediator, and various receptor-related genes in MCs (NC vs. CRC) (5-cohorts). The tissue type is indicated by the color above the heatmap. f. Heatmap showing the expression of cytokine and growth factor, protease and histamine, lipid mediator, and various receptor-related genes in MCs (NC vs. CRC) (GSE178341)

To gain a deeper understanding of the changes in MCs during the tumorigenesis of CRC, we compared gene expression differences between MCs in CRC tissue and normal tissue in three single-cell datasets (GSE164522, GSE178341, and 5-cohorts) (Fig. 2b and Table S5). Genes that met the criteria of log2 fold change > 0.25, a proportion of expressing cells among CRC MCs (PCT1) > 0.25, and a P-value < 0.05 were defined as DEGs. In all three datasets, the number of DEGs enriched in CRC MCs was markedly higher than the number enriched in normal-tissue MCs, indicating a widespread increase in gene expression in MCs during the progression of CRC. Notably, TMEM176B and CD52 were among the top 5 significant genes in all three datasets, with CD52 being reported as a marker of neoplastic MCs in patients with advanced systemic mastocytosis [31]. To study the functional changes in MCs during the progression from normal tissue to CRC, we used DEGs (n = 268) that were increased in two or more datasets for enrichment analysis. The results of the GO analysis showed that pathways related to cell activation were the most significantly enriched in MCs in CRC (Fig. 2c).

To better understand the activation features of MCs during CRC progression, we analyzed the expression of receptor and mediator genes related to MCs in the 5-cohorts dataset (Fig. 2d). The results showed that MCs were the main population that expressed the receptor for IL-33 (IL1RL1) and KIT, with the highest expression levels of MRGPRX2, CSF2RB, and AHR. Notably, MCs were also the only cell population that expressed all three subunits of the high-affinity IgE receptor FcεRI (i.e., FCER1G, FCER1A, and MS4A2). In addition, MCs showed high expression of signature proteases, including TPSAB1, TPSB2, CPA3, CMA1, and CTSG, with CTSD and CTSW being enriched in MCs but not limited to them. Moreover, MCs displayed high expression of genes involved in histamine biosynthesis (HDC), leukotriene biosynthesis (LTC4S, ALOX5AP, and ALOX5), and prostaglandin biosynthesis (HPGDS, PTGS1, PTGS2). MCs were the only cell population that expressed mRNA encoding diverse cytokines, chemokines, and growth factors, including IL4, IL5, IL9, IL13, CCL1, LIF, CSF1, and AREG. MCs also showed high expression of IL18, VEGFA, and TGFB1, although expression of these genes was not restricted to MCs.

The expression of MC receptors and mediators in CRC MCs and normal MCs was compared in the 5-cohorts (Fig. 2e) and GSE178341 (Fig. 2f) datasets. The results showed that most genes were significantly upregulated in CRC MCs, indicating that CRC MCs have a more activated MC phenotype compared to normal MCs. The exception was CMA1, which encodes chymase and was significantly more highly expressed in normal tissue.

In summary, the evidence indicates activation of MCs in CRC.

Heterogeneity of MCs in CRC

The advent of single-cell analysis has enabled the characterization of MC activation during CRC from the perspective of MC heterogeneity. In the GSE178341 cohort, analysis of 4,155 MCs led to the identification of 12 clusters corresponding to 4 distinct MC subtypes (Fig. S1a). The MC11 and MC12 clusters, enriched in MZB1 and CD3D, were considered B cell doublets and T cell doublets, respectively, while the MC09 and MC10 clusters, enriched in interferon-related genes (IFITs) and mitochondrial genes (MTs), respectively, were grouped as “Other MCs”. The MC08 cluster, enriched in proliferation-related genes such as MKI67, was named “proliferating MCs”. Based on the overall expression levels of MC receptor and mediator genes, the MC01-04 clusters, with relatively high expression, were named “activated MCs”, while the MC05-07 clusters were named “resting MCs” (Fig. 3a and Fig. S1a). The significant enrichment of the MC activation signature in activated MCs compared to resting MCs further supported the naming of these MC clusters (Fig. 3b). The proportion of activated MCs was significantly higher in CRC, while the proportion of resting MCs was significantly higher in normal tissue (Fig. 3a and c), indicating that the activation of MCs in CRC is due to a higher proportion of activated MCs.

Fig. 3

Heterogeneity of MCs in CRC.

a. UMAP plots of 4,155 MCs colored by cluster (left) and tissue type (center) in the GSE178341 dataset. Bar charts show the proportion of MC subtypes in different tissues (right). b. UMAP plots of the MC activation signature. c. Log ratio of average fraction per MC cluster in tumor to normal tissue (top). Mann-Whitney U test, *: p < 0.05, **: p < 0.01, ***: p < 0.001. Dot plots of cytokine and growth factor, protease and histamine, lipid mediator and various receptor-related gene expression in MC clusters (bottom). d. Volcano plot of differentially expressed genes between resting and activated MCs. e. Differentially enriched pathways in resting and activated MCs by GSVA, showing the top 10 significantly enriched hallmark terms. f. Differentiation trajectory of MCs, color coded by MC subset (left), tissue type (center), and pseudotime (right). g. Pseudotime trajectory of CMA1, CPA3, TPSAB1, and KIT expression levels

In the comparison of activated MCs and resting MCs, most MC receptor and mediator-related genes, including the five MC signature genes (TPSAB1, TPSB2, CPA3, HPGDS, and MS4A2), were enriched in activated MCs, while CMA1 was enriched in resting MCs (Fig. 3d and Table S6). GSVA analysis revealed that TNFA signaling via NF-κB was most significantly enriched in activated MCs, whereas the angiogenesis-related pathway was enriched in resting MCs (Fig. 3e). This finding was supported by the angiogenesis score, which confirmed the stronger angiogenesis feature of resting MCs compared to activated MCs (Fig. S1b). Additionally, the enrichment of MHC-I and MHC-II related genes in activated MCs indicated that activated MCs have a stronger antigen-presenting function (Fig. S1c). The heterogeneity observed in MCs within CRC was also corroborated in the 5-cohorts dataset (Fig. S2a-d).

In addition, our pseudo-temporal analysis using Monocle 2 further supported a transition from resting to activated MCs (Fig. 3f), during which we also observed a decline in CMA1 expression (Fig. 3g).

High MC signature associated with favorable outcome in tumors

The impact of MCs on cancer prognosis remains controversial [9,10,11]. Based on the Kaplan-Meier analysis of TCGA pan-cancer data, we found that high expression of the five MC signature genes was associated with better disease-free survival (Fig. 4a). In addition, we used eight prognostic indicators based on univariate Cox regression and Kaplan-Meier models for OS, DSS, DFI, and PFI to evaluate the impact of the MC signature on prognosis in different cancers. A result was considered reliable if statistical significance (p < 0.05) was reached in at least four indicators. The results showed that the MC signature had a significant protective effect in 10 cancer types (ACC, CESC, CHOL, HNSC, KIRC, KIRP, LIHC, LUAD, PRAD, and SARC) and was associated with poor prognosis only in STAD (Fig. 4b). Kaplan-Meier analysis of the TCGA-CRC (log-rank, p = 0.019) and GSE39582 (log-rank, p = 0.029) cohorts also indicated that a high MC signature was associated with better overall survival in CRC (Fig. 4c and d, Fig. S3a, and Fig. S3b).

Fig. 4

MC signature predicts better prognosis.

a. Kaplan-Meier disease-free survival curves grouped by MC signature genes (TPSAB1, TPSB2, CPA3, HPGDS, and MS4A2) in pan-cancer. b. Summary of the correlation between the expression of MC signature and overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI) based on univariate Cox regression and Kaplan-Meier models. Red indicates factors that are detrimental to the prognosis of cancer patients, while green represents protective factors. Only p-values < 0.05 are displayed. c. The Kaplan-Meier overall survival curve of the MC signature in TCGA-CRC is shown, with the High-MC group and Low-MC group including patients with CRC who had MC signature expression in the top 30% and bottom 30%, respectively. d. Kaplan-Meier overall survival curve of MC signature in GSE39582

Additionally, we employed CIBERSORTx to delve into the influence of MC phenotypes on prognosis and observed a heightened proportion of activated MCs in CRC samples compared to normal tissues, concomitant with a reduction in resting MCs (Fig. S3c). However, no statistically significant differences were discovered concerning the proportions of activated and resting MCs calculated via CIBERSORTx in relation to CRC prognosis (Fig. S3d and Fig. S3e).

These findings demonstrate the important role of MCs in the prognosis of CRC patients and their potential as a protective prognostic biomarker.

KITLG/KIT signaling in MC activation and CRC inhibition

In this section, we aimed to identify the potential causes of MC activation in the CRC TME. Using the GSE178341 dataset, we performed CellphoneDB analysis and found that activated MCs had a higher number of interactions with other cell types compared to resting MCs. The prediction results also showed that activated MCs had the highest number of interacting receptors with myeloid cells, endothelial cells, fibroblasts, and epithelial cells (Fig. 5a).

Fig. 5

KITLG/KIT signaling in MC activation and CRC inhibition.

a. Heatmap generated by CellphoneDB analysis showing the potential ligand-receptor interactions between resting and activated MCs and other major cell types in CRC (GSE178341). Numbers indicate the number of potential ligand-receptor pairs. b. Dot plots of interactions between resting and activated MCs and other major cell types along the IL33-IL1RL1 and KITLG-KIT axes. c. Dot plots displaying the expression of IL1RL1, IL33, KIT, and KITLG in different major cell types (5-cohorts) (left), and UMAP plots displaying the expression of IL33 and IL1RL1 (right upper), and KITLG and KIT (right bottom). d. Dot plots showing the expression of IL1RL1, IL33, KIT, and KITLG in different major cell types (GSE178341). e. Bar plots comparing the positive rate of KITLG expression between normal tissue and CRC in fibroblasts (left) and endothelial cells (center). Dot plots showing the expression of KITLG in different endothelial cell subsets (right). f. Correlation analysis of KITLG and KIT expression in TCGA-CRC (Spearman test). g. Correlation analysis of KITLG and KIT expression in GSE39582 (Spearman test). h. qRT-PCR analysis shows KIT mRNA expression level in P815 after manipulating different concentrations of KITLG protein. i. Western blot analysis shows KIT protein expression level in P815 after manipulating different concentrations of KITLG protein. j. CCK-8 assay comparing the proliferative capacity of CRC cells when exposed to medium with only different KITLG concentrations (left), compared to medium from p815 coculture with varying KITLG concentrations (right). Optical density (OD) was monitored daily for a 5-day period. k. Transwell analysis showing the impact on CRC cell migration and invasion when exposed to medium with only different KITLG concentrations (left), compared to medium from p815 coculture with varying KITLG concentrations (right). All data are shown as the mean ± SD. **: p < 0.01, ***: p < 0.001

Previous studies have shown that KITLG (SCF, stem cell factor) [

Fig. 7

Diagram of MC Activation in CRC.

a. Compared to normal tissue, the overall density of MCs decreases in CRC, but the phenotype of MCs changes from resting MCs with high CMA1 expression to activated MCs with high expression of TPSAB1, CPA3, and KIT. b. KITLG (SCF) expressed by fibroblasts and endothelial cells in the stromal region increases in the TME, which may promote MC activation through the KITLG-KIT axis and thereby suppress tumor progression