Background

Lymphomas are malignant lymphoid tumors that arise from the clonal proliferation of lymphocytes and are classified as non-Hodgkin lymphomas (NHLs) or Hodgkin lymphomas. Ocular adnexal lymphoma (OAL) is a rare form of malignant lymphoid proliferation that constitutes 1–2% of NHLs. OALs arise in the conjunctiva, eyelids, and orbit, including the lacrimal gland [1, 2]. Most OALs are B-cell lymphomas. Extranodal marginal zone B-cell lymphoma (EMZL) is the most frequent subtype of OABL (55–69%), followed by diffuse large B-cell lymphoma (DLBCL) (10–15%) [3,4,5,6].

While gene expression profiling has led to landmark discoveries in NHLs, few studies have examined ocular adnexal B-cell lymphomas (OABLs) [7,8,9]. Furthermore, defining the biology of NHLs solely based on the transcriptome is challenging. By combining proteomic and transcriptomic data, proteotranscriptome-based studies have revealed novel insights into the development and progression of malignancies [10, 11], with findings that cannot be revealed by mRNA-based studies. By investigating the tandem mass tag (TMT)-labeled, mass spectrometry (MS)-based quantitative proteome together with the transcriptome, we provided an integrated gene expression landscape of OABL, revealed the global protein-mRNA concordance as a novel prognostic-related disease characteristic, and identified a novel pathology diagnostic marker.

Our analysis also revealed the importance of the alternative splicing pathway in OABL. Alternative splicing is a post-transcriptional gene regulation mechanism that contributes to protein diversity [12]. Dysregulation of alternative splicing has been shown to contribute to the development and progression of various types of malignancy [13]. While some studies have identified mutations of splicing factors (SFs) in mantle cell lymphomas (MCLs), alternative splicing in NHLs has not been well studied [14]. We provided a landscape of alternative splicing events (ASEs) as well as their potential biological implications in OABL and further demonstrated the oncogenic nature of the splicing regulator ADAR in OABL.

Methods

Patient selection and ethical approval

We reviewed our medical records database to identify patients confirmed by surgical biopsy at the Department of Ophthalmology, Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine from January 2016 to February 2020. The inclusion criteria were as follows: (1) histologically confirmed diagnosis of OABL, idiopathic orbital inflammation (IOI), or reactive lymphoid hyperplasia (RLH), or orbital plastic surgery performed for aesthetic reasons; (2) availability of clinical and laboratory information at the time of diagnosis; and (3) specimen storage at −80 °C. Clinical data were obtained from medical records. IOI, RLH, and normal specimens were defined as controls. IOIs and RLHs were grouped as “inflammation” in the subgroup analysis (Supplementary Table S1).

The study protocol was approved by the institutional review board of Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine (protocol SH9H-2019-T185–2). Informed consent was obtained from all patients enrolled in the proteomic and transcriptomic analysis. The clinical characteristics of these patients (40 OABL patients and 31 controls) are summarized in Supplementary Table S2.

Protein sample preparation and sequencing

Proteomic and transcriptomic data were generated from 71 samples from the above-mentioned patients. The pathological sections were reviewed by three pathologists to validate the diagnosis before sequencing. All specimens were stored at −80 °C until protein and RNA isolation, and sequencing was performed by Beijing CapitalBio Technology Inc.

The experimental steps are described in the Supplementary Methods. Briefly, specimens were lysed using protein extraction buffer (8 M urea, 0.1% SDS) containing 1 mM phenylmethylsulfonyl fluoride (Beyotime Biotechnology, China) and protease inhibitor cocktail (Roche, USA). Tandem mass tags TMTpro (Pierce, USA) with different reporter ions (126–131 Da) were applied as isobaric tags for relative quantification, and TMT labeling was performed following the manufacturer’s instructions. The MS analysis was conducted using a Q Exactive mass spectrometer (Thermo Scientific, USA). Proteome Discoverer software (version 1.4) (Thermo Scientific, USA) was used to perform database searching against the RefSeq database. The results were filtered to retain high-confidence peptides with a global FDR < 1% based on a target-decoy approach. The proteomic data have been uploaded to the iProX database (https://www.iprox.org; project ID IPX0004253000).

RNA sample preparation and sequencing

RNA samples were prepared using TRIzol reagent (Ambion, 15596-026) following the manufacturer’s protocol. The poly-A containing mRNA molecules were purified from RNA using poly-T oligo-attached magnetic beads. The fragments were reverse transcribed into first strand cDNA using random hexamers, followed by second strand cDNA synthesis using DNA polymerase I and RNase H. PCR was used to selectively enrich DNA fragments with adapter molecules on both ends and to amplify the amount of DNA in the library. The library was qualified using the Agilent 2100 Bioanalyzer and quantified by Qubit and qPCR. The produced libraries were sequenced on the Illumina NovaSeq 6000 platform. Reads were aligned to hg38. The RNA-seq data have been deposited in the Gene Expression Omnibus database (https://www.ncbi.nlm.nih.gov/geo) under accession numbers GSE171059 and GSE199517.

AASE identification and analysis

The aberrant alternative splicing events (AASEs) in OABLs were identified using rMATS [15]. All the sequences and annotations used in this analysis were based on the GRCh38 genome assembly. An ASE with a ΔInclevel value between the OABLs and controls of more than 5% (|ΔInclevel| > 0.05) and an adjusted p-value of < 0.01 was identified as an AASE. The list of annotated splicing factors and regulators was downloaded from the SpliceAid-F database and the study by Nostrand et al. [16, 5). Among these, 763 gene sets were significantly dysregulated in at least one type of omics (FDR < 0.2), and 725 were concordantly dysregulated and 38 were discordantly regulated (Fig. 1D). Arranged by the sum of NES rank in each omic, CO-UP gene sets were mainly enriched in mRNA processing and splicing, DNA damage and repair, and protein sumoylation. CO-DOWN gene sets were mainly enriched in normal tissue development and organization, and aerobic glucose metabolism (Fig. 1E). As our study contained multiple subtypes of OABL, we performed GSVA in the proteomic data and examined the robustness of the dysregulations (Supplementary Fig. S2A, Supplementary Table S6, Supplementary Methods). The variations of these top-ranked gene sets were consistent among all four subtypes. For discordantly dysregulated genes and pathways, immune, Golgi traffic and amide metabolism related gene sets were identified by DEG enrichment and GSEA analyses (Supplementary Fig. S2B, C).

Global protein-mRNA concordance is a distant recurrence-related characteristic of OABL

Next, we examined the relationship between protein and mRNA abundance and its association with disease characteristics. Global protein-mRNA concordance was computed for each patient as the Spearman correlation between all paired protein and mRNA abundances (Fig. 2A). We analyzed this concordance in patients with both proteomic and transcriptomic data (Fig. 1A). Considering the different transcript/protein abundance distribution between OABLs and controls, we analyzed protein-mRNA pairs separately in these two groups. We identified 3818 protein-mRNA pairs in OABLs and 3728 pairs in controls. The concordance was significantly higher in OABLs (median rho = 0.364) than in controls (median rho = 0.208, p = 0.01, Fig. 2B-C). Considering the association between inflammation and lymphoma [25,26,27,28], we compared the global concordance across subgroup specimens. The results showed that OABLs exhibited a higher median concordance than the other groups (median rho of RLH = 0.254, IOI = 0.23, normal = 0.16). These data indicated that the increased correlation between protein and mRNA abundance is a disease-associated characteristic of OABL.
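As an illustration of the per-patient concordance described above, the following minimal sketch computes a Spearman rho over the genes quantified in both omics layers for one patient. The function names, the dictionary layout, and the gene labels are hypothetical and are not taken from the study pipeline; ties are handled with average ranks, which is an assumption.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def global_concordance(protein, mrna):
    """Spearman rho across genes present in both dicts (gene -> abundance)."""
    shared = sorted(set(protein) & set(mrna))
    p = [protein[g] for g in shared]
    m = [mrna[g] for g in shared]
    return pearson(average_ranks(p), average_ranks(m))


# Hypothetical abundances for one patient; GENE_E lacks a protein measurement.
protein = {"GENE_A": 1.2, "GENE_B": 2.5, "GENE_C": 3.1, "GENE_D": 4.8}
mrna = {"GENE_A": 10, "GENE_B": 20, "GENE_C": 30, "GENE_D": 40, "GENE_E": 5}
rho = global_concordance(protein, mrna)
```

Repeating this calculation for every patient yields one rho per patient, which can then be compared between groups.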

Fig. 2

The matched-subject analysis identifies global protein-mRNA concordance as an OABL-associated characteristic. A Schematic diagram of global protein-mRNA concordance calculation. B Density plot showing the global Spearman correlation for protein-mRNA pairs within OABLs (n = 3818 protein-mRNA pairs) and controls (n = 3728 pairs). C Concordance of protein-mRNA pairs is significantly higher in OABLs compared with the control or normal group and relatively higher compared with the RLH or IOI group. D Global protein-mRNA concordance is associated with prognostic risk factors. Non-EMZL subtype, high LDH, and high IPI score have an increased concordance. E Global protein-mRNA concordance is positively correlated with the MKI67 proteomic abundance in the OABLs (r = 0.495, p = 0.005) but not in the controls (r = 0.203, p = 0.527). Blue line shows linear regression. F High global protein-mRNA concordance is associated with distant recurrence in OABL. G Kaplan-Meier plot shows high global concordance in OABLs is associated with decreased distant recurrence-free survival. H Bar plot of top 20 gene sets identified by GSVA correlated with global protein-mRNA concordance

We next analyzed whether the concordance is associated with disease aggressiveness. First, we compared the concordance across subtypes, Ann Arbor stage, and prognostic risk factors (Fig. 2D). The concordance within non-EMZL subtypes was significantly higher than that of EMZL (p = 0.039), an indolent lymphoma subtype. High concordance was significantly associated with higher LDH (p = 0.014) and IPI (p = 0.046), two prognostic risk factors of NHL. High concordance showed a trend toward association with a higher Ann Arbor stage that did not reach significance (p = 0.054) (Supplementary Fig. S3A). Next, we evaluated the correlation between proliferation ability and global concordance (Fig. 2E). The result showed a positive association between Ki67 protein abundance and global concordance in OABLs (r = 0.495, p = 0.005), but this association was not present in the controls (r = 0.203, p = 0.527). Hence, higher global concordance was associated with disease aggressiveness solely in OABL.

We then tested the correlation between the protein-mRNA concordance and OABL prognosis. We divided OABL patients into two groups using the median value of concordance and compared progression-free survival (PFS), overall survival (OS), recurrence-free survival (RFS), local recurrence-free survival (LRFS), and distant recurrence-free survival (DRFS) between the two groups (Fig. 2G, Supplementary Fig. S3B). Among the 15 patients in the high rho group, 6 showed recurrence, 1 showed local recurrence, and 6 showed distant recurrence. For the 15 patients in the low rho group, 2 showed recurrence, 1 showed local recurrence, and 1 showed distant recurrence (Supplementary Table S1). Survival analysis revealed that a globally increased concordance in OABL was significantly associated with reduced DRFS (p = 0.037) and showed a non-significant trend toward reduced RFS (p = 0.083), but was not associated with PFS (p = 0.19) or OS (p = 0.26).
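The median split and survival comparison above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function names and patient labels are hypothetical, patients exactly at the median are assigned to the low group by assumption, and only the Kaplan-Meier estimate is shown (no log-rank test).

```python
from statistics import median


def split_by_median(rho_by_patient):
    """Split patients into high/low concordance groups at the median rho."""
    cutoff = median(rho_by_patient.values())
    high = [p for p, r in rho_by_patient.items() if r > cutoff]
    low = [p for p, r in rho_by_patient.items() if r <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. distant recurrence) occurred, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical example: four patients, follow-up in months.
groups = split_by_median({"p1": 0.10, "p2": 0.20, "p3": 0.30, "p4": 0.40})
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice the two curves would be estimated separately for the high and low groups and compared with a log-rank test from an established survival analysis package.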

We then compared the global concordance between patients with and without the recurrence events (Fig. 2F, Supplementary Fig. S3C). High concordance was significantly associated with distant recurrence events (p = 0.0034) and recurrence events (p = 0.0072), but not with local recurrence events (p = 0.41). We analyzed the relationship between the global concordance and recurrence in small B-cell lymphoma (SBL), EMZL, DLBCL, and other subtypes to ensure the robustness of the finding (Supplementary Fig. S3C). High concordance was significantly associated with distant recurrence events in SBL (p = 0.0059) and EMZL (p = 0.039). Despite the low incidence of recurrence and limited number of patients, the median value of patients with distant recurrence was still higher than that of patients without the events in DLBCL and other subtypes. These data demonstrated that the high global protein-mRNA concordance was a predictive factor for distant recurrence in OABL.

Next, we investigated the potential regulators and biological implications of the abnormally upregulated global protein-mRNA concordance. Because the global concordance is an intrinsic continuous variable, we examined the correlation between GSVA results and the concordance (Fig. 2H). In the top 20 positively correlated genesets, 8 gene sets were TP53-related gene sets, and the others were mostly immune-related gene sets. ECM-associated gene sets accounted for the majority of negatively correlated gene sets.

These findings indicate that increased global protein-mRNA concordance is a novel molecular characteristic of OABL that is associated with disease aggressiveness and higher risk of recurrence. This abnormally upregulated concordance in OABL is positively related to the TP53 pathway.

Trend analyses identify alternative splicing as an inflammation-independent signature of OABL

In the proteotranscriptomic data, we observed a similarity of molecular characteristics between inflammation and OABL samples through hierarchical clustering, principal component analysis, and global protein-mRNA concordance (Fig. 2C, Fig. 3A-B, Supplementary Fig. 1). As previous studies demonstrated the activation of the NFκB signaling pathway in both inflammation and NHL [25, 26], we performed hierarchical clustering in the NFκB signaling pathway across subgroups (Fig. 3C). The abundance of NFκB-related genes progressively increased from normal samples to inflammation samples and then to OABLs, which was consistent with the previous reports [25, 26]. However, three questions remained: how far the similarity extends; which pathways discriminate OABL from inflammation; and whether these pathways are driver events of OABL.

Fig. 3

Establishment of inflammation-OABL protein signature. A PCA of highly variable proteins (proteins with top 25% median absolute deviation). B Boxplot of Dim1 showing difference between normal, inflammation, and OABL groups. C Heatmap of NFκB pathway analyzed by proteomic data. D Heatmap of k-means clustering result of highly variable proteins (proteins with top 50% median absolute deviation). E Facet plot showing the trend of protein abundance of each k-means cluster between groups. Each dot represents the median protein z-score for one sample in each cluster. The blue lines indicate the segmented linear regression between two adjacent groups, and the blue numbers indicate the slope value of the regression. The red lines indicate linear regression for all groups, and the red numbers indicate the slope value of the regression. The numbers in black indicate the percentage of genes in the cluster compared to all genes analyzed. F Facet plot shows nine patterns of protein expression changes across groups identified by t-test. Black numbers indicate the percentage of genes in the cluster compared to all genes analyzed. G Alluvial plot showing the process of inflammation-OABL protein signature establishment. H Bar plot of top 20 enrichment terms of Specific up signature and Mimic up signature

To address these questions, we constructed a robust inflammation-OABL signature in proteomic data by supervised and unsupervised clustering of genes across the normal, inflammation, and OABL groups. First, we performed t-tests of protein abundance between each pair of groups and hypothesized nine patterns of gene dysregulation (Fig. 3F, Supplementary Methods). Most genes fell into the upregulated patterns (cluster3u, 24.6%; cluster2u, 21.94%; cluster4u, 11.68%). Interestingly, within these upregulated patterns, inflammations could not be discriminated from OABLs for a large share of genes (906/2144, 42.3%). We additionally performed unsupervised k-means clustering for normalized highly variable genes (HVGs; genes with top 50% MAD, k = 4) (Fig. 3D-E). Cluster numbers were determined by the elbow plot (Supplementary Fig. S4A).
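The selection of highly variable genes by median absolute deviation (MAD) mentioned above can be illustrated with a short sketch. The function names, protein labels, and abundance values are hypothetical, and the unscaled MAD is used here by assumption; the subsequent k-means step is not shown.

```python
from statistics import median


def mad(values):
    """Median absolute deviation (unscaled) of a sequence of numbers."""
    center = median(values)
    return median([abs(v - center) for v in values])


def top_mad_features(abundance, fraction=0.5):
    """Return the names of the features with the highest MAD.

    abundance -- dict mapping feature name -> abundance values across samples
    fraction  -- share of features to keep (0.5 corresponds to the top 50%)
    """
    scores = {name: mad(values) for name, values in abundance.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Hypothetical normalized abundances for four proteins across four samples.
abundance = {
    "P1": [1, 1, 1, 1],
    "P2": [0, 5, 10, 15],
    "P3": [1, 2, 3, 4],
    "P4": [2, 2, 3, 2],
}
selected = top_mad_features(abundance, fraction=0.5)
```

MAD is less sensitive to single outlying samples than the variance, which is one common reason for preferring it when ranking features before clustering.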

Combining the results of k-means clustering and t-test gene patterns, we identified five clusters of genes: inflammation mimic upregulated genes (MIMIC-UP), vaguely upregulated genes (VAGUE-UP), OABL specific upregulated genes (SPECIFIC-UP), inflammation mimic downregulated genes (MIMIC-DOWN), and OABL specific downregulated genes (SPECIFIC-DOWN) (Fig. 3G, Supplementary Table S7). Because upregulated genes constituted most of the clustered proteins, we focused on MIMIC-UP and SPECIFIC-UP, which represented extremely different patterns of dysregulation. MIMIC-UPs were mostly enriched in immune-related gene sets, while SPECIFIC-UPs were mostly enriched in gene sets related to mRNA metabolism and splicing, DNA damage and metabolism, and chromatin remodeling (Fig. 3H).

These results clearly demonstrated that the similarity between inflammation and OABL is not only in the NFκB pathway but also in a larger immune landscape. More importantly, we identified gene sets specifically dysregulated in OABL, including mRNA splicing and well-known pathways associated with malignancy development (DNA damage, chromatin remodeling).

Alternative splicing and its regulators potentially influence OABL development and progression

Alternative splicing plays an important role in OABL. Our findings revealed dysregulated alternative splicing in an inflammation-independent pattern in OABLs (Fig. 1E, Fig. 3H). The mRNA splicing geneset was upregulated in all subtypes of OABL (Supplementary Fig. S5A). We further investigated the enriched motif/domain of OABL in the proteome, and the RNA recognition motif was the most significantly enriched term (Supplementary Fig. S5B). Alternative splicing was reported to be associated with malignancy development and progression [29, 30]. Therefore, we hypothesized that alternative splicing may play an oncogenic role in OABL.

We therefore constructed a workflow to 1) evaluate raw RNA-sequencing data and identify AASEs of OABLs compared with controls; 2) integrate clinical data to identify prognostic-related AASEs; and 3) combine proteomic data to investigate potential splicing regulators and biological function of AASEs (Fig. 4A). We analyzed five types of ASEs: alternative 3′ splice sites (A3), alternative 5′ splice sites (A5), mutually exclusive exons (MX), retained introns (RI), and skipping exons (SE). A total of 1806 AASEs were identified (Supplementary Table S8), and most were SE. These AASEs affected 916 genes in total, and most of them were affected by SE (Fig. 4C). Among the 916 AASE-related genes, 651 genes were modulated by only one type of ASE, while the rest were affected by several types of ASE (Fig. 4B). Interestingly, 60 of 134 MX-related genes were also affected by SE. Using univariate Cox regression analysis, we identified 91 progression-related AASEs (64 affected genes), including 59 SE, 15 MX, 8 RI, 7 A5, and 2 A3 (Fig. 4G, Supplementary Table S10).
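The first step of this workflow, filtering events and summarizing them by type and gene, might look like the sketch below. It applies the criterion stated in the methods (a ΔInclevel difference of more than 5% and an adjusted p-value below 0.01). The record layout, key names, and gene labels are hypothetical and do not reproduce the rMATS output format.

```python
from collections import Counter


def is_aase(delta_inclevel, adj_p, min_delta=0.05, max_adj_p=0.01):
    """True if an event passes the AASE thresholds on effect size and adj. p."""
    return abs(delta_inclevel) > min_delta and adj_p < max_adj_p


def summarize_aases(events):
    """Filter events and summarize them.

    Returns (count per event type, number of affected genes,
             number of genes affected by exactly one event type).
    """
    kept = [e for e in events if is_aase(e["delta_inclevel"], e["adj_p"])]
    by_type = Counter(e["type"] for e in kept)
    types_per_gene = {}
    for e in kept:
        types_per_gene.setdefault(e["gene"], set()).add(e["type"])
    single_type = sum(1 for types in types_per_gene.values() if len(types) == 1)
    return by_type, len(types_per_gene), single_type


# Hypothetical events; G2 fails the effect-size cut, G3 fails the p-value cut.
events = [
    {"gene": "G1", "type": "SE", "delta_inclevel": 0.10, "adj_p": 0.001},
    {"gene": "G1", "type": "MX", "delta_inclevel": -0.20, "adj_p": 0.005},
    {"gene": "G2", "type": "SE", "delta_inclevel": 0.02, "adj_p": 0.001},
    {"gene": "G3", "type": "RI", "delta_inclevel": 0.30, "adj_p": 0.2},
    {"gene": "G4", "type": "SE", "delta_inclevel": -0.07, "adj_p": 0.009},
]
by_type, n_genes, n_single = summarize_aases(events)
```

The same per-gene sets of event types could also be used to count overlaps such as genes affected by both MX and SE.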

Fig. 4

Integrated aberrant alternative splicing event (AASE) landscape of OABL. A Workflow of AASE analysis. B UpSet plot showing intersections among the five types of AASEs in each patient. C Bar plot showing the number of the five types of AASEs and related genes. D Bar plot of top 20 enrichment terms identified in AASE-related genes. E Heatmap of top 20 enrichment terms correlated with Inclevel of AASEs identified by GSVA. Enrichment terms are ranked by the correlation score and shown in the orange bar. Count of highly correlated AASEs with |r| > 0.6 is shown in the blue bar. F Heatmap of top 20 splicing factors and regulators correlated with Inclevel of AASEs. G Forest plot of top 15 progression-related AASEs ranked by p-value. Progression-related AASEs are calculated by Cox regression. H Heatmap of top 20 splicing factors and regulators correlated with Inclevel of progression-related AASEs

Next, we investigated the biological implication of the AASEs. The top listed enrichment terms of the AASE-related genes were associated with gene sets related to the cytoskeleton, Rho GTPase, and organelles (Fig. 4D). Notably, the Rho GTPase and cytoskeleton pathways were also identified in the enrichment analysis of CO-UP and CO-DOWN DEGs (Fig. 1C). We then analyzed the potentially implicated biological functions by correlating AASE Inclevel with GSVA results from proteomic data (Fig. 4E, Supplementary Methods). Among the top 20 listed gene sets, 8 were related to DNA damage and cell cycle. The others mostly included steroid hormone-related and organic acid-related gene sets.

We then investigated the potential regulators of AASEs. We correlated the protein abundance of annotated SFs with Inclevel of AASEs (Fig. 4F, Supplementary Methods) and progression-related AASEs (Fig. 4H). Among the top 20 listed SFs, 13 were recurrently identified in both correlations. We searched these 13 genes in 4 DLBCL cohorts in cBioPortal (http://cbioportal.org) [31]. Four genes exhibited no genomic events, while a total of 10% (30/300) of patients had genomic events in the other 9 genes (Fig. S5C).

Together, these data indicate that AASEs are widely present in OABLs and associated with prognosis. Genes that exhibited alternative splicing were associated with dysregulated gene sets in OABLs. AASEs were associated with key biological functions, such as DNA damage and cell cycle, which might imply that they function as post-transcriptional regulators of these processes. Some DLBCL patients had genomic events in splicing factors and regulators that were highly correlated with AASEs, which also suggested that alternative splicing may be a driver event of OABL.

ADAR is a core regulator of alternative splicing in OABL and influences key biological functions

ADAR, a member of the adenosine deaminases acting on the RNA family of enzymes, catalyzes the editing of adenosine to inosine in double-stranded RNA. ADAR was recently reported to regulate alternative splicing independent of editing ability [32]. In correlation analyses, we found that ADAR was a top-listed SF recurrently associated with AASEs and progression-related AASEs (Fig. 4F, H). Among the SFs recurrently associated with AASEs, ADAR showed the highest incidence of genomic events (3%, 9 of 300 patients).

We next investigated the potential biological functions associated with ADAR and found that the protein abundance of ADAR was strongly correlated with 280 AASEs (|r| > 0.6), which affected 145 genes. These genes were mostly enriched in Rho GTPase-related gene sets (Fig. 5A). Rho GTPases are critical transducers of intracellular signaling in tumor initiation and progression [33, 34].
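Counting the events whose inclusion level tracks the abundance of a splicing factor, using the |r| > 0.6 cutoff quoted above, can be sketched as follows. The use of the Pearson coefficient is an assumption made for this illustration, and the function names, event identifiers, and values are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def strongly_correlated_events(sf_abundance, inclevels, threshold=0.6):
    """Return {event_id: r} for events with |r| above the threshold.

    sf_abundance -- splicing factor protein abundance, one value per patient
    inclevels    -- dict mapping event id -> Inclevel per patient (same order)
    """
    hits = {}
    for event_id, levels in inclevels.items():
        r = pearson_r(sf_abundance, levels)
        if abs(r) > threshold:
            hits[event_id] = r
    return hits


# Hypothetical data for four patients and three splicing events.
sf_abundance = [1.0, 2.0, 3.0, 4.0]
inclevels = {
    "event_1": [0.1, 0.2, 0.3, 0.4],  # rises with the splicing factor
    "event_2": [0.4, 0.3, 0.2, 0.1],  # falls with the splicing factor
    "event_3": [0.2, 0.1, 0.2, 0.1],  # weak relationship
}
hits = strongly_correlated_events(sf_abundance, inclevels)
```

The genes hosting the retained events would then be passed to an enrichment analysis, which is outside the scope of this sketch.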

Fig. 5

ADAR regulates key biological functions in OABL. A Bar plot of top 20 enrichment terms identified in ADAR highly correlated AASE genes (|r| > 0.6, p < 0.05). B Bar plot of top 20 enrichment terms identified in ADAR regulated AASE affected genes (frequency ≥ 2). C ADAR protein abundance is significantly positively correlated with MKI67 abundance in OABL. Blue line shows linear regression. D ADAR RNA expression level is significantly positively correlated with MKI67 level in most TCGA databases. Names of databases are labeled with red/blue color if p < 0.05. Names of databases are bold if |r| > 0.4. E ADAR KD SU-DHL-4 and Raji cells exhibit decreased cell proliferation ability. F SU-DHL-4 and Raji cells with control or ADAR knockdown are treated with the Rho GTPase inhibitor MLS000532223 for 72 hours. ADAR knockdown cell lines exhibit lower IC50

To examine the role of ADAR, we identified AASEs in 11 groups of ADAR knockdown (KD), knockout, or overexpression cancer cell lines (Supplementary Methods). A total of 1472 genes were recurrently affected by ADAR-regulated AASEs (Supplementary Table S11). These genes were also enriched in Rho GTPase-related gene sets (Fig. 5B). Because the affected genes were enriched in apoptotic and proliferation-related gene sets (Fig. 5A-B), we investigated the relationship between ADAR and MKI67. In our data, ADAR protein abundance was positively correlated with MKI67 (r = 0.477, p = 0.002, Fig. 5C). Among 33 tumor types in TCGA databases, 29 exhibited a significant positive correlation between MKI67 and ADAR expression (p < 0.05, r > 0) and 14 exhibited a strong correlation (p < 0.05, r > 0.4) (Fig. 5D).

To verify the hypothesized regulatory role, we established ADAR KD NHL cell lines (Fig. S6A). ADAR KD cell lines exhibited significantly decreased cell proliferation compared with the control cells (Fig. 5E). We further found that ADAR KD sensitized NHL cell lines to Rho GTPase inhibitors. The IC50 of the Rho GTPase inhibitor MLS000532223 after 72 h of treatment was lower in ADAR KD cell lines compared with controls (Fig. 5F). ADAR KD cell lines also exhibited increased sensitivity to a Rho-kinase inhibitor (HA110 HCL). These results demonstrated that ADAR, a core regulator of alternative splicing in OABL, regulated cell proliferation and sensitivity to Rho GTPase inhibitors.

Proteomic analysis identifies DNAJC9 as a diagnostic marker of EMZL

Pathological diagnosis of OABL remains difficult. Therefore, we examined our proteomic data to investigate a potential diagnostic marker for OABL. A workflow of biomarker detection is shown in Fig. 6A. To screen biomarkers, we first identified 98 differentially expressed proteins and 98 significant proteins through univariate logistic regression. Next, 85 overlapped proteins were included into the lasso penalty regression model, and the analysis yielded four proteins (DNAJC9, TFEB, SUMO3 and MBD1) through 200 iterations of cross-validation. We then performed stepwise logistic regression for these proteins, and DNAJC9 was the only protein identified.

Fig. 6

Proteomic analysis identifies DNAJC9 as a diagnostic marker of OABL. A Workflow of identifying a diagnostic marker from proteomic data. B DNAJC9 protein abundance is significantly higher in OABLs and any subtype of OABL. C ROC plot of DNAJC9, CD20, and PAX5. DNAJC9 protein abundance exhibits higher AUC than CD20 and PAX5. D Representative IF images of DNAJC9 in control and OABL samples. E MFI of DNAJC9, and overlap coefficient of DNAJC9 and DAPI are significantly higher in EMZL and DLBCL samples compared with inflammations and paracancer sites. MFI of CD20 are significantly higher in EMZL and DLBCL samples compared with paracancer sites. Each dot represents a sample. F Representative IHC images of DNAJC9 in control and OABL samples. G Staining score of bulk cells and nuclei of DNAJC9 in Inflammation, EMZL, and DLBCL samples. Each dot represents a sample

To verify the result, we analyzed the protein abundance of DNAJC9 across groups (Fig. 6B). DNAJC9 abundance was significantly higher in OABLs compared with control, IOI, and RLH groups. DNAJC9 abundance was also significantly higher in all subtypes of OABL compared with controls. Compared with the traditional diagnostic markers CD20 and PAX5, DNAJC9 exhibited a relatively higher AUC value for OABLs (Fig. 6C).
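For readers less familiar with the AUC comparison above, the sketch below computes the area under the ROC curve directly from its rank interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name and the abundance values are hypothetical and are not data from this study.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly.

    A pair counts 1 if the case scores higher than the control,
    0.5 if they tie, and 0 otherwise.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical marker abundances in lymphoma (cases) and control samples.
cases = [3.0, 4.0, 5.0]
controls = [1.0, 2.0, 3.0]
auc = roc_auc(cases, controls)
```

An AUC of 1.0 means the marker separates the two groups perfectly, while 0.5 corresponds to no discrimination.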

Next, we investigated the diagnostic performance of DNAJC9 in patient FFPE samples. EMZL and DLBCL (the two main subtypes of OABL) were chosen as the experimental group, and inflammation samples were used as the control; paracancer sites of OABLs were additionally included as controls in the IF analysis. Analysis of subcellular localization revealed that DNAJC9 was localized in the nucleus in OABLs and lymphoid regions of inflammation tissues. In paracancer sites of OABL and gland regions of inflammation, DNAJC9 was localized in the cytoplasm. Most DNAJC9 was co-expressed with CD20 in the same cell in lymphoid regions and OABLs (Fig. 6D-E). The MFI of DNAJC9 was significantly higher in EMZLs and DLBCLs compared with controls. The MFI of CD20 was not higher in OABLs compared with inflammations (Fig. 6E). IHC analyses showed that the DNAJC9 staining scores of bulk cells and nuclei were both significantly higher in EMZLs compared with inflammations (Fig. 6F-G). For DLBCLs, the staining score of bulk cells was not significantly higher compared with inflammations, and the staining score of nuclei was significantly lower compared with EMZLs. These results demonstrated that DNAJC9, especially strong nuclear staining of DNAJC9, is a promising pathological diagnostic marker of EMZL that can differentiate EMZL from inflammation.

Discussion

OABL is a rare subtype of NHL and a common type of malignancy in the orbital region [1, 2, 35]. To date, only a few studies have examined the gene expression profile of OABL [7]. Although there are several transcriptomic studies of NHLs, their proteomic and integrated molecular characteristics remain poorly understood [7,8,9]. Our study reports proteotranscriptomic data of OABL for the first time. We performed integrated quantitative proteome and transcriptome analyses that allowed us to: 1) identify robust dysregulated genes and pathways; 2) gain insights into post-transcriptional expression regulation; and 3) investigate novel disease characteristics of OABL. Together, our findings provide novel insights into the molecular landscape of OABL and identify a promising diagnostic biomarker.

In our data, proteome described disease pathways and DEGs are partially captured by the transcriptome. Different distribution between transcriptomic and proteomic data was previously observed [10]. Thus, we additionally performed GSEA, a rank-based algorithm, to investigate the dysregulated pathways in OABL. The robustly concordant dysregulations are mostly consistent with previous observations in mature B-cell lymphomas [9]. Post-transcriptional regulation mechanisms of gene expression, including protein sumoylation, RNA m6A modification, and alternative polyadenylation, are current focuses in cancer research [36, 37]. These processes can result in different expression patterns between protein and mRNA. We assumed that by computing the correlation between transcript and protein abundance, the global protein-mRNA concordance can imply the level of impact of post-transcriptional regulation mechanisms for each patient. Consistent with previous observations in breast cancer, our findings showed that high concordance is a disease-specific characteristic in OABL and associated with poor prognosis [11]. These results suggest that traditional translational regulation of gene expression and expression-independent post-transcriptional modification play a major role in malignancy development.

The similarity and association between inflammation and NHLs are frequently reported, but the extent and underlying mechanisms of this relationship have not been thoroughly studied [25,26,27,28]. Herein, we observed a similar situation [25, 26]. We therefore constructed a robust inflammation-OABL signature. The NFκB pathway has previously been reported to be upregulated in both inflammation and NHL. In our data, the gene expression profiles of inflammation and OABL are similar, and this similarity broadly affects immune-related genes, which confirms and extends its scope. Echoing our previous assumption, alternative splicing as an expression-independent post-transcriptional regulation mechanism is specifically dysregulated in OABL in an inflammation-independent manner.

Alternative splicing is a post-transcriptional regulatory process acting on pre-mRNA that allows the generation of multiple splice isoforms from genes that can exhibit distinct functions [12]. Numerous studies have demonstrated the oncogenic role of alternative splicing in cancers [29, 30]. Though components of the spliceosome are recurrently mutated in hematologic malignancies [14], including the SF3B1 mutation present in approximately 10% of chronic lymphocytic leukemia and DDX41 mutation in follicular lymphoma and Hodgkin lymphoma, the biological function and oncogenic potential of alternative splicing have not been well studied [38]. From our results on specific dysregulated gene expression and the enriched RNA binding motif, we speculate that alternative splicing is a potential oncogenic event in OABL. By constructing an AASE landscape, we found that AASEs are highly correlated with important biological functions, and some AASEs predict the progression of OABL. Our analysis further identified ADAR as a core SF of AASEs. ADAR is recurrently and highly correlated with all AASEs and prognostic-related AASEs and is mutated in DLBCL patients. ADAR directly edits and splices RNA and promotes malignancy development and progression [32, 39,40,41,42], but its oncogenic role in NHL has not been demonstrated. The genes affected by highly correlated AASEs in OABL and the genes affected by ADAR-regulated AASEs identified using publicly available datasets were both enriched in Rho GTPase and cell proliferation pathways. We further showed that ADAR regulates cell proliferation and sensitivity to Rho GTPase inhibitors in NHL cell lines. Together, our findings indicate that alternative splicing is an inflammation-independent oncogenic event of OABL, and dysregulation of the splicing regulator ADAR may result in malignancy development and progression.

A major issue in clinical practice is the efficient clinical or pathological differential diagnosis between OABL (especially EMZL) and inflammation [43,44,45,46], which is critical because of the different therapeutic approaches for these diseases. We constructed a proteome-based workflow and identified DNAJC9 as a potential diagnostic marker of OABL. DNAJC9 is a heat shock protein family member that is a histone co-chaperone and a p53-target gene [47, 48]. In inflammation and OABL tissues, DNAJC9 was co-expressed with CD20 and predominantly localized in the nucleus. Our study demonstrates that nuclear staining of DNAJC9 is a promising pathology diagnostic biomarker of EMZL, which may provide important benefits in clinical practice.

Conclusions

OABL is a rare subtype of non-Hodgkin lymphoma, and its molecular characteristics are poorly understood. We performed an integrated study to investigate the proteotranscriptome landscape of OABL. We found that alternative splicing may be the biological foundation for malignancy development. Furthermore, ADAR, a core SF, regulates the proliferation and Rho GTPase inhibitor sensitivity of NHL cell lines. OABL is characterized by high global protein-mRNA concordance, which is a novel recurrence-related characteristic. This study also identified the strong nuclear staining of DNAJC9 as a promising pathology diagnostic biomarker of EMZL. Our results provide insights into the biology of OABL and pave the way for clinical practice and further study of OABL.