Introduction

Chronic lymphocytic leukaemia (CLL) is a B-cell neoplasm and the commonest adult leukaemia in Western countries, with reported incidence rates in the USA and Europe of 4–6 per 100,000 annually1,2. The clinical outcomes of CLL are extremely heterogeneous: overall survival (OS) and time-to-first treatment (TTFT) vary drastically across CLL patients3. Early need for therapy (short TTFT) and short OS are characteristic of the progressive and aggressive form of CLL, whereas late or no need for therapy and long OS are features of the indolent form of the disease. Several prognostic markers have been reported to predict the clinical course of the disease. One is the mutational status of the immunoglobulin heavy-chain variable region genes (IGHV): mutated IGHV is associated with good prognosis, whereas unmutated IGHV predicts poor prognosis. In addition, chromosomal aberrations such as deletions in 17p and 11q are characteristic of high-risk CLL4. Furthermore, increased protein expression of CD38, CD49d, ZAP-70 and CXCR4 is associated with an aggressive form of CLL3,5,6,7.

Despite the considerable research effort devoted to cancer therapy8,9,10,11,12,13,14,15, CLL remains incurable and life threatening, especially for patients with poor prognosis16,17. Casein kinase II subunit alpha (CK2α; a protein encoded by CSNK2A1) is a catalytic subunit of a constitutively active serine/threonine-protein kinase complex that phosphorylates a wide range of substrates and regulates a diverse range of cellular processes, such as cell proliferation, apoptosis, haematopoiesis, resistance to cytotoxic agents, protein stability and chaperone activities18. CK2α was reported to be over-expressed in malignancies, including CLL19,20. CK2α enhances cellular viability and proliferation through PI3K/AKT, Wnt/β-catenin and JAK/STAT dependent signaling mechanisms21,22. Accordingly, targeting CK2α was shown to induce apoptosis in CLL cells23,24,25. Similar findings were also reported in CLL cells xenografted in mice26. Interestingly, targeting the expression of CK2α in CLL cells isolated from patients with poor prognosis and chemotherapy resistance due to chromosomal alterations (11q and 17p deletions) induced apoptosis27. Collectively, these findings argue that CLL patients may benefit from therapeutic strategies that target CK2α.

Several inhibitors of CK2α have been reported28, one of which is CX-4945 (Silmitasertib), which has been designated as an orphan drug by the FDA for cancer treatment29,30. Nevertheless, CX-4945 has some limitations, such as restricted selectivity: it exerts an inhibitory effect on twelve other kinases and shows a stronger binding affinity for Clk2 (one of the twelve kinases) than for CK2α31. Moreover, therapeutic resistance to CX-4945 due to various factors such as drug uptake, drug efflux, gene mutations, pathway alteration and target inactivation is of great concern32,33.

Natural products and their derived compounds are believed to be a rich source of therapeutics that could be employed in the treatment and management of several communicable and non-communicable diseases34,35,36,37,38,39,40,41,42,43,44,45,46,47,48. Natural products such as metabolites from fungi and plants have been considered safe and economical, with great bioactive potential against multidrug-resistant cancers49,50. Fungi were reported to produce a variety of metabolites that have proved to be blockbuster drugs, such as Caspofungin, Cyclosporine, Fingolimod, Lovastatin and many more. Nearly 40% of new chemical entities approved by the United States Food and Drug Administration (US FDA) are of natural origin and most of them are fungal metabolites. Hence, metabolites originating from fungi have great potential and a prominent role in therapeutic drug discovery51,52,53.

Although several pieces of evidence have shown the value of targeting CK2α for CLL therapy, information about how CK2α contributes to disease progression and worse clinical outcomes of CLL remains scarce in the literature. Therefore, in the current work, we first studied the impact of CSNK2A1 expression on OS and TTFT of CLL patients and conducted bioinformatic investigations54,55 to identify possible roles of CSNK2A1 in CLL progression. This provided a rationale for targeting CK2α in CLL. In silico searches for competent kinase inhibitors have been reported to be effective56,57,58,59,60,61. Therefore, we used various computational tools to search for a competent inhibitor of CK2α among fungal metabolites that could be proposed for CLL therapy.

Methodology

Transcriptomics data sets

Transcriptomics data sets of CLL available in GEO (accession numbers: GSE2276262 and GSE3967163) were used to study the impact of CK2α transcript expression on the prognosis and progression of CLL. These two data sets were selected for four reasons. First, the transcriptomics analysis was conducted on CLL cells isolated from the peripheral blood of CLL patients. Second, the two data sets included prognostic information for the CLL patients whose samples were analysed. The data set GSE22762 included OS data and the data set GSE39671 contained TTFT data. To the best of our knowledge, these two data sets are the only CLL transcriptomics data with OS and TTFT information available in GEO. Third, the two data sets were generated from two separate CLL cohorts with > 100 patients each (GSE22762 = 107 patients; GSE39671 = 130 patients). Fourth, the same oligonucleotide microarray platform (Affymetrix Human Genome U133 Plus 2.0 Array) was used to produce the two transcriptomics data sets. This was an important inclusion criterion because it reduces the variation that could arise if the two data sets had been produced using different oligonucleotide microarray platforms. The files (type: DataSet SOFT) of the transcriptomics data sets were downloaded from GEO and used.

Functional profiling

Functional profiling of the genes that correlated with CSNK2A1 (Pearson score, PS ≥ 0.60) was performed using gProfiler (https://biit.cs.ut.ee/gprofiler/gost)64. The analysis was conducted against four known databases: the Gene Ontology (GO) database (http://geneontology.org/)65,66, the KEGG pathway database (https://www.genome.jp/kegg/)67, the Reactome pathways database (https://reactome.org/)68 and the WikiPathways database (https://www.wikipathways.org/index.php/WikiPathways)69. The option “only annotated genes” was selected for the statistical domain scope and the corrected p value cut-off was set at ≤ 0.05. Corrected p values were calculated using the Benjamini–Hochberg method.
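The Benjamini–Hochberg correction mentioned above can be illustrated with a minimal Python sketch. This is not the gProfiler implementation; the function name `benjamini_hochberg` and the raw p values are hypothetical and serve only to show how adjusted p values are obtained and compared with the ≤ 0.05 cut-off.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the rank decreases (monotonicity step).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for five enrichment terms (illustrative only).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
# Terms passing the corrected cut-off of <= 0.05 used in this section.
significant = [p <= 0.05 for p in adj]
```

In this toy input the third and fourth terms have raw p values below 0.05 but fail the cut-off after correction, which is the situation the correction is designed to catch.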

Protein–protein interaction network analysis

Protein–protein interaction (PPI) analysis and network construction were conducted using the “Search Tool for the Retrieval of Interacting Genes” (STRING; https://string-db.org/)70. The following criteria were applied: Homo sapiens was chosen as the organism; the full STRING network was selected as the network type; confidence was chosen as the meaning of network edges; and all active interaction sources were selected. Only PPIs with an enrichment p value < 0.001 were reported. Next, the file generated from STRING was loaded into Cytoscape (version 3.4.0; https://cytoscape.org/)71 for network visualization.

Prediction of physicochemical, medicinal chemistry, and ADME-T properties

A total of 19,967 compounds from the fungus database of PubChem (accessed on: 04/12/2021) were filtered to obtain drug-like metabolites on the basis of their physicochemical properties, medicinal chemistry parameters and blood–brain barrier permeability using the SwissADME (http://www.swissadme.ch) web-based tool72. Furthermore, ADMET analysis was performed for the 10 best hits among the fungal metabolites (best docking score in comparison with a standard reference): absorption and metabolism were predicted through SwissADME, whereas distribution and excretion were predicted through Admetlab2.073 web-based tool87.

Using a hybrid of the steepest descent and the limited-memory Broyden–Fletcher–Goldfarb–Shanno (LBFGS) algorithms in the OPLS3e force field, the solvated system was subjected to energy minimization to eliminate steric clashes between the protein and the solvating water molecules88,89. The simulation was run for 100 ns in the NPT (isothermal-isobaric) ensemble, with 100 ps intervals between trajectory snapshots. A temperature of 300 K and a pressure of 1 bar were maintained during the simulation by the Nosé–Hoover chain thermostat and the Martyna–Tobias–Klein barostat, respectively90,91.

Post simulation binding free energy analysis

The molecular mechanics combined with generalized Born surface area (MM/GBSA) approach was used to calculate the post-simulation binding free energies (ΔGBind) of the ligand–protein complexes. ΔGBind was calculated using the thermal_mmgbsa.py script, taking frames 0–1000 of the 100 ns MD simulation trajectory as input, with the VSGB solvation model associated with the OPLS3e force field and a 10-step sampling size (one frame every ns). The Prime MM/GBSA binding free energy (kcal/mol) is evaluated additively, by summing different energy terms such as hydrogen bonding, van der Waals, Coulombic, lipophilic, covalent, solvation, π–π stacking, and self-contact of ligand and protein92.
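The sampling arithmetic and the additive form of ΔGBind described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the thermal_mmgbsa.py script: the function names and the component keys are hypothetical, and the 100 ns length and 100 ps snapshot interval are taken from the simulation settings stated earlier in the Methodology.

```python
from statistics import mean, stdev

SIM_LENGTH_NS = 100         # simulation length stated in the text
SNAPSHOT_INTERVAL_PS = 100  # interval between saved trajectory snapshots
STEP = 10                   # MM/GBSA sampling step (every 10th frame)

# 100 ns saved every 100 ps gives 1000 frames.
n_frames = SIM_LENGTH_NS * 1000 // SNAPSHOT_INTERVAL_PS
# Every 10th frame: 0, 10, ..., 990, i.e. 100 conformations.
sampled_frames = list(range(0, n_frames, STEP))
# Time between two sampled frames, in ns.
ns_between_samples = STEP * SNAPSHOT_INTERVAL_PS / 1000


def total_binding_energy(components):
    """Sum the energy terms of one frame (additivity assumption)."""
    return sum(components.values())


def summarize(per_frame_components):
    """Mean and sample standard deviation of the per-frame totals."""
    totals = [total_binding_energy(c) for c in per_frame_components]
    return mean(totals), stdev(totals)


# Placeholder per-frame terms in kcal/mol; hypothetical keys and values,
# NOT results from this study.
frames = [
    {"coulomb": -10.0, "vdw": -5.0},
    {"coulomb": -12.0, "vdw": -5.0},
    {"coulomb": -14.0, "vdw": -5.0},
]
avg, sd = summarize(frames)
```

Reporting the mean together with the standard deviation over the sampled frames gives a sense of how much the estimate fluctuates along the trajectory.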

Statistical analyses

Kaplan–Meier curves were constructed using GraphPad Prism software (version 7; https://www.graphpad.com/guides/prism/7/user-guide/index.htm), and p values with hazard ratios (HRs) were calculated using the log-rank test. Correlation analysis and Pearson score calculations were performed using Excel software (version 14.4.0). The p values and the FDRs of the functional profiling analysis were calculated using gProfiler64. The p value of the PPI enrichment analysis was calculated using STRING (https://string-db.org/)70. Cluster analysis, with the average linkage method for clustering and the Manhattan method for distance measurement, was conducted using the Heatmapper web-based tool (http://www.heatmapper.ca/)93. Heatmapper was also employed to construct heatmaps.
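For readers unfamiliar with the Kaplan–Meier estimator, the following minimal Python sketch shows how a survival curve and a median survival time can be derived from follow-up times and event indicators. It is not the GraphPad Prism procedure used in this study; the function names and the toy data are hypothetical, and the median is taken here as the first time at which the estimated survival falls to 0.5 or below.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time of each patient
    events : 1 if the event (death or first treatment) was observed,
             0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= n_removed
    return curve


def median_survival(curve):
    """First time with survival <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy cohort (hypothetical, in years): events at 1, 2 and 4; censoring at 3 and 5.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 1, 0, 1, 0])
median = median_survival(curve)
```

When the curve never drops to 0.5, `median_survival` returns `None`, which corresponds to a median that is reported as undefined.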

Results and discussion

Implication of CSNK2A1 in the progression and prognosis of CLL

OS and TTFT are very important clinical measures of CLL prognosis94. In contrast to the indolent form (good prognosis) of CLL, the progressive and aggressive form of the disease (poor prognosis) is characterized by short TTFT and short OS3. We investigated whether the expression of CSNK2A1 in CLL cells is associated with short TTFT and short OS of CLL patients. The analysis was performed on two CLL transcriptomics data sets from GEO (accession numbers: GSE2276262 and GSE3967163). As shown by the Kaplan–Meier curve (Fig. 1A), increased expression of CSNK2A1 is associated with short TTFT in CLL patients; the median TTFT in the high-expression group was 1.3 years compared with 6.5 years in the low-expression group (n = 130, p < 0.0001, HR of high-expression versus low-expression = 3.70). Likewise, the Kaplan–Meier curve showed that high expression of CSNK2A1 was associated with short OS; the median OS was 4.5 years for the high-expression group and was undefined for the low-expression group (Fig. 1B, n = 107, p = 0.005, HR of high-expression versus low-expression = 3.30). These findings provide evidence for the implication of CSNK2A1 in CLL progression and poor clinical outcomes, supporting previous studies that implicated CSNK2A1 in the survival and proliferation of CLL cells21,22,23,27,95,96. In line with our findings, earlier studies also showed an association between increased expression of CSNK2A1 and poor prognosis of other malignancies, such as acute myeloid leukaemia97, hepatocellular carcinoma98, ovarian cancer

Figure 10 Histogram showing the protein residues that interact with the ligand over the course of the trajectory.

Post simulation binding free energy

MM/GBSA is a common and rigorous approach for post-simulation binding free energy prediction because it considers protein flexibility, entropy, and polarizability, which are often overlooked in docking protocols. The binding energy estimation approach based on molecular mechanics-generalized Born surface area (MM/GBSA) allows for the identification of ligands that bind efficiently to receptors. One of the most significant goals in biomolecular investigations is to calculate the free energy of binding precisely, because it drives all molecular activities such as chemical reaction, molecular recognition, association, and protein folding133. Hence, the validity of the compounds identified by docking and MD simulations was investigated further using MM/GBSA binding free energy calculations. The post-simulation MM/GBSA was estimated at every 10th frame from frame 0–1001, totaling 100 conformations (one every ns) of each simulated complex, and the average binding energies with standard deviations are given in Table 5. The MM/GBSA binding energy statistics show that the cumulative contributions of Coulombic, H-bond, Lipo, and vdW interactions have a significant impact on ΔGBind.

Table 5 Post-simulation components of binding free energy for protein–ligand complexes estimated using MM/GBSA analysis.

The calculated average ΔGBind values of Butyl Xanalterate, Fumiquinazoline Q and Silmitasertib in complex with the protein kinase CK2α subunit were −74.41 kcal/mol, −34.6 kcal/mol and −40.86 kcal/mol, respectively. A more negative value indicates stronger binding. The most negative ΔGBind was seen for the Butyl Xanalterate-3PE1 complex, and this value is considerably more negative than that observed for the standard Silmitasertib-3PE1 complex. Furthermore, vdW and H-bond interactions are important contributors to ligand binding in all cases; however, ΔGLipo may also substantially affect the binding free energy of the Butyl Xanalterate-3PE1 complex. Thus, based on binding free energy values, the order of best compounds is Butyl Xanalterate > Silmitasertib > Fumiquinazoline Q.

The present study should be viewed with some considerations. First, the investigations that were conducted to show the association of CSNK2A1 with i) short TTFT, ii) short OS, and iii) survival and proliferation-dependent pathways in CLL cells were entirely based on gene expression rather than protein expression (protein being the main functional molecule in a cell). Although transcript levels are generally expected to correlate with protein levels, some genes do not follow this pattern134. In this work, we carefully searched public repositories, such as the National Center for Biotechnology Information (NCBI), Protein Atlas and The Cancer Genome Atlas (TCGA), for CLL proteomics data sets with available clinical information about OS and TTFT, but we could not find any. The only available data sets that fit the goal of our study were the two CLL transcriptomics data sets used here. Therefore, in our future work, we will evaluate the protein expression and the kinase activity of CK2α in CLL cells isolated from CLL patients to confirm its prognostic value in the disease. Second, although the identification of Butyl Xanalterate as a competent inhibitor of CK2α was based on a rigorous in silico approach, this finding needs to be confirmed by wet lab experiments. Consequently, in our future work, we will compare the therapeutic potential of Butyl Xanalterate with that of CX-4945 in wet lab experimental settings to determine whether Butyl Xanalterate is a stronger inhibitor of CK2α and more competent than CX-4945 in inducing apoptosis in CLL cells.