Background

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes the well-known Coronavirus disease 2019 (COVID-19), which has become a major global health burden. SARS-CoV-2 infection occurs through the nasopharyngeal mucosa [1]. Subsequent immune responses occur at the local mucosa and at a systemic level. An effective response to SARS-CoV-2 infection requires coordination between the innate and adaptive immune systems, including granulocytes, monocytes, macrophages, and T and B cells [2, 3]. The range of immune responses to SARS-CoV-2 infection is diverse, from asymptomatic or mild upper-respiratory illness to severe viral pneumonia, acute respiratory distress syndrome, and death [4]. The most severe forms of COVID-19 are caused by dysregulation of immune homeostasis, which leads to hyperinflammation in the lungs [5]. This has been shown to be more pronounced in the elderly and in individuals with pre-existing comorbidities [10,11]. IFN is essential for inducing the innate immune response during viral infection through different interferon regulatory factors (IRFs) [12]. Further, in COVID-19 patients, type I IFN deficiency appears to be a hallmark of severe cases [13,14,15,16,17,18,19] in association with persistent blood viral load and an exacerbated inflammatory response [14].

Single-cell omics studies have identified specific transcriptional features in monocytes, natural killer (NK) cells, dendritic cells (DCs), and T cells associated with the severity of COVID-19 [13, 20,21,22]. These studies have revealed that severe COVID-19 is marked by a dysregulated myeloid cell compartment [13]. It has also been shown that monocytes from severe COVID-19 patients are characterized by a tolerogenic phenotype with reduced expression of class II major histocompatibility complex (MHC-II) antigens [23] and increased activation of apoptotic pathways [24].

Differentiation and activation of monocytes and other myeloid cells are directly associated with epigenetic mechanisms [25]. The functional plasticity of these cells is also reflected at the epigenetic level, and several studies have shown that DNA methylation profiles, among other epigenetic marks, vary in response to inflammatory cytokines, hormones, and other factors [26, 27], depending on their functionality. Cytosine methylation (5mC) occurs at CpG dinucleotides and is generally associated with transcriptional repression [28], although its relationship with transcription depends on the genomic location of the affected CpG sites [29]. In some cases, DNA methylation changes occur as a result of upstream environmental effects that link cell membrane receptors, signaling pathways, and transcription factors (TFs) that can either directly recruit DNA methyltransferases (DNMT) and ten–eleven translocation (TET) enzymes, or indirectly influence their binding to specific genomic sites.

The characterization of the epigenetic and transcriptomic reprogramming in monocytes, given their central role in inflammatory responses, is essential if we are to understand the specific dysregulated pathways involved in severe forms of COVID-19. In this study, we obtained the DNA methylation profiles of peripheral blood monocytes of severe COVID-19 patients and studied their relationship with transcriptomic changes, obtained by generating droplet-based single-cell RNA sequencing (scRNA-seq) data from peripheral blood.

Methods

Human samples

Our study included a selection of 58 severe COVID-19 patients from the intensive care unit (ICU) of Vall d’Hebron University Hospital (Barcelona) recruited during the second wave of infection in Spain (October to November 2020). Peripheral blood samples were taken at different times following admission of the patient to the ICU, as specified in Additional file 1. Table S1 (Days in ICU). Ninety-four percent of the patients required intubation and all enrolled cases were confirmed to be infected with SARS-CoV-2 using real-time RT-PCR at the time of collection. For all enrolled patients, the date of enrollment, clinical classification, and treatment were obtained from the clinical records. Of these 58 patients, 48 were selected for DNA methylation analysis (Additional file 1. Table S1) and peripheral blood mononuclear cells (PBMCs) from the remaining 10 were used for droplet-based scRNA-seq analysis (Additional file 2. Table S2). The control population for the DNA methylation analysis comprised 11 healthy donors (HDs) recruited at the Blood Bank of Vall d’Hebron University Hospital. Table 1 summarizes the characteristics and clinical data from patients included in the DNA methylation analysis. We included an additional group of 14 patients from the same hospital for DNA methylation and expression validation, including 9 severe COVID-19 patients and 5 mild COVID-19 patients, together with an additional group of 6 HDs. The validation cohort was collected during February 2022, applying the same selection criteria as for the discovery cohort. For the validation cohort, we only included non-vaccinated patients, to match the vaccination status with that of the patients collected in the initial phase of the study. Clinical information corresponding to the new cohort is also included in Additional file 1. Table S1 (validation cohort).
This study was approved by the Clinical Research Ethics Committees of Hospital Universitari Germans Trias i Pujol (PI-20–129) and Vall d’Hebron University Hospital (PR(AG)282/2020), both of which adhered to the principles set out in the WMA Declaration of Helsinki. Informed consent was obtained from all patients before their inclusion.

Table 1 Summary of patient cohort for DNA methylation analysis

Monocyte purification and DNA isolation

PBMCs were obtained from peripheral blood by Ficoll gradient using Lymphocyte Isolation Solution (Rafer, Zaragoza, Spain) from 48 of the severe COVID-19 patients and 11 HDs. Once PBMCs were isolated, all samples were stored at − 150 °C in 10% DMSO in fetal bovine serum (FBS) until monocyte purification. The monocyte population was isolated by flow cytometry (FacsAria Fusion, BD, Becton Dickinson, San Jose, CA, USA). PBMCs were stained with CD14-APC-Vio770 (Miltenyi Biotec) and CD15-FITC (Miltenyi Biotec) in staining buffer (MACS) for 20 min. A gating strategy was employed to eliminate cell debris, doublets, and DAPI + cells. CD14 and CD15 antibodies were used to isolate CD14 + CD15 − cells. Purified cells were pelleted and stored at − 80 °C.

After monocyte isolation, DNA was isolated using the AllPrep DNA/RNA/miRNA Universal Kit (Qiagen) following the manufacturer’s instructions.

DNA methylation profiling

Bisulfite (BS) conversion was performed using the EZ-96 DNA Methylation™ Kit (Zymo Research, CA, USA) according to the manufacturer’s instructions. Five hundred nanograms of BS-converted DNA was hybridized on Infinium Methylation EPIC BeadChip arrays (Illumina, Inc., San Diego, CA, USA) to analyze DNA methylation. These arrays enable > 850,000 methylation sites per sample to be assessed at single-nucleotide resolution, covering 99% of the reference sequence (RefSeq) genes.

Each methylation data point was obtained from a combination of the Cy3 and Cy5 fluorescent intensities from the methylated and unmethylated alleles. Background intensity computed from a set of negative controls was subtracted from each data point. For representation and further analysis, we used beta (β) and M values. Beta is the ratio of methylated probe intensity to overall intensity (the sum of the methylated and unmethylated probe intensities). M is calculated as the log2 ratio of the intensities of the methylated and unmethylated probes. For statistical purposes, the use of M is more appropriate since β-values are severely heteroscedastic for highly methylated and unmethylated CpG sites. Raw DNA methylation data are available at GEO, with accession number GSE188573 [30].
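As a minimal illustration of the two definitions above, the following Python sketch computes β and M values from a pair of probe intensities. The function names and the optional `offset` argument are assumptions made for this example and are not part of the pipeline used in this study; with `offset=0` the functions reduce to the plain ratios described in the text.

```python
import math


def beta_value(meth, unmeth, offset=0.0):
    """Ratio of methylated intensity to overall intensity.

    `offset` is an optional stabilizing constant (an assumption of this
    sketch); offset=0 gives the plain ratio described in the text.
    """
    return meth / (meth + unmeth + offset)


def m_value(meth, unmeth, offset=0.0):
    """log2 ratio of methylated to unmethylated intensity.

    With offset=0 both intensities must be strictly positive.
    """
    return math.log2((meth + offset) / (unmeth + offset))


def beta_to_m(beta):
    """Convert a beta value (0 < beta < 1) into the matching M value."""
    return math.log2(beta / (1.0 - beta))


if __name__ == "__main__":
    # Hypothetical intensities for a single probe.
    print(beta_value(3000, 1000))
    print(m_value(3000, 1000))
    print(beta_to_m(0.75))
```

For example, intensities of 3000 (methylated) and 1000 (unmethylated) give β = 0.75 and M = log2(3), approximately 1.58. The logit-style conversion in `beta_to_m` shows why M values spread out near β = 0 and β = 1, where β-values are compressed.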

Quality control, data normalization, and statistical analysis of DMPs

Quality control and analysis of EPIC arrays were performed using ShinyÉPICo [31], a graphical pipeline that uses minfi (v1.36) [32] for normalization, and limma (v3.46) [33] for analyzing differentially methylated positions. ShinyÉPICo is available as an R package at the Bioconductor (http://bioconductor.org/packages/shinyepico/) and GitHub (https://github.com/omorante/shinyepico) sites. We used the BS conversion control probe test included in ShinyÉPICo to determine whether the conversion rate was above the quality threshold of 2 established by Illumina. The threshold was calculated from the information of the BS conversion control probes of the EPIC arrays. When the BS conversion reaction is successful, control probes display strong signal in the red channel, whereas if the sample has unconverted DNA, control probes have a strong signal in the green channel. The red/green ratio for each control position was calculated for each sample.

CpH and SNP loci were removed by the Noob method, followed by quantile normalization. Sex chromosomes (X and Y) were also excluded from the analysis to avoid discordant information among samples. Although the data were generated in a single batch and randomized, we applied batch effect correction. Sex and age of the donors were included as covariates, to minimize confounding effects due to differences between the median age of the patient and control cohorts, and the Trend and Robust options were implemented in the eBayes moderated t-test analysis. To compare healthy donors with the entire severe COVID-19 patient cohort, we identified differentially methylated CpG sites by using t-tests and a method with defined empirical array weights, included in the limma package [33], and selecting CpGs with a false discovery rate (FDR) of < 0.05 and an absolute Δβ of > 0.15. To test the effects of potential changes in monocyte subset proportions, we also included this information as a covariate, and performed the same analysis as above, but including only those samples for which such information was available.
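The selection thresholds described above can be sketched as follows. This is an illustrative Python example, not the limma-based analysis itself: the Benjamini-Hochberg adjustment is assumed here as the FDR procedure, Δβ is assumed to be the patient mean minus the HD mean, and the function names and input layout are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_dmps(cpg_ids, pvalues, delta_betas,
                fdr_cutoff=0.05, min_abs_delta=0.15):
    """Split CpGs passing FDR < cutoff and |delta beta| > threshold.

    Assumes delta beta = mean(patients) - mean(healthy donors), so a
    positive value is reported as hypermethylated.
    """
    fdrs = benjamini_hochberg(pvalues)
    hyper, hypo = [], []
    for cpg, fdr, delta in zip(cpg_ids, fdrs, delta_betas):
        if fdr < fdr_cutoff and abs(delta) > min_abs_delta:
            (hyper if delta > 0 else hypo).append(cpg)
    return hyper, hypo


if __name__ == "__main__":
    # Hypothetical probe identifiers and statistics.
    ids = ["cg1", "cg2", "cg3", "cg4"]
    pvals = [0.001, 0.002, 0.5, 0.003]
    deltas = [0.30, -0.20, 0.40, 0.05]
    print(select_dmps(ids, pvals, deltas))
```

In this toy input, `cg3` fails the FDR filter despite its large Δβ, and `cg4` fails the effect-size filter despite its small p-value, which is why both criteria are applied together.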

We used the iEVORA package (v1.9.1) [34] to identify differentially variable positions (DVPs). This algorithm identifies differences in variance using Bartlett’s test (FDR < 0.001), followed by the comparison of means using a t-test (p < 0.05) to regularize the variability test, which is overly sensitive to single outliers. For the analysis in Fig. 2, we calculated Spearman’s correlation coefficient (rho) to measure the association of two variables and thereby identify CpG sites in which DNA methylation was correlated with SOFA in patients with severe COVID-19. We selected the CpG sites for which the absolute value of Spearman’s rho was greater than 0.4 and had an associated value of p < 0.01. Principal component analysis (PCA) of β-values from ShinyÉPICo was used to determine the correlations of PCs with clinical variables such as dexamethasone treatment, obesity, and hypertension. Pearson correlation coefficients between numerical variables and PCs were calculated. Categorical variables were entered in a linear model together with the PCs, which were considered as a function of the variable.
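The SOFA correlation step can be illustrated with a short Python sketch. It computes Spearman’s rho as the Pearson correlation of tie-averaged ranks and applies only the rho threshold; the p < 0.01 filter is deliberately not computed here. All function names and the input dictionary layout are assumptions of this example.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(vx * vy)


def correlated_cpgs(methylation_by_cpg, sofa_scores, min_abs_rho=0.4):
    """Return CpGs with rho > threshold and with rho < -threshold."""
    positive, inverse = [], []
    for cpg, betas in methylation_by_cpg.items():
        rho = spearman_rho(betas, sofa_scores)
        if rho > min_abs_rho:
            positive.append(cpg)
        elif rho < -min_abs_rho:
            inverse.append(cpg)
    return positive, inverse
```

A complete analysis would also attach a p-value to each rho and keep only sites with p < 0.01, for instance through a statistics library.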

Gene ontology, transcription factor enrichment, and chromatin state discovery and characterization

The GREAT (v3.0.0) online tool (http://great.stanford.edu/public/html) was used for gene ontology (GO) analysis, in which genomic regions were annotated by applying adapted basal and extension settings (5 kb upstream, 5 kb downstream, 1000 kb plus distal). GRCh37 (UCSC hg19, Feb. 2009) was used as the alignment genome reference. Annotated CpGs in the EPIC array were used as background. GO terms were considered significant for a > twofold change and an FDR < 0.05. Enrichment is represented as − log2(FDR). GO categories with p < 0.05 were considered significantly enriched. GO analysis of differentially expressed genes (DEGs) was carried out using the online Enrichr gene ontology analysis tool (https://maayanlab.cloud/Enrichr/). GO categories with a > twofold change and an FDR < 0.05 were considered significantly enriched.

We used the findMotifsGenome.pl tool from the motif discovery HOMER software (v4.10.3) to analyze motif enrichment [35]. A flanking window of ± 250 bp from each CpG was applied, and CpGs annotated in the EPIC array were used as background. To determine the location relative to a CpG island (CGI), we used “hg19_cpgs” annotation in the annotatr (v1.8) R package. The statistical test used for the enrichment in these analyses was Fisher’s exact test. Chromatin functional state enrichment of DMPs was measured using public PBMC data from the Roadmap Epigenomics Project (http://www.roadmapepigenomics.org/) generated with ChromHMM (v1.23) [36]. Enrichments were calculated with Fisher’s exact test using array annotation as background regions. Only significantly enriched states are shown.

Heatmaps and PCA plots

Heatmaps of DMPs were generated with functions available in the ComplexHeatmap (v2.11.1) and gplots (v3.1.3) R packages. We used PCA for the low-dimensional analyses. PCA projection matrices were calculated with R’s prcomp function, and visual representations of PCs were plotted with the ggfortify package (v4.1.4).

Whole-genome bisulfite sequencing (WGBS) analysis

DNA methylation values of Ensembl Regulatory Build regions of progenitor cells such as hematopoietic stem cell (HSC), multipotent progenitor (MPP), common myeloid progenitor (CMP), granulocyte macrophage progenitor (GMP), and control monocytes were extracted from public whole-genome bisulfite sequencing (WGBS) (GSE87197) [37]. Using GenomicRanges (v1.42.0) and based on genomic location, the overlap of the hypermethylated DMPs observed in COVID-19 compared with HD was determined with the Ensembl Regulatory Regions from the hematopoietic precursors and monocytes. For this analysis, all DNA methylation data were annotated with respect to the GRCh38 human genome reference.

Single-cell capture

PBMCs from 10 ICU patients were used to generate single-cell gel beads-in-emulsion (GEMs) (Additional file 2. Table S2). Cells were then washed three times and counted. For samples with low viability (< 90%), we performed Ficoll separation in an Eppendorf tube to eliminate dead cells and increase cell viability. For samples with greater than 90% viability, we filtered using a Flowmi strainer and counted the cells before loading into 10X Chromium to generate single-cell GEMs, following the manufacturer’s instructions. We loaded 50,000 cells per pool, including a total of 4 patients per pool. Datasets from patients and HDs are available as h5ad files (https://www.COVID-19cellatlas.org/index.patient.html) (Additional file 2. Table S2). In parallel, genomic DNA was isolated from the same 10 PBMC samples for genotyping and subsequent donor deconvolution (as described in [38]) using a Maxwell® 16 Blood DNA Purification Kit from Promega following the manufacturer’s instructions.

scRNA-seq cell type identification and annotation

Single-cell transcriptome data from COVID-19 patients were quantified and aligned using Cell Ranger (v3.1) with the GRCh38 genome concatenated to the SARS-CoV-2 genome as a reference. Thereafter, cells from pooled samples were deconvolved and demultiplexed using Souporcell (v3.0) [39], yielding a genotype variant that allows donor identity to be matched across different samples. This additionally enabled the removal of doublet cells that could not be explained by any single genotype. Scrublet (v0.2.3) [40] was subsequently employed to further filter out other doublets based on computed doublet scores. Specifically, Student’s t-test (p < 0.01) after Bonferroni correction was used within fine-grained sub-clustering of each initial cluster produced by the Leiden algorithm. Data were not denoised because no significant contamination or ambient RNA was present. Previously described scRNA-seq datasets of HDs [41] were then integrated for comparison using single-cell variational inference (scVI) [42] with a generative model of 64 latent variables and 500 iterations. More specifically, scVI employs a negative binomial model using raw counts, selecting 5000 highly variable genes to produce the latent variables. Defined cell-cycle phase-specific genes in the Seurat package (v4.1.0) [43] were excluded from these to reduce the dependence of clustering on cell-cycle effects. Data were subsequently analyzed using Scanpy (v1.9.1) [44] following the recommended standard practices. For quality control, genes expressed in fewer than three cells, and cells with fewer than 200 genes or more than 20% mitochondrial gene content, were removed prior to downstream analysis. Data were normalized (scanpy.pp.normalize_per_cell, scaling factor = 10,000) and log2-transformed (scanpy.pp.log1p). For gene expression visualization (e.g., heat maps), data were further scaled (scanpy.pp.scale, maximum value = 10).
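The quality-control and normalization rules stated above can be made concrete with a small pure-Python sketch that operates on per-cell count dictionaries rather than on Scanpy objects. It is not the Scanpy code used in the study: the function names are hypothetical, and the `MT-` prefix for mitochondrial genes and the use of log2(1 + x) for the log transformation are assumptions of this example.

```python
import math


def genes_detected_in_enough_cells(cells, min_cells=3):
    """Return genes with a non-zero count in at least `min_cells` cells."""
    n_cells_per_gene = {}
    for counts in cells:
        for gene, count in counts.items():
            if count > 0:
                n_cells_per_gene[gene] = n_cells_per_gene.get(gene, 0) + 1
    return {g for g, n in n_cells_per_gene.items() if n >= min_cells}


def passes_cell_qc(counts, min_genes=200, max_mito_fraction=0.20,
                   mito_prefix="MT-"):
    """Keep cells with >= min_genes detected and <= 20% mitochondrial counts.

    `mito_prefix` is an assumed naming convention for mitochondrial genes.
    """
    total = sum(counts.values())
    if total == 0:
        return False
    detected = sum(1 for c in counts.values() if c > 0)
    mito = sum(c for g, c in counts.items() if g.startswith(mito_prefix))
    return detected >= min_genes and mito / total <= max_mito_fraction


def normalize_log1p(counts, target_sum=10_000, base=2):
    """Scale a cell to `target_sum` total counts, then take log(1 + x).

    base=2 is an assumption chosen to mirror the "log2-transformed"
    wording in the text.
    """
    total = sum(counts.values())
    return {g: math.log1p(c * target_sum / total) / math.log(base)
            for g, c in counts.items()}
```

Applying the cell filter before normalization keeps low-quality cells from influencing any downstream scaling, which matches the order in which the steps are listed in the text.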

Cell type clustering and annotation

The resulting latent representation from the integrated datasets was used to compute the neighborhood graph (scanpy.pp.neighbors), then the Louvain clustering algorithm (scanpy.tl.louvain, resolution = 3) and Uniform Manifold Approximation and Projection (UMAP) visualization (scanpy.tl.umap) were employed. Cell type annotations were manually refined using literature-driven, cell-specific marker genes. Identified residual RBCs from incomplete PBMC isolation were excluded before further analysis, as recommended [45].

Differential gene expression and transcription factor-enrichment analysis

Differential gene expression between COVID-19 patients and healthy individuals (FDR < 0.05) was analyzed using the limma package [46]. To predict transcription factor (TF) involvement in transcriptomic changes, we used DoRothEA (Discriminant Regulon Expression Analysis) v2 tool [47]. Regulons with a confidence score of A–C were analyzed, and cases with p < 0.05 and a normalized enrichment score (NES) of ± 2 were considered significantly enriched.

Cell–cell communication

Based on the differential expression analysis, CellPhoneDB [48] v3 (www.CellPhoneDB.org) was used to infer changes in ligand/receptor interactions between the identified cell types in COVID-19 versus HD. Specifically, instead of random shuffling, as used in the previously described statistical method, differentially expressed genes (FDR < 0.05) were used to select interactions that were significantly enriched in either severe COVID-19 patients or healthy individuals relative to the other group. An interaction was considered enriched if at least one of the two partners (ligand or receptor) was differentially expressed, and if both partners were expressed by at least 10% of the interacting cells.
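The enrichment rule described above reduces to a simple predicate, sketched here in Python. This is not the CellPhoneDB implementation: the function name and arguments are hypothetical, and the example assumes that the ligand fraction is measured in the sending cell type and the receptor fraction in the receiving cell type.

```python
def interaction_is_enriched(ligand, receptor, de_genes,
                            ligand_fraction, receptor_fraction,
                            min_fraction=0.10):
    """Apply the two criteria stated in the text to one ligand/receptor pair.

    de_genes: set of differentially expressed genes (FDR < 0.05).
    ligand_fraction / receptor_fraction: fraction of cells expressing each
    partner in its respective cell type (assumed layout for this sketch).
    """
    any_partner_de = ligand in de_genes or receptor in de_genes
    both_expressed = (ligand_fraction >= min_fraction
                      and receptor_fraction >= min_fraction)
    return any_partner_de and both_expressed


if __name__ == "__main__":
    # Hypothetical example values.
    de = {"CCL2"}
    print(interaction_is_enriched("CCL2", "CCR2", de, 0.25, 0.12))
    print(interaction_is_enriched("CCL2", "CCR2", de, 0.25, 0.05))
```

In the second call the pair is rejected even though the ligand is differentially expressed, because the receptor is detected in fewer than 10% of the cells.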

Bisulfite pyrosequencing

The EZ DNA Methylation-Gold kit (Zymo Research) was used to BS-convert 500 ng of genomic DNA following the manufacturer’s instructions. BS-treated DNA was PCR-amplified using the IMMOLASE DNA polymerase kit (Bioline). Primers used for the PCR were designed with PyroMark Assay Design 2.0 software (Qiagen) (Additional file 3. Table S3). PCR amplicons were pyrosequenced with the PyroMark Q24 system and analyzed with PyroMark Q48 Autoprep (Qiagen).

Real-time quantitative polymerase chain reaction (RT-qPCR)

The Transcriptor First Strand cDNA Synthesis Kit (Roche) was used to convert 250 ng of total RNA to cDNA following the manufacturer’s instructions. RT-qPCR primers were designed with Primer3 software [49] (Additional file 3. Table S3). RT-qPCR reactions were prepared with LightCycler 480 SYBR Green I Master (Roche) according to the manufacturer’s instructions and analyzed with a LightCycler 480 instrument (Roche).

Flow cytometry

To study the cell surface markers on monocytes (CD14 +), PBMCs from the 10 patients used for single-cell analysis and 10 HDs were defrosted and washed once with PBS. After blocking for non-specific binding with Fc block (BD Pharmingen) for 5 min on ice, cells were incubated for 20 min on ice using staining buffer (PBS with 4% fetal bovine serum and 0.4% EDTA). Antibodies used included the following: CD14-FITC (Miltenyi Biotec), CD85-PEvio770 (Miltenyi Biotec), CD172a-APC (Miltenyi Biotec), CD97-PEvio770 (Miltenyi Biotec), CD31-PE (Miltenyi Biotec), CD366-PEvio615 (Miltenyi Biotec), CD62L-APC (Miltenyi Biotec), CD58-PE (Miltenyi Biotec), CD191-PEvio770 (Miltenyi Biotec), CD52-PEvio615 (Miltenyi Biotec), CD48-APC (Miltenyi Biotec). Cells were analyzed in a BD FACSCanto-II flow cytometer.

Statistical analysis

All statistical analyses were done with R v4.0.2. Box, bar, violin, bubble, and line plots were generated using functions from the ggplot2 (v3.3.6) and ggpubr (v4.0) packages. Mean normalized DNA methylation values were compared using a two-tailed test. Multivariate frequency distributions were calculated using Fisher’s exact test. The levels of significance are indicated as: * p < 0.05, ** p < 0.01, *** p < 0.001, and **** p < 0.0001.

Results

DNA methylome remodeling in peripheral blood monocytes of severe COVID-19 patients

To directly inspect epigenetic alterations in peripheral blood monocytes in severe COVID-19, we isolated CD14 + CD15 − cells from 59 blood samples, comprising 48 severe COVID-19 patients and 11 healthy donors (HDs), and performed DNA methylation profiling (Fig. 1A, Table 1, and Additional file 1. Table S1). For cell sorting, we first separated live cells from debris, then extracted singlets and isolated CD14 + CD15 − cells to avoid neutrophil contamination (Fig. 1B) [50]. Since we selected CD14 + cells, the purification procedure only included classical (CM) (CD14 + CD16 −) and intermediate monocytes (IM) (CD14 + CD16 +), excluding the non-classical monocyte (NCM) (CD14lowCD16 +) subpopulation, which in healthy individuals corresponds to around 5% of the total monocyte compartment [51]. Negative selection using CD15 was necessary because the frequency of neutrophils increases significantly in severe COVID-19 patients and activated neutrophils are not separated in the Ficoll step [52] (Additional file 4. Figure S1A-S1C). To confirm the purity of our monocytes, we performed FACS analysis and obtained an average purity of 98% (example in Additional file 4. Figure S1D). Studies in various other inflammatory diseases have shown that the proportions of monocytes can shift between the three major subsets, i.e., CM, IM, and NCM. For instance, it has been shown that severe COVID-19 patients feature reduced NCM and IM populations [53]. The analysis of monocyte subpopulations in our cohort showed a significant increase in the CM population and a decrease in the NCM population (Additional file 4. Figure S1E-S1F). Since we purified CD14 + monocytes, our study only included CM and IM.

Fig. 1

Analysis of DNA methylation in blood monocytes of severe COVID-19 patients. A Scheme depicting the cohort and workflow for monocyte purification of severe COVID-19 patients and controls and DNA methylation analysis. B Representative flow cytometry profile, indicating sorting gates used to purify monocytes from HD and COVID-19 patients’ peripheral blood. C Scaled DNA methylation (z-score) heatmap of differentially methylated positions (DMPs) between HDs (blue bar above) and COVID-19 patients (red bar above). Significant DMPs were obtained by applying a filter of FDR < 0.05 and an absolute differential of beta value (Δβ) > 0.15. A scale is shown on the right, in which blue and red indicate lower and higher levels of methylation, respectively. Clinical and treatment data of COVID-19 patients are represented above the heatmap. SOFA, IL-6 level, and days in the ICU scales are shown on the right of the panel. D Principal component analysis (PCA) of the DMPs. HDs and severe COVID-19 patients are illustrated as blue and red dots, respectively. E Gene ontology of hypermethylated and hypomethylated DMPs. Selected significant functional categories (FDR < 0.05) are shown. F Bubble plot of TF motifs enriched on hypermethylated and hypomethylated DMPs. Bubbles are colored according to their TF family; their size corresponds to the FDR rank. G Box plot of individual DNA methylation values of CpG from hypermethylated and hypomethylated clusters (β-values), with the name of the closest gene and the position relative to the transcription start site

We performed DNA methylation profiling of isolated monocytes and identified 2211 differentially methylated positions (DMPs) of CpGs in severe COVID-19 patients compared with HDs (FDR < 0.05 and absolute Δβ > 0.15). Of these, 1773 were hypermethylated (hypermethylated cluster) and 438 were hypomethylated (hypomethylated cluster) (Fig. 1C and Additional file 5. Table S4). PCA of these DMPs showed that the two groups of monocytes (COVID-19 and HD) separated along the first principal component axis (Fig. 1D). We obtained similar results when we included monocyte subpopulation proportions as a covariate in the analysis (overlap, p < 0.0001) (Additional file 6. Figure S2A). No significant differences (FDR < 0.05) were observed within COVID-19 patients separated by their condition (obesity, hypertension, days admitted to the ICU, and exitus/death) or treatment with dexamethasone (Additional file 1. Table S1). None of the abovementioned conditions was significantly correlated with the DNA methylation changes (Additional file 6. Figure S2B). This was also apparent from the PCA showing the overlap of patients with different clinical parameters (Additional file 6. Figure S2C).

The analysis of the genomic functional features of the DMPs in the hypermethylated and hypomethylated clusters (Additional file 6. Figure S2D) using public data from monocytes [36] revealed an enrichment in promoters and enhancers. This is consistent with the proposed roles of DNA methylation in regulatory elements [54].

Gene ontology (GO) analysis of the two DMP clusters revealed several functional categories associated with the immune response to viral infection (Fig. 1E). In the hypermethylated cluster, we observed enrichment of categories such as natural killer-mediated immunity, leukocyte migration, adaptive immune response, and positive regulation of interferon gamma production. We also observed hypermethylation in the MHC-II protein complex that was related to antigen presentation. In addition, we found an enrichment of the positive regulation of MAP kinase activity category (Fig. 1E, top panel). In the hypomethylated cluster, we observed enrichment of functional categories relevant to viral infection, including defense response to virus and negative regulation of viral genome replication. Importantly, the hypomethylated cluster also featured enrichment of functional categories related to type I interferon (IFN) signaling and MHC class II (Fig. 1E, bottom panel).

Transcription factor (TF) binding motif enrichment analysis, in 250-bp windows surrounding DMPs, revealed overrepresentation of TFs of significance to the immune response. The hypermethylated cluster CpGs displayed enrichment of binding motifs of IRFs and ETS TF families, which are linked to IFN changes (Fig. 1F, left panel). Motifs of the bZIP TF family like AP-1, Jun, Fosl2, Fra1, and Fra2 were enriched in the hypomethylated cluster. DMPs of the hypomethylated cluster were also enriched in motifs of the signal transducer and activator of transcription (STAT) family members STAT1 and STAT3. We also detected enrichment of the glucocorticoid response element (GRE) in the hypomethylated cluster (Fig. 1F, right panel). Given these results, we hypothesized that pharmacological treatment with glucocorticoids (GCs) in severe COVID-19 patients in the intensive care unit (ICU) might influence DNA methylation in monocytes. To test this possibility, we performed limma analysis and subsequent binding motif enrichment after separating COVID-19 patients into two groups, with and without GC treatment. Both groups of patients exhibited significant enrichment of GRE motifs in the hypomethylated cluster (Additional file 6. Figure S2E), suggesting that the endogenous production of GCs in severe COVID-19 patients could participate in the hypomethylation through GRE. However, given the size of the cohort, we cannot rule out the possibility that pharmacological treatment also influences DNA methylation changes; it therefore remains a potential confounding factor.

Inspection of the individual genes within or in the vicinity of the DMPs revealed several genes with functions essential to the viral immune response. The list of relevant genes included IRF8, RUNX3, CD226, and CD83 in the hypermethylated cluster, and STAT1, FOXO3, IL1R1, and OAS1 in the hypomethylated cluster (Fig. 1G). We validated these results using bisulfite pyrosequencing in a new cohort of severe COVID-19 patients (Additional file 6. Figure S2F). Interestingly, these changes were also observed in mild COVID-19 patients (Additional file 6. Figure S2F). IRF8, IL1R1, and CD83 are associated with the IFN response. CD226 encodes a glycoprotein related to monocyte, NK, and T cell adhesion. This glycoprotein has been shown to be involved in the cytotoxicity of these cells and is known to be altered in COVID-19 patients [13]. STAT1 is associated with the cytokine response, which, in turn, is related to IL1R1. The latter is the receptor of interleukin 1, which participates in the inflammatory response and is strongly expressed in severe COVID-19 patients [14]. OAS1 is induced by interferons and activates latent RNase, causing viral RNA degradation, which could be related to the identification of the category negative regulation of viral genome replication in the GO analysis.

Monocytes from severe COVID-19 patients display increased DNA methylation variability

Overall, our DNA methylation analysis showed greater heterogeneity (differentially variable positions, DVPs) in the profiles from COVID-19 patient monocytes than in those from HDs (Additional file 6. Figure S2G). We then examined the relationship between the DNA methylation profiles and the Sequential Organ Failure Assessment (SOFA) score, which is used in ICUs to calculate organ damage. The score ranges from 0 to 24, with values greater than 6 being associated with a significant increase in the risk of mortality [55]. Using Spearman’s correlation coefficient to assess the association of specific hypermethylated or hypomethylated CpGs with SOFA, we identified 1375 CpG sites whose methylation levels positively correlated with SOFA (increased methylation) (rho > 0.4 and p < 0.01) and 1497 CpG sites with an inverse correlation with SOFA (decreased methylation) (rho <  − 0.4 and p < 0.01) (Fig. 2A and Additional file 7. Table S5). The mean normalized DNA methylation profiles of increased and decreased methylation CpG sites were similar in patients with low SOFA (< 6) and in healthy controls in an unsupervised representation but differed between the low and high SOFA score groups (Fig. 2B). These results suggest that changes in DNA methylation are concomitantly exacerbated for higher SOFA scores, which are associated with a poor prognosis. Several CpGs correlating with SOFA were associated with genes, such as IL17R, SOCS5, and PCDHA5, that are involved in T cell-mediated inflammatory responses (Fig. 2C). Others, like FOXG1 and CDC20B, are associated with DNA damage. GO analysis revealed that changes in DNA methylation that are concomitant with SOFA show an overrepresentation of terms associated with IFNγ, production of the molecular mediator involved in inflammatory response, viral gene expression, the B cell proliferation involved in immune response, and Th1 cell cytokine production (Fig. 2D).

Fig. 2

DNA methylation changes in COVID-19 monocytes parallel organ damage. A Heatmap of severe COVID-19 patients with DNA methylation ordered by SOFA score, including all CpG-containing probes significantly correlated with the SOFA score (Spearman correlation coefficient rho > 0.4, p < 0.01). Clinical and treatment data of COVID-19 patients are shown above the heatmap. SOFA, IL-6 level, and days in the ICU scales are shown on the right of the panel. B Normalized methylation values from heatmap showing overall group methylation of HD. Patients with SOFA ≤ 6 are indicated as SOFA LOW; those with SOFA > 6 are indicated as SOFA HIGH. C DNA methylation levels (β-values) of selected individual CpGs (and closest genes) in hypermethylated and hypomethylated sets and their position relative to the transcription start site. D Gene ontology (GO) analysis of hypermethylated and hypomethylated DMPs, analyzed with the GREAT online tool, in which CpG annotation in the EPIC array was used as background. Statistical significance: * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001

DNA methylation alterations in monocytes of severe COVID-19 patients significantly associate with those derived from patients with bacterial sepsis, myeloid differentiation, and the influence of inflammatory cytokines

To better characterize the impact of DNA methylation changes in COVID-19, we compared the DMPs from severe COVID-19 patients with those obtained from monocytes derived from patients with bacterial sepsis in a previous study by our team [27], given that severe COVID-19 can be considered a form of sepsis [56]. To this end, we first estimated the DNA methylation values of the DMPs from the sepsis versus HD comparison in our previous sepsis study (accession number GSE138074) [27] using the data from the severe COVID-19 methylation dataset. Overall, we found significant enrichment in the hypermethylation and hypomethylation clusters (Fig. 3A). We also calculated the odds ratio of the overlap between these two datasets and found a strong enrichment of the hyper-DMPs in COVID-19 relative to those in sepsis (FDR ≤ 2.22 × 10^−16) and of the hypo-DMPs (FDR ≤ 2.22 × 10^−16) (Fig. 3B). We also confirmed an enrichment in introns and depletion in promoters relative to the background when testing the genomic location of the DMPs common to both COVID-19 and sepsis (Fig. 3C and Additional file 8. Figure S3A). DMPs located in introns are often localized in enhancer regions involved in long-distance regulation [54].
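The overlap test mentioned above can be sketched as a 2×2 Fisher's exact test. The example below is illustrative only: the counts are hypothetical rather than taken from the study, the function name is an assumption, the odds ratio is the simple sample estimate with a Wald 95% confidence interval (statistical packages may instead report a conditional maximum-likelihood estimate), and the p-value is one-sided for enrichment.

```python
from math import comb, exp, log, sqrt


def fisher_overlap(a, b, c, d):
    """Enrichment of the overlap between two CpG sets drawn from a common background.

    a: CpGs in both sets          b: CpGs in set A only
    c: CpGs in set B only         d: background CpGs in neither set
    Assumes all four cells are > 0 (needed for the Wald interval).
    Returns (odds_ratio, (ci_low, ci_high), one_sided_p).
    """
    odds_ratio = (a * d) / (b * c)

    # Wald 95% confidence interval on the log odds ratio.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = exp(log(odds_ratio) - 1.96 * se)
    ci_high = exp(log(odds_ratio) + 1.96 * se)

    # One-sided Fisher p-value: probability of an overlap of at least `a`
    # under the hypergeometric null with fixed margins.
    n = a + b + c + d
    row1 = a + b   # size of set A
    col1 = a + c   # size of set B
    total = comb(n, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return odds_ratio, (ci_low, ci_high), tail / total


# Hypothetical counts: 1000 background probes, set A = 100, set B = 50, overlap = 30.
or_, (lo, hi), p = fisher_overlap(a=30, b=70, c=20, d=880)
print(f"odds ratio = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f}), p = {p:.3g}")
```

When hypermethylated and hypomethylated sets are tested separately, as in Fig. 3B, the resulting p-values would then be adjusted for multiple testing to obtain FDR values.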

Fig. 3

Comparative analysis of DNA methylation in blood monocytes of severe COVID-19 and bacterial sepsis patients. A Violin plot representing the mean methylation state of the DMPs found in the comparison between HDs and sepsis patients with β-values obtained from severe COVID-19 patients. B Fisher’s exact test showing the odds ratio ± 95% confidence interval of the overlap between DMPs found in monocytes from bacterial sepsis patients and DMPs in monocytes from COVID-19 patients. C Proportions of the genomic locations (in relation to genes) of DMPs in COVID-19 and sepsis; Bg., background, EPIC probes. D Venn diagram of the overlap of COVID-19 DMPs identified by the comparison of HDs and severe COVID-19 patients with DMPs identified by the comparison between HDs and sepsis patients, separating hypermethylated and hypomethylated DMPs. E Gene ontology analysis of hypermethylated and hypomethylated overlapping DMPs identified in the previous comparison. Selected significant categories (p < 0.05) are shown. F TF binding motif analysis of shared hypermethylated and hypomethylated DMPs identified in the comparisons of HDs with COVID-19 patients and of HDs with sepsis patients. The panel shows the fold change (FC) and TF family. Boxes with black outlines indicate TF binding motifs with FDR < 0.05. G Box-plot showing the DNA methylation values of individual CpGs (together with the name of the closest gene and its position relative to the transcription start site) from the hypermethylated and hypomethylated clusters from both COVID-19 and sepsis. H Scaled DNA methylation heatmap of regions from the whole-genome bisulfite sequencing (WGBS) data of hematopoietic stem cells (HSCs), multipotent progenitors (MPPs), common myeloid progenitors (CMPs), and granulocyte macrophage progenitors (GMPs) that overlap with the genomic location of the 1772 hypermethylated DMPs identified in the COVID-19 vs. HDs comparison. Statistical significance: * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001

We then determined that the two datasets had 362 hypermethylated and 92 hypomethylated CpGs in common (Fig. 3D), corresponding to 51% of the total DMPs of the sepsis patients (Additional file 8. Figure S3B). GO analysis of the shared DMPs revealed significant enrichment in functional terms related to host response, including regulation of NK cells, inflammatory response, and leukocyte chemotaxis (Additional file 8. Figure S3C). Shared hypermethylated CpGs were enriched in functional categories related to cell signaling, such as the JAK-STAT and MAPK pathways, which could be involved in the reduction of the inflammatory response, and the IL15- and IL12-mediated signaling pathways, which are related to cytokine production and Th1 proliferation (Fig. 3E, left panel). Shared hypomethylated CpGs were enriched in functional categories responsible for regulating the inflammatory response, such as negative regulation of IL-1 production and positive regulation of macrophage activation. In concordance with the hypermethylated cluster, we also observed negative regulation of IFNα production (Fig. 3E, right panel). Of note, severe COVID-19-specific DMPs were enriched in functional categories related to viral infection, such as the defense response to virus, and to impairment of the antigen-presenting process, which seems to be specific to COVID-19 infection [13, 23] (Additional file 8. Figure S3D).

Inspection of TF binding motifs corresponding to the DMPs shared between the two groups, separating the shared hypermethylated and hypomethylated CpG sets, revealed IRF family transcription factors like IRF1, IRF2, IRF3, and IRF8 in the shared hypermethylated CpG set. These are well-established regulators of the type I IFN system and are common to viral and bacterial infections [57]. We also detected enrichment of the ETS transcription factors that are regulated by MAPK proteins, which were enriched in the GO analysis (Fig. 3F). In the shared hypomethylated set, we noted enrichment of STAT3 and TFs from bZIP AP-1, like Jun, and other bZIPs, like CEBP. Interestingly, GRE was also present in the shared hypomethylated cluster (Fig. 3F). This suggests the influence of GC in the acquisition of aberrant methylation profiles in COVID-19 and sepsis. Individual genes associated with the COVID-19/sepsis shared hypermethylated and hypomethylated CpGs include type I IFN-related genes, like IRF2, and others, such as IL1A and CCR2, that are involved in inflammatory processes and monocyte chemotaxis, respectively (Fig. 3G). We also identified several genes among the shared hypomethylated set, like CD163, SOCS1, and IL10, that have been associated with the acquisition of tolerogenic properties in monocytes [58] (Fig. 3G).

In both infections, systemic inflammation could be responsible for part of the DNA methylation changes that arise in monocytes. To address this possibility, we examined the DNA methylation levels of the hypomethylated and hypermethylated CpGs of severe COVID-19 and sepsis patients in monocytes isolated from healthy donor PBMCs that had been treated in vitro with inflammatory cytokines like IFNα, IFNγ, and TNFα [26] (accession number GSE134425). This analysis revealed several significant changes following the trends for both COVID-19 and sepsis (Additional file 8. Figure S3E), suggesting that these inflammatory cytokines, which are elevated in these patients, could influence the monocyte DNA methylomes.

An alternative explanation for the observed changes in severe COVID-19 monocyte methylomes could be that DNA methylation changes reflect alterations during myeloid/monocyte differentiation or the release of immature or aberrant monocytes. This has been described in severe COVID-19 cases [13, 59,60,13, 21, 23, 76]. In this regard, our study provides the first instance of DNA methylome profiling in a specific immune cell type in COVID-19 patients.

Our data revealed that most DNA methylation changes in monocytes derived from severe COVID-19 patients occurred in genomic sites enriched in PU.1 binding motifs, consistent with earlier studies showing its role as a pioneer TF directly recruiting TET2 and DNMT3b [77]. In our case, most DNA methylation changes occurred in genes related to cytokines, MHC class II proteins, and IFN signaling. Similar results about the defective function of MHC-II molecules and activation of apoptosis pathways were obtained in single-cell atlas studies of PBMCs from severe COVID-19 patients [13, 20, 93, 94] and sepsis [63]. Release of immature myeloid cells from the bone marrow in severe COVID-19 is reminiscent of emergency myelopoiesis [95]. This is a well-known phenomenon, characterized by the mobilization of immature myeloid cells to restore functional immune cells, and by its contribution to the dysfunction of innate immunity [96]. In fact, a proportion of the hypermethylated CpGs in monocytes from severe COVID-19 patients overlap with regions that become demethylated during myeloid differentiation. This suggests that part of the hypermethylated CpG sites in isolated peripheral blood CD14+ cells might be associated with aberrantly differentiated monocytes released into the bloodstream in severe COVID-19 patients. However, the small numbers of CD34+ cells in the PBMC fraction of COVID-19 patients and the lack of CD14+ cells in this subset suggest no interference with our results for CD14+ CD15− cells isolated with our method.

The relationship between DNA methylation and gene expression is complex. DNA methylation patterns are cell-type-specific and are established during dynamic differentiation events by site-specific remodeling at regulatory regions [97]. In general, methylation of CpGs located in gene promoters, first exons, and introns is negatively correlated with gene expression [98]. The analysis of our data shows that there is an inverse correlation between the CpG methylation changes and the expression levels of the closest genes. The comparison of the inferred TFs associated with DNA methylation changes and gene expression changes shows common factors like IRF2 and IRF3, which regulate downregulated genes and hypermethylated CpGs. In this context, it is possible that reduced levels of IFN regulatory factor IRF3 or defective IRF7 function reduces the level of IFNα/β gene expression, increasing the sensitivity to viral infection [12, 99].

Finally, analysis of cell–cell communication has revealed potential relationships between DNA methylation changes and altered communication of monocytes and other immune cells (e.g., T, plasma B and NK cells). Our data suggest the potential reduction of interactions between monocytes and NK cells through CD160, which mediates the antibody-dependent cell-mediated cytotoxicity that is essential for IFNγ production [67]. The potentially greater interaction between monocytes and Tregs through multiple ligand and receptor pairs is an interesting finding, since Tregs are immunosuppressive cells responsible for maintaining immune homeostasis [100]. In any case, CellPhoneDB is useful for inferring cell–cell communication events; however, additional experiments would be necessary to validate the interactions and the activation of downstream signaling pathways.

In our study, we could not determine whether the observed DNA methylation alterations in COVID-19 were the cause or the consequence of the changes in gene expression. The analysis of mild COVID-19 cases, in which the DNA methylation and expression levels of a few genes differed in how closely they resembled those of severe COVID-19 cases, suggests that there are cases where expression changes might anticipate DNA methylation changes. In any case, it is reasonable to propose that some DNA methylation changes help perpetuate dysregulated immune responses.

Some limitations of our study include the size of the cohort and the unequal numbers of individuals administered particular drugs in the different patient groups, which could have affected the COVID-19 data. However, despite these limitations, we found no significant differences among severe COVID-19 patients with respect to the time they were admitted to the ICU or began to receive treatment. This suggests that these DNA methylation changes are a fairly general occurrence in the context of COVID-19. Another limitation concerns the cell population analyzed, since the method for monocyte isolation captures two populations, CM and IM, one of which (CM) is expanded in the patient group. However, the analysis including the monocyte subsets as a covariate indicates that there are no major differences. Finally, in the comparison with DNA methylation of progenitor cells, it is important to note that the DMPs were overlapped with genomic regions rather than single-base data, and further analyses would be required.

Future studies would benefit from having access to a larger cohort in which it is possible to identify significant links between alterations and drug treatments. Incorporating mild and asymptomatic cases would improve our ability to dissect drug- and severity-related specificity in relation to DNA methylation changes. As is the case for other medical conditions, the analysis of DNA methylation changes would be very likely to help predict disease severity, progression, and recovery.

Conclusions

Our study provides unique insights into the epigenetic alterations of monocytes in severe COVID-19. We have shown that peripheral blood monocytes from severe COVID-19 patients undergo changes in their DNA methylomes, in parallel with changes in expression, and that these significantly overlap with those found in patients with sepsis. We have also shown DNA methylation changes are associated with organ dysfunction. Finally, our results suggest a relationship between DNA methylation changes in COVID-19 patients and changes that occur during myeloid differentiation and others that can be induced by pro-inflammatory cytokines. CellPhoneDB analysis also suggests that alterations in immune cell crosstalk can contribute to transcriptional reprogramming in monocytes, which involves dysregulation of interferon-related genes and genes associated with antigen presentation and chemotaxis.