Introduction

The novel coronavirus, SARS-CoV-2, which caused the outbreak of coronavirus disease 2019 (COVID-19), has infected over 79 million people and caused nearly one million deaths in the U.S. alone [1]. Immunocompromised populations, such as people living with HIV (PLWH), generally have a higher risk of severe COVID-19 outcomes, such as hospitalization or death [2]. Several multi-center population-based studies have suggested an elevated risk of severe COVID-19 outcomes among PLWH, particularly those with pronounced immunodeficiency (e.g., CD4 counts < 200 cells/mm3), compared to non-PLWH [3,4,5].

Older age and a variety of comorbidities (e.g., myocardial infarction, diabetes, liver disease, renal disease) are well-recognized risk factors for severe COVID-19 outcomes in both the general population and PLWH [5, 6]. HIV infection, even when well controlled, interacts with aging and comorbidities. HIV infection may accelerate the aging process [7, 8], and PLWH also tend to develop a variety of age-associated comorbidities (e.g., cancer and cardiovascular disease) at a younger age than non-PLWH [9,10,11]. While HIV and the aging process may directly increase the risk of severe COVID-19 outcomes, the accelerated aging of PLWH may also lead to a higher prevalence of age-associated comorbidities, all of which may affect the risk of adverse COVID-19 outcomes. Given these interwoven effects of HIV, aging, and comorbidities, it is challenging to untangle their individual contributions to the COVID-19 outcomes observed among PLWH. Most existing evidence of elevated risks of COVID-19-related death or hospitalization among PLWH comes from observational studies [3,4,5, 12]. Although some studies demonstrated a heightened risk of severe COVID-19 outcomes in older PLWH [5], others included only young PLWH [4] or were not able to adequately control for aging and comorbidities that may affect COVID-19 outcomes [13]. Thus, inferences cannot yet be made regarding the unique impact of accelerated aging or of HIV itself on COVID-19 outcomes [4].

The overall life span of PLWH has lengthened owing to the worldwide implementation and early initiation of antiretroviral therapy (ART), but many PLWH display signs that resemble premature aging, reflected in higher rates of age-related comorbidities. In other words, the biological age of PLWH might be greater than their chronological age. Epigenetic aging (i.e., the aging process associated with altered epigenetic mechanisms of gene regulation) is recognized as a key to understanding biological aging. Using epigenetic models of aging, Gross and colleagues found that chronic HIV infection led to an average age advancement of 4.9 years (95% CI 3.4–7.1 years) in PLWH on ART [14, 15]. A similar age advancement was observed in another study (5.2 years; range: 3.7–6.7 years) [16]. Among untreated PLWH, the age advancement can reach up to 14 years [17]. Biological aging was more pronounced in PLWH who had CD4 counts < 200 cells/mm3, with an additional 1.8–3.6 years of aging compared to PLWH with CD4 counts > 200 cells/mm3 before ART, and some accelerated aging persisted even after two years on ART [18]. Studies have reported inconsistent results on the association between viral load (VL) and epigenetic age acceleration [15, 16]. Given the accumulating evidence of accelerated aging in PLWH, teasing out the impact of accelerated aging when investigating the association between HIV infection and COVID-19 outcomes could further our understanding of the role of HIV in COVID-19 outcomes among PLWH, especially those with pronounced immunodeficiency.

To disentangle the complex relationship between age, HIV, and COVID-19 clinical outcomes while controlling for sex, race, ethnicity, and comorbidities, we conducted a retrospective cohort analysis to examine: (1) the effect of HIV infection on severe COVID-19 outcomes (e.g., hospitalization, all-cause death) using both exact matching and propensity score matching (PSM); (2) the impact of age advancement of HIV on COVID-19 outcomes among PLWH using PSM with varying age differences between PLWH and non-PLWH; and (3) whether the effects of HIV age advancement on COVID-19 outcomes differed by immunity level.

Methods

Data Source and Study Population

The National COVID Cohort Collaborative (N3C) collects and harmonizes electronic health records (EHR) data from a large number of clinical sites across the nation and is the largest cohort of COVID-19 cases in the U.S. [19]. N3C adopted the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) version 5.3.1 as the canonical model to harmonize EHR data built in different formats. Following the N3C COVID-19 diagnosis definition [19,20,21], we defined COVID-19 patients as those who had a positive result from one of the a priori defined tests (including real-time polymerase chain reaction, antigen, and antibody tests) and diagnostic conditions based on relevant ICD codes [20]. In this study, we included all adult (aged ≥ 18 years) COVID-19 cases with any healthcare encounter from 66 clinical sites whose data were deposited into the N3C from January 1, 2020, through October 18, 2021. The dataset also included patients’ historical health conditions and medical records (i.e., “retrospective data”) in the same healthcare system dating back to January 1, 2018 [20]. We excluded patients with missing data on age, sex, race, or ethnicity since these are key variables in our matching process.

PLWH Cohort Definition

PLWH were identified by meeting one of the following criteria: (1) having clinical diagnosis codes (ICD-10, SNOMED, etc.) for HIV, (2) having positive laboratory results indicative of an HIV infection, or (3) having received combination antiretroviral medications. Note that individuals who were only exposed to pre-exposure prophylaxis (PrEP) medications (i.e., FTC + TAF and FTC + TDF) without concurrent HIV diagnostic or laboratory results did not meet the inclusion criteria as PLWH [22]. While source validation is not possible in N3C at this time, our methods parallel those used by others with validated EHR data to identify cohorts of PLWH [23]. Among PLWH, CD4 count and viral load (VL) were defined from the corresponding laboratory tests (see Supplementary Table S1 for the concept set). We retrieved the most recent CD4 count or VL recorded within the 180 days preceding the initial COVID-19 diagnosis.
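The cohort rules above can be illustrated with a minimal Python sketch. It is not the N3C concept-set logic: the function names, the drug abbreviations used as regimen labels, and the choice to treat the diagnosis date itself as inside the 180-day window are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical labels for the two PrEP regimens named in the text.
PREP_REGIMENS = [frozenset({"FTC", "TAF"}), frozenset({"FTC", "TDF"})]

def is_plwh(has_hiv_diagnosis, has_positive_hiv_lab, regimens):
    """Apply the three inclusion criteria; PrEP-only exposure does not qualify.

    `regimens` is a list of sets of drug abbreviations, assumed to already
    represent combination antiretroviral regimens.
    """
    if has_hiv_diagnosis or has_positive_hiv_lab:
        return True
    return any(frozenset(r) not in PREP_REGIMENS for r in regimens)

def most_recent_lab(labs, covid_date, window_days=180):
    """Return the latest (date, value) pair in the window before covid_date.

    Assumption: the window is inclusive of both endpoints.
    Returns None when no result falls inside the window.
    """
    start = covid_date - timedelta(days=window_days)
    eligible = [(d, v) for d, v in labs if start <= d <= covid_date]
    return max(eligible, key=lambda r: r[0]) if eligible else None
```

Under these assumptions, a patient whose only exposure is FTC + TDF with no diagnosis code or positive laboratory result is not classified as PLWH, and a CD4 result older than 180 days is ignored.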

Key Variable Measures

The two severe COVID-19 outcomes of this study were all-cause mortality and any hospitalization. All-cause mortality was defined based on death records in N3C. We defined hospitalization status by ascertaining the visit encounter types as “inpatient visit” or “inpatient critical care facility” or “emergency room and inpatient visit” in the “selected critical visit” table. Pre-COVID comorbidities (between January 1, 2018, and the date of initial COVID-19 diagnosis) were identified by ICD codes in the Charlson Comorbidity Index (CCI) scoring instrument [24]. Demographics (e.g., age at initial COVID-19 diagnosis, sex, race, and ethnicity) were also extracted. Further details on the variable definitions are provided in the Supplementary material.

Cohort Matching Process

Exact Matching

Among COVID-19 cases, we first matched PLWH with non-PLWH using 1:1 exact matching based on age, sex, race, and ethnicity. When more than one non-PLWH met the matching criteria for a case PLWH, we randomly selected one of the non-PLWH as the control case.
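As an illustration of the 1:1 exact matching step, the following Python sketch groups candidate controls by their matching key and draws one at random for each case. The function name, the record layout, the fixed random seed, and matching without replacement are assumptions for the example; the text does not state whether a non-PLWH could serve as a control more than once.

```python
import random
from collections import defaultdict

def exact_match(cases, controls, keys, seed=0):
    """Pair each case with one randomly chosen control sharing the same key values.

    Assumption: matching is done without replacement, so each control is
    used at most once. Cases with no eligible control are left unmatched.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for ctrl in controls:
        pool[tuple(ctrl[k] for k in keys)].append(ctrl)

    pairs = []
    for case in cases:
        candidates = pool.get(tuple(case[k] for k in keys))
        if candidates:
            # Randomly select one eligible control and remove it from the pool.
            pick = candidates.pop(rng.randrange(len(candidates)))
            pairs.append((case, pick))
    return pairs
```

In the setting described here, `keys` would be something like `["age", "sex", "race", "ethnicity"]`, with `cases` holding PLWH records and `controls` holding non-PLWH records.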

PSM with Comparable Age

While exact matching has methodological strengths, it struggles to identify matches when adjusting for a large number of confounders, such as comorbidities (e.g., liver diseases, cancer). In the current study, 1678 (11.05%) PLWH could not be exactly matched with non-PLWH when all 13 comorbidities were considered. Therefore, we adopted a nearest-neighbor PSM approach [25] to retain all PLWH in our analysis when extending the matching process to include selected comorbidities. In the PSM, a propensity score was estimated for each individual from their age, sex, race, ethnicity, and comorbidities using logistic regression, and each PLWH was then matched at a 1:1 ratio with the non-PLWH who had the nearest score.
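The two stages of PSM described above, score estimation and nearest-score matching, can be sketched in plain Python. This is a simplified illustration rather than the pipeline used in the study (which ran in SQL and R): the gradient-ascent fit, the greedy matching order, matching without replacement, and all function names are assumptions, and covariates are assumed to be binary or standardized so that the fixed learning rate is stable.

```python
import math

def predict(w, x):
    """Logistic probability for covariates x given weights w (w[0] is the intercept)."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit a logistic regression by batch gradient ascent on the mean log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def nearest_match(scores, treated):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    Assumption: controls are used without replacement, and treated units
    are processed in input order.
    """
    available = [i for i, t in enumerate(treated) if not t]
    pairs = []
    for i, t in enumerate(treated):
        if not t or not available:
            continue
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        available.remove(j)
        pairs.append((i, j))
    return pairs
```

Here `treated` would flag PLWH (1) versus non-PLWH (0), and `scores` would be `[predict(w, x) for x in X]` after fitting `w` on the HIV indicator. A production analysis would normally rely on an established matching package rather than a hand-written fit.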

PSM Matching with Varying Age Difference

To explore the effects of age advancement in HIV on severe COVID-19 outcomes, we matched PLWH 1:1 with older non-PLWH. Based on the existing literature on accelerated aging among PLWH [14,15,16], we adopted 3 to 7 years as the potential age difference between PLWH and non-PLWH in the PSM. Specifically, we matched each PLWH at age ‘a’ with a non-PLWH at age ‘a + x’, where x ranges from 3 through 7. This PSM process generated 5 sets of matched cohorts with varying age differences between PLWH and non-PLWH, while keeping the other confounders (sex, race, ethnicity, comorbidities) comparable between the two groups.
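One way to realize the 'a' versus 'a + x' pairing is to compute a shifted matching age before the propensity score is estimated, so that a standard matching routine pairs a PLWH with a non-PLWH who is x years older. The sketch below shows only that preprocessing step; the field names (`age`, `hiv`, `match_age`) and the choice to subtract x from non-PLWH ages, rather than add x to PLWH ages, are assumptions and may differ from the implementation used in the study.

```python
def shifted_cohorts(records, shifts=range(3, 8)):
    """Build one dataset per age difference x (3 through 7 by default).

    For each x, non-PLWH receive match_age = age - x while PLWH keep their
    own age, so equal match_age means the non-PLWH is x years older.
    """
    cohorts = {}
    for x in shifts:
        cohorts[x] = [
            dict(r, match_age=r["age"] - (0 if r["hiv"] else x))
            for r in records
        ]
    return cohorts
```

Each of the resulting datasets would then go through the same propensity score estimation and 1:1 matching, using `match_age` in place of chronological age.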

Balance Diagnostics after Matching

To examine the balance of covariate distributions, the standardized mean difference (SMD) of each matched variable between PLWH and non-PLWH was computed as a balance diagnostic for the exact-matched and PSM data. A lower SMD indicates better balance for the given variable. An SMD greater than 0.1 is the recommended threshold for declaring imbalance [26, 27].
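For reference, a commonly used form of the SMD divides the difference in group means by the square root of the average of the two group variances. The sketch below implements that form for a single covariate; whether the study used exactly this variant is an assumption, and the function name is illustrative.

```python
import math
from statistics import mean, variance

def smd(treated, control):
    """Standardized mean difference: (mean_t - mean_c) / sqrt((var_t + var_c) / 2).

    Uses sample variances; returns 0.0 when both groups have zero variance.
    """
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        return 0.0
    return (mean(treated) - mean(control)) / pooled_sd
```

A covariate would be flagged as imbalanced under the 0.1 threshold when `abs(smd(treated, control)) > 0.1`. Binary covariates can be passed as lists of 0 and 1.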

Statistical Analysis

We fitted multivariable logistic regression models on each matched cohort to examine the impact of HIV infection. Because comorbidities were not part of the exact matching, we adjusted for pre-COVID comorbidities in the logistic regression analysis of the exact-matched cohort.

Subgroup analyses by level of CD4 count (CD4 ≥ 200 or CD4 < 200 cells/mm3) and VL suppression status (VL ≥ 200 or VL < 200 copies/mL) were also conducted using similar procedures. Specifically, we repeated the PSM with both comparable age and varying age differences to match PLWH at each CD4 count level and VL suppression status with non-PLWH, and fitted logistic regressions to compare severe COVID-19 outcomes between PLWH at different immunity levels and non-PLWH. For the PSM with varying age differences among PLWH with CD4 < 200 cells/mm3, we further expanded the age difference by an additional 3 years (i.e., from 3 to 10 years), as the literature suggests a faster HIV-accelerated aging process among PLWH with CD4 < 200 cells/mm3 [18]. We conducted sensitivity analyses by repeating the PSM with a 1:2 ratio for the entire PLWH cohort, the CD4 subgroups, and the VL subgroups. Odds ratios (ORs) and 95% confidence intervals (CIs) were estimated. A p-value of less than 0.05 was taken to indicate statistical significance. All analyses were conducted using SQL and R (version 3.5). We created reproducible pipelines on the N3C Data Enclave using the Code Workbook application.
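To make the reported quantities concrete, the sketch below computes an unadjusted OR with a Wald 95% CI from a 2×2 table of exposure by outcome. It is a simplified stand-in for the multivariable logistic regression used in the study: it applies no covariate adjustment, and the function name and cell labels are assumptions.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a, b: exposed (e.g., PLWH) with and without the outcome
    c, d: unexposed (e.g., non-PLWH) with and without the outcome
    All four cells must be positive.
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper
```

An interval that excludes 1 corresponds to a p-value below 0.05 for the Wald test at the default `z`.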

Results

Sample Characteristics

Of the 2,422,864 COVID-19 patients identified between Jan 1, 2020, and Oct 18, 2021, 15,188 were PLWH. A total of 6219 (40.9%) and 4217 (27.8%) PLWH had recent CD4 count and VL records (i.e., within 180 days prior to the initial COVID-19 diagnosis), respectively, with 5347 (86.0%) having CD4 counts ≥ 200 cells/mm3 and 2892 (68.6%) having VL < 200 copies/mL (Fig. 1). The characteristics of the study variables are shown in Tables 1 and 2. In all the PSM datasets (with comparable age or varying age differences), all the key variables were balanced based on SMD measures (Table 3).

Fig. 1 Study participant selection. *Per National COVID Cohort Collaborative policy, we concealed the numbers of patients in each category because some categories included fewer than 20 individuals

Table 1 Characteristics of COVID-19 cases by HIV status in N3C data, January 1, 2020–October 18, 2021
Table 2 Characteristics of COVID-19 cases by levels of CD4 count and viral load
Table 3 Standardized mean difference between PLWH and non-PLWH

Higher Odds of Severe COVID-19 Outcomes with Exact Matching

Using exact matching, PLWH had significantly higher odds of hospitalization (OR: 1.50, 95% CI: 1.42, 1.58) or death (OR: 1.48, 95% CI: 1.29, 1.69) compared to non-PLWH, adjusting for pre-COVID comorbidities. PLWH with CD4 count < 200 cells/mm3 had higher odds of mortality (OR: 2.86, 95% CI: 1.71, 4.80) and hospitalization (OR: 4.61, 95% CI: 3.67, 5.78) compared with non-PLWH. For PLWH with CD4 ≥ 200 cells/mm3, the odds of death were similar between PLWH and non-PLWH (OR: 0.83, 95% CI: 0.65, 1.06) while the odds of hospitalization were higher for PLWH (OR: 1.40, 95% CI: 1.27, 1.54) using exact matching. PLWH with suppressed VL had similar odds of hospitalization (OR: 1.07, 95% CI: 0.94, 1.23), yet lower odds of mortality (OR: 0.69, 95% CI: 0.49, 0.99) compared to non-PLWH. For the PLWH with non-suppressed VL, the odds of hospitalization were higher (OR: 1.51, 95% CI: 1.25, 1.81) while the odds of mortality were similar (OR: 1.00, 95% CI: 0.61, 1.62) compared to non-PLWH.

Higher, but Attenuated Odds of Severe Outcomes with PSM

When using the PSM with comparable age, the odds of all-cause mortality and hospitalization after testing positive for COVID-19 were attenuated but still higher in PLWH (death: OR: 1.33, 95% CI: 1.18, 1.49; hospitalization: OR: 1.36, 95% CI: 1.29, 1.43). PLWH with CD4 < 200 cells/mm3 had even higher odds of mortality and hospitalization (death: OR: 1.82, 95% CI: 1.24, 2.67; hospitalization: OR: 3.36, 95% CI: 2.76, 4.09). In contrast, PLWH with CD4 ≥ 200 cells/mm3 had odds of mortality similar to those of non-PLWH (OR: 0.91, 95% CI: 0.72, 1.14) and higher, though relatively smaller, odds of hospitalization (OR: 1.26, 95% CI: 1.16, 1.37). PLWH with suppressed VL had similar odds of death and hospitalization (death: OR: 0.95, 95% CI: 0.69, 1.31; hospitalization: OR: 1.04, 95% CI: 0.93, 1.18), while PLWH with non-suppressed VL had higher odds of both outcomes (death: OR: 1.85, 95% CI: 1.16, 2.96; hospitalization: OR: 1.59, 95% CI: 1.35, 1.89).

Odds of Severe Outcomes when Considering Age Advancement in HIV

Using PSM with age differences from 3 to 7 years, PLWH showed a higher risk of death until the age difference reached 6 or 7 years (i.e., non-PLWH were 6 or 7 years older than PLWH) (Fig. 2). PLWH had a persistently higher risk of COVID-19-related hospitalization regardless of the predefined age difference (Fig. 3). For PLWH with CD4 < 200 cells/mm3, when they were matched with non-PLWH by varying age differences from 3 to 10 years, the odds of both mortality and hospitalization were persistently higher regardless of age difference (Fig. 4). PLWH with CD4 ≥ 200 cells/mm3 were more likely to be hospitalized than non-PLWH until the age difference reached 6 years, although the risk of mortality was similar at every predefined age difference (Fig. 5). The detailed results are shown in Supplementary Table S2a. For PLWH with suppressed VL, the odds of mortality and hospitalization were not higher than those of non-PLWH at age differences between 3 and 7 years, while PLWH with non-suppressed VL had persistently higher odds of COVID-19 hospitalization compared to non-PLWH. The results of the subgroup analyses by VL suppression status are shown in the Supplementary material (Table S3a-b, Fig. S3).

Fig. 2 OR of mortality using 1:1 PSM data with age differences

Fig. 3 OR of hospitalization using 1:1 PSM data with age differences. Note: The dots are the estimated ORs of the specific outcome. The solid and dashed curves show the OR and 95% CI as the age difference changes

Fig. 4 OR of different COVID-19 outcomes using 1:1 PSM data for PLWH with CD4 < 200 cells/mm3

Fig. 5 OR of different COVID-19 outcomes using 1:1 PSM data for PLWH with CD4 ≥ 200 cells/mm3. Note: The dots are the estimated ORs of the specific outcome. The solid and dashed curves show the OR and 95% CI as the age difference changes

The sensitivity analysis using 1:2 ratio PSM showed results similar to those of the 1:1 matched cohort. Although there was a small difference in the 95% CI for the odds of death at age differences of 4 and 10 years in the low CD4 subgroup analysis, the conclusions regarding significance remained the same. The results of the sensitivity analyses are presented in the Supplementary material (Table S2b and Figs. S1–S2, S3c-d).

Discussion

Using both exact matching and PSM to create a counterfactual comparison group of non-PLWH from real-world EHR data, this study provides robust evidence regarding the role of HIV infection on severe COVID-19 outcomes when the impact of age advancement in HIV is considered. Using age differences in the PSM process to mirror the accelerated aging effect of HIV, our data suggest that the higher risk of severe COVID-19 outcomes in PLWH might be independent of the accelerated aging, especially for mortality.

In addition, the range of age differences at which the risk of severe COVID-19 outcomes became statistically comparable between PLWH and non-PLWH in the current study was similar to the ranges of biological age advancement among PLWH established in the literature [14,15,16]. This consistency in part supports our hypothesis regarding the impact of HIV-accelerated aging on severe COVID-19 outcomes among PLWH. To achieve better clinical outcomes of COVID-19 among PLWH, clinical practice and intervention efforts for COVID-19 positive PLWH need to take accelerated aging into consideration. For example, a relatively younger age threshold could be considered when developing COVID-19 vaccine guidance to promote booster shots for virally non-suppressed PLWH in the future.

Despite the role of age advancement, our data showed that HIV may still independently contribute to severe COVID-19 among PLWH. This was evidenced by the results among PLWH with CD4 < 200 cells/mm3 and VL ≥ 200 copies/mL. The risks of hospitalization and death were higher among these PLWH than among non-PLWH regardless of the predefined range of age differences. Our findings suggest that HIV-induced immunodeficiency may have a specific role in predisposing PLWH to more severe COVID-19 outcomes. These results support the findings from other observational research that PLWH with lower CD4 counts had more severe COVID-19 outcomes [3, 5]. Future studies regarding the mechanisms (e.g., chronic inflammation) of the heightened risk of severe COVID-19 outcomes among immunocompromised PLWH are warranted.

It is methodologically challenging to disentangle the complex relationship among aging, HIV, and COVID-19 clinical outcomes while controlling for a large number of potential confounders. PSM provides an efficient way to explore the independent impact of the exposure of interest using observational data and has been used widely in producing more precise estimation of treatment effects. In this study, we first adopted the standard PSM approach to match PLWH and non-PLWH. To take the age advancement among PLWH into consideration, we revised the matching criteria with varying age differences to create multiple cohorts to address the question “how can we compare PLWH with their counterfactual comparison group (or older counterparts)?” Based on the modified approach, we matched PLWH with older non-PLWH with pre-defined age differences using the propensity score generated based on predefined age difference, sex, race, ethnicity, and comorbidities. With rigorous methodology of exact matching and PSM, our current study has produced robust evidence for a better understanding of the complex relationship among aging, HIV and COVID-19 outcomes. We believe these methodologies have wide-ranging implications for better understanding the impact of accelerated aging in HIV for a variety of outcomes [28].

Our research has several limitations. First, while N3C is the largest COVID-19 EHR repository in the world, the cohort is mostly from the southeast, mid-Atlantic, and mid-west regions of the US; thus, geographical bias may exist in the data. Second, dates of service in the dataset are algorithmically shifted to protect patient privacy, which means the number of patients during our study period might not be accurate, i.e., we are potentially missing patients at the entry or exit time point of this cohort. In addition, data quality varies across contributing sites, and some sites did not upload their most recent data during our study period, which may create systematic bias and potentially skew the analysis. Third, data on some key variables, such as CD4 count and VL, were not available for some patients, which reduced the sample size for the subgroup analyses. In addition, we excluded patients with missing data on the key variables in our matching process (age, sex, race, ethnicity). Although this affected only a very small proportion of patients, it could introduce bias in the findings, as existing data suggest that missingness in race/ethnicity might affect Black, Indigenous, and people of color (BIPOC) more than non-BIPOC communities. Fourth, we only matched on key demographic variables and comorbidities, and other important variables, such as indicators of social determinants of health and treatment histories for these comorbidities, could also influence both accelerated aging and COVID-19 outcomes.

In conclusion, our study, using rigorous methodology, explored the association between HIV infection, HIV-accelerated aging, and severe COVID-19 outcomes while controlling for many potential confounders. We found that PLWH had a greater risk of COVID-19-related all-cause mortality and hospitalization. The risk of mortality among PLWH was comparable to that of non-PLWH who were at least 6 years older. In other words, the elevated risk of COVID-19 mortality among PLWH might be attributed to age advancement, while HIV infection may still affect COVID-19 outcomes independently of advanced aging, especially among PLWH with profound immunodeficiency.