Take-home message

In patients with COVID-19 and severe hypoxaemia, dexamethasone 12 mg compared with 6 mg did not result in statistically significant improvements in mortality or health-related quality of life at 180 days, but the results were most compatible with benefit from the higher dose.

Introduction

Critical coronavirus disease 2019 (COVID-19) is characterised by severe pulmonary inflammation and high rates of death despite anti-inflammatory treatment [1]. Survivors from critical COVID-19 suffer from reduced health-related quality of life (HRQoL), including physical and mental problems, for months after hospital discharge [2].

Dexamethasone 6 mg daily for up to 10 days is recommended for patients with critical COVID-19 [3] based on the results of a meta-analysis of 7 randomised trials reporting reduced short-term mortality with the use of systemic corticosteroids [1]. Subsequently, the results of the COVID STEROID 2 trial suggested that dexamethasone 12 mg as compared with 6 mg may result in more days alive without life support at 28 days in patients with COVID-19 and severe hypoxaemia [adjusted mean difference 1.3 days (95% confidence interval 0.0–2.6)] [4]. In a pre-planned Bayesian analysis of the COVID STEROID 2 trial, the probability of benefit with 12 mg versus 6 mg was 94% for days alive without life support at 28 days and between 84 and 96% for all secondary outcomes assessed up to 90 days [5].

The effects of dexamethasone dosing on longer-term outcomes, including HRQoL, in patients with COVID-19 and severe hypoxaemia are important for patients and should inform clinicians, guideline committee members and policymakers. Here we present the 180-day mortality and HRQoL results, which were pre-specified secondary outcome measures of the COVID STEROID 2 trial [6].

Methods

Trial design

The COVID STEROID 2 trial was an investigator-initiated, international, parallel-group, stratified, blinded randomised clinical trial. The trial protocol was approved by the relevant medicine agencies and ethics committees [4]. The trial protocol, statistical analysis plan and the primary report have all been published [4, 6] [also presented in the Electronic Supplementary Material (ESM 1)]. We prepared this report in accordance with the CONSORT checklist (ESM 2).

Trial sites and patients

Patients were enrolled between August 27, 2020 and May 20, 2021 at 31 sites in 26 hospitals in Denmark, India, Sweden, and Switzerland.

Eligible patients were 18 years or older, had confirmed SARS-CoV-2 infection and received (i) supplementary oxygen at a flow of at least 10 L/min independent of delivery system, (ii) non-invasive ventilation or continuous positive airway pressure for hypoxaemia, or (iii) invasive mechanical ventilation. We mainly excluded patients for whom consent could not be obtained, who had received systemic corticosteroids for COVID-19 for 5 days or more or systemic corticosteroids for indications other than COVID-19 in doses higher than 6 mg dexamethasone equivalents, and those with invasive fungal infection or active tuberculosis. The exclusion criteria and trial definitions are presented in the protocol (ESM 1) and in the primary publication [4, 6]. We obtained informed consent from the patients or their legal surrogates according to national regulations before enrolment. If consent was withdrawn or not granted, permission was sought from the patient or relatives to continue collection and use of trial data.

Procedures

Enrolled patients were randomised 1:1 to intravenous dexamethasone 12 mg or 6 mg (as dexamethasone phosphate 14.4 mg or 7.2 mg, respectively, in isotonic saline to a total bolus volume of 5 ml in identical syringes prepared by unblinded trial staff from shelf medication at each hospital) for up to 10 days. Treatment assignments were concealed from patients, their relatives, clinical staff, and the investigators assessing the outcomes.

Outcome measures at 180 days

All-cause mortality and HRQoL at 180 days after randomisation were pre-specified secondary outcome measures in the protocol (ESM 1) [4, 6]. Data were obtained from patients’ medical records and by contacting patients or relatives by phone or e-mail. As soon as possible after day 180, surviving patients were interviewed over the phone by blinded, trained and qualified trial staff using the EuroQol (EQ)-5D-5L questionnaire [7]. The EQ-5D-5L consists of a descriptive system and the EQ visual analogue scale (VAS). The descriptive system comprises 5 dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. The respondents were asked to tick one of 5 boxes (5 levels) for each domain that best described their health today: no, slight, moderate, severe, or extreme problems [7]. On EQ VAS, the respondents were asked to mark how good or bad their health was today on a scale from 100 (‘the best health you can imagine’) to zero (‘the worst health you can imagine’). If a patient was unable to answer, a relative was approached to do so on behalf of the patient; if so, the version of the questionnaire for proxies was used. Trial sites made several attempts for at least 4 weeks to obtain answers from patients and relatives, a process that was centrally monitored by the coordinating centres in Denmark and India, who supported and encouraged sites to obtain replies. The defined outcome measures were the EQ-5D-5L index value [a summary score based on the 5 domains, reflecting health state according to the preferences of the general population; it ranges from 1.0 (perfect health) to values below zero (health states valued as worse than death), with zero defined as a state equivalent to death] and EQ VAS [7]. We used the country-specific value sets to calculate the index values for Danish [8], Indian [9] and Swedish [10] patients and the German one [11] for those enrolled in Switzerland because there is no Swiss value set available. We also calculated index values using the Danish value set for all patients (as most patients were enrolled in Denmark) as recommended [12].
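
To illustrate how an index value is derived from a 5-dimension health state, the following minimal Python sketch assumes a simple additive value set in which each level of each dimension carries a decrement that is subtracted from 1.0. The decrements, the dictionary name and the function name are hypothetical and chosen only for illustration; they are not taken from the Danish, Indian, Swedish or German value sets [8–11], which may also contain further terms.

```python
# Hypothetical decrements for levels 1..5 of each dimension.
# These numbers are invented for illustration and are NOT a published value set.
HYPOTHETICAL_DECREMENTS = {
    "mobility":           [0.0, 0.05, 0.08, 0.20, 0.30],
    "self_care":          [0.0, 0.04, 0.06, 0.15, 0.25],
    "usual_activities":   [0.0, 0.04, 0.07, 0.15, 0.22],
    "pain_discomfort":    [0.0, 0.05, 0.09, 0.25, 0.35],
    "anxiety_depression": [0.0, 0.06, 0.10, 0.25, 0.30],
}


def index_value(profile, decrements=HYPOTHETICAL_DECREMENTS, deceased=False):
    """Return an index value for a profile mapping each dimension to a level 1-5.

    Deceased patients are assigned zero, as in the primary analysis.
    """
    if deceased:
        return 0.0
    total = 0.0
    for dimension, levels in decrements.items():
        level = profile[dimension]
        if level not in (1, 2, 3, 4, 5):
            raise ValueError(f"{dimension}: level must be 1-5, got {level!r}")
        total += levels[level - 1]  # level 1 means no problems, so no decrement
    return 1.0 - total


full_health = {dimension: 1 for dimension in HYPOTHETICAL_DECREMENTS}
worst_state = {dimension: 5 for dimension in HYPOTHETICAL_DECREMENTS}
print(index_value(full_health), index_value(worst_state))
```

With these invented decrements, a state with no problems in any dimension gives 1.0, whereas extreme problems in all dimensions give a value below zero, i.e. a state valued as worse than death.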

Statistical analysis

The analyses were done according to the predefined statistical analysis plan for the 180-day outcomes (ESM 1) in the intention-to-treat population defined as all randomised patients (n = 1000) excluding the 18 patients who withdrew consent for the use of any data resulting in 982 patients to be analysed. We present descriptive baseline data (stratified by intervention group, 180-day mortality and HRQoL-respondent status) and outcome data as medians with interquartile ranges (IQRs; for numeric data) and numbers with percentages (for categorical data).

As per the analysis plan (ESM 1), we performed multiple imputations of the HRQoL data, because more than 5% of the patients had missing data (6.1% non-respondents for EQ-5D-5L index values and 5.9% for EQ VAS scores). We used predictive mean matching with 25 datasets imputed separately in each treatment group. We included all stratification variables, all variables used in the HRQoL analyses, important baseline prognostic variables (age, co-morbidities, use of life supportive measures at baseline), and all outcomes available in the imputation model (ESM 1). We also analysed HRQoL in a dataset with best–worst and worst–best imputation of missing data (using the highest or lowest observed values) and in a complete case dataset.
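
The best–worst and worst–best sensitivity analyses can be sketched as follows. This is a minimal illustration, not the trial code: the function name is hypothetical, missing values are represented by `None`, and we assume here that "best–worst" means that missing values in the 12 mg group are replaced by the highest observed value and those in the 6 mg group by the lowest, with the reverse for "worst–best".

```python
def extreme_case_impute(values_12mg, values_6mg, scenario):
    """Replace missing values (None) with the highest or lowest observed value.

    Assumption for this sketch: "best-worst" favours the 12 mg group and
    "worst-best" favours the 6 mg group.
    """
    observed = [v for v in values_12mg + values_6mg if v is not None]
    best, worst = max(observed), min(observed)
    if scenario == "best-worst":
        fill_12mg, fill_6mg = best, worst
    elif scenario == "worst-best":
        fill_12mg, fill_6mg = worst, best
    else:
        raise ValueError("scenario must be 'best-worst' or 'worst-best'")
    imputed_12mg = [fill_12mg if v is None else v for v in values_12mg]
    imputed_6mg = [fill_6mg if v is None else v for v in values_6mg]
    return imputed_12mg, imputed_6mg


# Toy EQ-5D-5L index values; None marks a non-respondent.
toy_12mg = [0.8, None, 0.0]
toy_6mg = [None, 0.6, 1.0]
print(extreme_case_impute(toy_12mg, toy_6mg, "best-worst"))
print(extreme_case_impute(toy_12mg, toy_6mg, "worst-best"))
```

If the conclusions are similar under both extreme scenarios, the missing data are unlikely to have driven the result.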

Analyses were adjusted for the stratification variables (trial site, age below 70 years and use of invasive mechanical ventilation). We analysed landmark mortality at 180 days using G-computation with an adjusted logistic regression model and 50,000 bootstrap resamples, with results presented as adjusted relative risks and risk differences (as the planned log-binomial models did not converge). These were supplemented with unadjusted relative risks and risk differences (using the planned log-binomial models) and Fisher’s exact tests. Time to death was presented as Kaplan–Meier survival curves and compared post hoc using Cox regression with results presented as an unadjusted hazard ratio as in the primary report [4]. The differences in adjusted means of EQ-5D-5L index values and EQ VAS scores were analysed using the Kryger Jensen and Lange test [4].
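
The principle of G-computation with bootstrap confidence intervals can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the trial analysis: the logistic regression model is replaced by the observed risk in each treatment-by-stratum cell (a saturated outcome model), a single binary stratum stands in for the stratification variables, the data are synthetic, and 1000 rather than 50,000 resamples are drawn. All function and variable names are hypothetical.

```python
import random


def g_computation(rows):
    """rows: (treated, stratum, died) tuples. Returns (risk difference, relative risk)."""
    # Outcome model: observed risk in each (treatment, stratum) cell.
    cells = {}
    for treated, stratum, died in rows:
        n, deaths = cells.get((treated, stratum), (0, 0))
        cells[(treated, stratum)] = (n + 1, deaths + died)

    def predicted_risk(treated, stratum):
        n, deaths = cells[(treated, stratum)]
        return deaths / n

    # Predict every patient's risk under each assignment and average.
    risk_1 = sum(predicted_risk(1, s) for _, s, _ in rows) / len(rows)
    risk_0 = sum(predicted_risk(0, s) for _, s, _ in rows) / len(rows)
    return risk_1 - risk_0, risk_1 / risk_0


def bootstrap_ci(rows, n_resamples=1000, level=0.99, seed=2020):
    """Percentile bootstrap interval for the standardised risk difference."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(rows, k=len(rows))
        try:
            estimates.append(g_computation(resample)[0])
        except (KeyError, ZeroDivisionError):
            continue  # skip rare resamples with an empty cell or zero risk
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * len(estimates))]
    upper = estimates[int((1 - tail) * len(estimates)) - 1]
    return lower, upper


def make_rows(treated, stratum, n, deaths):
    return [(treated, stratum, 1)] * deaths + [(treated, stratum, 0)] * (n - deaths)


# Synthetic data: stratum 0 and stratum 1, each with a treated and a control arm.
rows = (make_rows(1, 0, 200, 50) + make_rows(0, 0, 200, 60)
        + make_rows(1, 1, 100, 50) + make_rows(0, 1, 100, 55))
risk_difference, relative_risk = g_computation(rows)
ci_low, ci_high = bootstrap_ci(rows)
print(risk_difference, relative_risk, (ci_low, ci_high))
```

The key step is that each patient's risk is predicted twice, once under each treatment assignment, so that the averaged risks refer to the same covariate distribution.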

Results

We obtained vital status 180 days after randomisation for 963 (98.1%) of the 982 patients in the intention-to-treat population (Fig. 1). We obtained data for EQ-5D-5L index values for 922 (93.9%) patients and for EQ VAS scores for 924 (94.1%) patients at a median of 187 days (interquartile range 182–201) after randomisation in the 12 mg group and 186 days (182–202) in the 6 mg group. The HRQoL questionnaire was answered by relatives in 36 of 300 (12%) respondents in the 12 mg group and by 37 of 276 (13%) in the 6 mg group.

Fig. 1

Patient flow in the COVID STEROID 2 trial. The details up to 90 days were presented in the primary report [4]. Eighteen patients withdrew consent for the use of any data (12 patients before the first dosing of trial medication and 6 after); the intention-to-treat (ITT) population therefore consisted of 982 patients in total. There were patient withdrawals at three levels because of repeated follow-up of patients. *The primary HRQoL analyses were done in the ITT population (n = 982) with deceased patients assigned zero and missing data (n = 60 for EQ-5D-5L index values and n = 58 for EQ VAS scores) multiply imputed

Table 1 reports the baseline characteristics of all patients in the intention-to-treat population by status at 180 days: patients who had died at 180 days, the respondents to the HRQoL assessment and those who had missing HRQoL data. There appeared to be no major differences in the characteristics between the 12 mg and 6 mg groups in any of the populations. Overall, the patients received dexamethasone for a median of 1 day (1–2) before randomisation and 7 days (6–9) after randomisation.

Table 1 Baseline characteristics in all patients in the intention-to-treat population by status at 180 days and intervention group (12 mg or 6 mg of dexamethasone) in the COVID STEROID 2 trial

180-day mortality

At 180 days after randomisation, 164 of 486 (33.7%) patients in the 12 mg group had died compared to 184 of 477 (38.6%) patients in the 6 mg group (adjusted risk difference − 4.3%; 99% CI − 11.7 to 3 and adjusted relative risk 0.89; 0.72–1.09; P = 0.13) (Table 2 and Table S1, ESM 3). Figure 2A presents the Kaplan–Meier mortality curves up to 180 days in the two intervention groups.
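
As an arithmetic check, the crude (unadjusted) risk difference and relative risk can be recomputed from the counts reported above. The sketch below uses simple Wald-type 99% intervals chosen for illustration; it is not the G-computation or log-binomial analysis of the trial, so its intervals need not match those in Table 2, and the function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def unadjusted_effects(deaths_a, n_a, deaths_b, n_b, level=0.99):
    """Crude risk difference and relative risk (group a vs. b) with Wald-type intervals."""
    risk_a, risk_b = deaths_a / n_a, deaths_b / n_b
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)

    rd = risk_a - risk_b
    se_rd = sqrt(risk_a * (1 - risk_a) / n_a + risk_b * (1 - risk_b) / n_b)

    rr = risk_a / risk_b
    # Standard error of log(RR) by the delta method.
    se_log_rr = sqrt(1 / deaths_a - 1 / n_a + 1 / deaths_b - 1 / n_b)

    return {
        "rd": rd,
        "rd_ci": (rd - z * se_rd, rd + z * se_rd),
        "rr": rr,
        "rr_ci": (exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)),
    }


# 12 mg group: 164 deaths of 486; 6 mg group: 184 deaths of 477.
result = unadjusted_effects(164, 486, 184, 477)
print(result)
```

The crude risk difference is about 4.8 percentage points in favour of 12 mg, close to the adjusted estimate, and both crude intervals include no difference.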

Table 2 Outcome measures at 180 days
Fig. 2

Time to death or censoring and distribution of HRQoL data at 180 days in the two intervention groups. A Mortality curves in the two intervention groups up to 180 days. Patients who withdrew consent for further data registration or were lost to follow-up were censored at the time of the withdrawal or loss to follow-up. The time to death was compared post hoc using Cox regression with results presented as an unadjusted hazard ratio (HR) with 99% confidence interval (CI) and P value. B Distribution of the HRQoL data as horizontally stacked proportions in the two intervention groups. Patients who died within 180 days after randomisation were assigned the value 0, corresponding to a health state equal to being dead for EQ-5D-5L index values and the worst possible value for EQ VAS. Data from non-respondents were multiply imputed (n = 60 for the index values and n = 58 for EQ VAS scores). Red represents worse outcomes, and blue represents better outcomes. For EQ-5D-5L index values, < 1% of the values in each group in the imputed datasets were below 0, corresponding to health states worse than being dead. These values are displayed together with the value zero
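
The handling of censored patients in curves such as those in panel A can be illustrated with a minimal Kaplan–Meier estimator. The function name and the toy data are hypothetical and serve only to show how deaths and censoring enter the estimate; cumulative mortality at a given day is 1 minus the estimated survival.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up in days; events: 1 = died, 0 = censored.
    Returns a list of (day, survival probability) at each day with a death.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for day in sorted(set(times)):
        deaths = sum(1 for t, e in zip(times, events) if t == day and e == 1)
        leaving = sum(1 for t in times if t == day)  # deaths plus censored
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((day, survival))
        at_risk -= leaving  # censored patients leave the risk set without an event
    return curve


# Toy data: one patient censored at day 10 and one followed to day 180.
toy_times = [5, 10, 10, 20, 180]
toy_events = [1, 0, 1, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
print(km_curve)
```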

Health-related quality of life

In the primary analysis including the patients who had died at 180 days and those with missing data (multiply imputed), median EQ-5D-5L index values for patients who received 12 mg vs. 6 mg were 0.80 vs. 0.67 [adjusted mean difference 0.06 (99% CI −0.01 to 0.12); P = 0.10] (Table 2 and Fig. 2B). The median EQ VAS scores were 65 vs. 55 [adjusted mean difference 4 (−3 to 10); P = 0.22] (Table 2 and Fig. 2B). The results did not change noticeably in the sensitivity analyses (Table S1, ESM 3). In 180-day survivors, the EQ-5D-5L index values and EQ VAS scores were similar between the 12 mg and 6 mg groups (Table 2). The data from the 180-day survivors for each of the 5 domains in EQ-5D-5L are presented in Fig. 3 and Table S2 in ESM 3.

Fig. 3

Distribution of single HRQoL domain levels among the 180-day survivors in the two intervention groups. Values are from the responding survivors only (patients (n = 503) and relatives on behalf of patients (n = 73; for these 73 patients, relatives were unable to respond for one patient in the 12 mg group in the usual activities domain and for one patient in the 12 mg group in the anxiety/depression domain)). Patients or relatives answered one of 5 levels (no problems or slight, moderate, severe, or extreme problems) for each of the 5 domains in the EQ-5D-5L survey. The corresponding numeric data are presented in Table S2, ESM 3

Discussion

In this international, randomised clinical trial of patients with COVID-19 and severe hypoxaemia, we observed no statistically significant differences in mortality or HRQoL at 180 days among patients assigned to dexamethasone 12 mg versus 6 mg for up to 10 days. Our estimate of the effect of dexamethasone 12 mg vs 6 mg on mortality at 180 days was consistent with a 12% absolute reduction to a 3% absolute increase. Taken together, the results are in line with those observed at 28- and 90-day follow-up in our trial [4].

At least three trials have assessed the effects of a higher daily dose of dexamethasone versus the recommended 6 mg in patients with severe or critical COVID-19. A higher dose of dexamethasone may improve short-term outcomes among COVID-19 patients receiving oxygen supplementation [14], in those with severe hypoxaemia [4, 5], and those with acute respiratory distress syndrome (ARDS) [15]. In our trial, the survival curves of the two intervention groups separated between days 20 and 60 with no apparent changes before or after that. The reason for this cannot be assessed directly from our data or analyses, but the point estimates of all outcomes assessed in this time period favoured the 12 mg group, including days alive without life support and the occurrence of serious adverse reactions/events [4, 5]. Uncertainty remains because none of these 3 trials observed statistically significant improvements in patient-important outcomes, e.g. mortality, days alive out of hospital or HRQoL, with higher daily doses of dexamethasone as compared with the 6 mg dose. In addition, there was heterogeneity in settings, disease severity, use of co-interventions, outcome measures and time of follow-up. Importantly, fewer than 1300 patients were included in the three trials in total.

The HRQoL values observed in the survivors in both intervention groups in our trial were high as compared to those observed in other studies of COVID-19 survivors [2]. In previous studies of COVID-19 patients after hospital discharge in the UK, Norway, Belgium, and Iran, both EQ-5D-5L index values and EQ VAS scores appeared lower than those observed in our patients. In those studies, populations of mixed disease severity were surveyed 4–10 weeks after hospital discharge. A UK study assessed HRQoL in COVID-19 patients who had been in critical care, 3 to 7 months (median 135 days) after hospital discharge, and found both EQ-5D-5L index values and EQ VAS scores that appeared lower than those observed by us [16]. The reasons for these potential differences are not clear, but our HRQoL data were obtained in the context of a clinical trial that had a large sample size, long follow-up and high response rate. Also, we allowed relatives to answer and imputed missing data. Compared to a previous corticosteroid trial (ADRENAL) in patients with septic shock [17], both the EQ-5D-5L index values and EQ VAS scores appeared higher in our trial patients, which may be explained by differences in patient populations. The patterns in impairment among the 5 domains appeared somewhat similar between the two trials (more moderate and severe problems in the usual activities and pain/discomfort domains and fewer problems in the self-care domain) [17].

There are several strengths to our trial. Our trial is the first large interventional trial to report long-term outcomes in patients with COVID-19. It was investigator-initiated, blinded, enrolled patients in both Europe and India, and 94 to 98% of patients had outcome data reported and analysed. We assessed HRQoL by generic scales, EQ-5D-5L and EQ VAS, which have been widely used in critically ill patients [18]. These factors increase the internal and external validity of our results.

Our results come with some limitations. First, while EQ-5D-5L has been used in COVID-19 survivors [2], it has not been fully validated in this population. Second, as no value set is available for Switzerland, we used the German value set to calculate index values for Swiss patients. Of note, a sensitivity analysis using the Danish value set for all patients did not change the results, indicating consistent effects across countries in the trial. Third, as 6% of patients had missing HRQoL outcome data, we performed multiple imputation and did the final analyses in the imputed datasets supplemented with best–worst, worst–best, and complete case analyses according to the protocol and recommendations for the handling of missing data [6, 19]. Even though 6% missingness is low for studies of HRQoL, this may have affected the results. Fourth, in the primary analyses of HRQoL, we included deceased patients according to the intention-to-treat principle and assigned them the value zero. While zero in EQ-5D-5L index values is valued by the public as a health state equal to being dead, this is not the case for EQ VAS. In general, there is no optimal solution to this problem, but in a trial with a potential difference in mortality at HRQoL follow-up, analysing HRQoL in survivors only may bias results [20]. We therefore post hoc analysed HRQoL in survivors only to ease interpretation. Fifth, only 10% of our trial patients received interleukin-6-receptor antagonists (IL-6-RA) at baseline, which reduces the generalisability of the results to settings where IL-6-RA are used.

For clinicians who use higher rather than the standard dose of dexamethasone for patients with COVID-19 and severe hypoxaemia, our results should be reassuring because all results were mostly compatible with benefit from 12 mg, and because we could reject at the 99% confidence level an absolute increase in mortality of 3% or more, absolute reductions in EQ-5D-5L index values of 0.01 or more, and EQ VAS score of 3 or more at 180 days with 12 mg versus 6 mg. As described, uncertainty remains, some of which may be reduced with the results of a planned prospective meta-analysis of trials assessing higher versus standard dose dexamethasone in patients with COVID-19 [21] and those of the higher dose dexamethasone domain in the Randomised Evaluation of COVID-19 Therapy (RECOVERY) trial [22].

As corticosteroids are cheap, easily available and recommended for patients with COVID-19 and hypoxaemia, even a small difference in mortality or other patient-important outcomes may result in important clinical and health economic benefits at the population level. Whilst our data do not provide unequivocal evidence that dexamethasone 12 mg is superior to 6 mg, the adjusted absolute difference in mortality at 28, 90 and 180 days was consistently 4–5 percentage points lower in the dexamethasone 12 mg group than in the 6 mg group without suggestions of harm (i.e. no increased serious adverse effects or reduced HRQoL) [4]. Whether this also applies to patients who receive IL-6-RA is less certain, because of the low use of IL-6-RA in our trial patients.

In conclusion, dexamethasone 12 mg as compared with 6 mg did not result in statistically significant improvements in mortality or HRQoL at 180 days in patients with COVID-19 and severe hypoxaemia, but the results were most compatible with benefit from the higher dose.