Introduction

Acute exacerbation of chronic obstructive pulmonary disease (AECOPD) refers to an acute worsening of respiratory symptoms in patients with COPD. It is the leading cause of hospitalization and medical expenditure among COPD patients [1,2,3] and a major cause of mortality, readmission and poor quality of life worldwide [4, 5]. In recent years, there has been growing interest in understanding the outcomes of AECOPD and improving the management of these exacerbations. Identifying risk factors and predicting the outcomes of AECOPD patients is clinically useful for guiding early intervention.

Studies of risk factors for outcomes in AECOPD patients have found that male sex, comorbidities, smoking status, the number of acute exacerbations in the previous year and abnormal laboratory findings (such as lower blood eosinophils) are associated with poor outcomes [6,7,8,9]. More recently, some studies have reported that newer predictors, namely a higher neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR) and lymphocyte-to-monocyte ratio (LMR), are also associated with outcomes of AECOPD patients [10, 11].

Besides identifying risk factors, a crucial aspect of studying AECOPD outcomes is estimating survival among affected individuals. Predicting survival as early as admission helps to identify patients at high risk of a poor outcome. An increasing number of studies have focused on predicting the risk of exacerbations in chronic obstructive pulmonary disease (COPD) [12, 13], and some have focused on predicting the survival probability of AECOPD patients [14]. Survival probability is a critical measure of the severity and impact of these exacerbations, and studying it can help healthcare providers assess the effectiveness of interventions, optimize treatment strategies and ultimately improve patient outcomes. However, most of these studies examined only a single predictor, lacked validation or had small sample sizes. Effective models for predicting the survival probability of AECOPD patients are still lacking, and the value of newer predictors such as NLR remains to be verified. Validated models based on multiple variables and large samples are needed.

A nomogram is built on a multivariate regression analysis of multiple indicators. Each indicator is represented by a scaled line segment with scores, so that a clinical outcome or the probability of an event can be predicted from the values of multiple variables. Nomograms are now widely used in medical research [15, 16]. We aimed to establish a nomogram containing multiple variables, based on a large sample, to predict the survival probability of AECOPD patients.
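As a rough illustration of how regression coefficients can be turned into the scored line segments of a nomogram, the Python sketch below rescales each predictor's contribution so that the predictor with the widest effect spans 0 to 100 points, which is a common convention. The function name, the coefficients and the value ranges are hypothetical and are not the estimates of this study.

```python
def nomogram_points(coefs, ranges):
    """Map regression coefficients to a 0-100 points scale.

    coefs:  {name: beta}, coefficient of each predictor in the linear predictor
    ranges: {name: (low, high)}, plausible value range of each predictor
    Returns a function points(name, value).
    """
    # Width of each predictor's contribution to the linear predictor.
    spans = {k: abs(coefs[k] * (ranges[k][1] - ranges[k][0])) for k in coefs}
    max_span = max(spans.values())

    def points(name, value):
        low, high = ranges[name]
        beta = coefs[name]
        # The axis starts (0 points) at the end of the range with the lowest risk.
        start = low if beta >= 0 else high
        return 100.0 * beta * (value - start) / max_span

    return points


# Hypothetical log hazard ratios, NOT the study's estimates.
coefs = {"arrhythmia": 0.5, "imv": 1.0, "albumin": -0.1}
ranges = {"arrhythmia": (0, 1), "imv": (0, 1), "albumin": (15, 55)}
points = nomogram_points(coefs, ranges)

albumin_low_points = points("albumin", 15)   # lowest albumin gets the most points
imv_points = points("imv", 1)
```

In this toy setting albumin has the widest effect, so its axis runs over the full 0 to 100 range and the binary predictors receive proportionally fewer points.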

Materials and methods

Study population

Data of patients hospitalized for AECOPD from January 01, 2012 to December 31, 2022 were collected anonymously from the Biobank of the First Affiliated Hospital of **.

Based on the factors selected by LASSO-Cox regression, a multivariate Cox proportional hazards model was established to identify significant prognostic factors associated with survival of AECOPD patients and to estimate their hazard ratios (HR) and 95% confidence intervals (95% CI), which were visualized in a forest plot. Factors with prognostic significance in the multivariate Cox regression analysis were then used to establish a survival probability model, which was visualized as a nomogram. The model was fitted with the Cox proportional hazards functions of the “survival” package in R, and survival probabilities were then calculated at the time points of interest, which allowed the predictive ability of the model to be evaluated at different time points. Discrimination was evaluated in both the training and validation cohorts with time-dependent receiver operating characteristic (ROC) curves, the area under the ROC curve (AUC) and the C-index. The C-index ranges from 0.5 to 1.0, with 0.5 indicating random chance and values closer to 1.0 indicating better discrimination. Calibration of the nomogram was evaluated graphically with calibration plots in both cohorts. The performance of the model was also evaluated with 10-fold cross-validation in the training cohort. Finally, decision curve analysis (DCA) of the nomogram was used to show the net clinical benefit that could be achieved at different risk thresholds in the training and validation cohorts. All analyses were conducted using R software (version 4.3.2). A two-sided P-value < 0.05 was considered statistically significant.
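To make the C-index concrete, the following Python sketch computes Harrell's concordance index for right-censored data. It is only an illustration under simple assumptions (no special handling of tied event times); the analysis itself was run in R, and the function name and toy data here are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data.

    times: follow-up time; events: 1 = died, 0 = censored;
    risk_scores: higher value means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # each comparable pair is anchored on an observed death
        for j in range(n):
            if times[j] > times[i]:  # j was still under follow-up when i died
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Toy data: the patients who die earliest also have the highest predicted risk.
times = [2, 5, 7, 10]
events = [1, 1, 0, 0]
risk = [0.9, 0.6, 0.4, 0.1]
c_index = concordance_index(times, events, risk)           # perfectly ordered
c_reversed = concordance_index(times, events, risk[::-1])  # perfectly mis-ordered
```

A value of 0.5 corresponds to predictions that order patients no better than chance, which is why it serves as the reference point in the text.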

Results

Patient enrolment and establishment of training and validation cohorts

Of 8692 patients hospitalized with AECOPD, 4091 were excluded for the following reasons: (1) age < 40 years (n = 40); (2) asthma, interstitial lung disease, bronchiectasis, active tuberculosis, pulmonary embolism, lung malignancy or pleural effusion (n = 2454); (3) missing data (n = 1597). Details are shown in Fig. 1. A total of 4601 patients were finally enrolled, of whom 81 (1.77%) died and 4520 (98.23%) survived. In total, 2760 patients (60%) were randomly allocated to the training cohort and 1801 (40%) to the validation cohort. Most characteristics did not differ significantly between the training and validation cohorts (P > 0.05) (Table 1).
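The enrolment arithmetic and a reproducible random allocation can be sketched in Python as follows. The screening and exclusion counts are taken from the text; the function name, the seed and the ten toy patient IDs are hypothetical, and the actual allocation was performed in R.

```python
import random


def split_cohort(patient_ids, train_fraction=0.6, seed=2024):
    """Randomly allocate patients to training and validation cohorts."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # a fixed seed keeps the split reproducible
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Enrolment counts reported in the text.
screened = 8692
excluded = 40 + 2454 + 1597
enrolled = screened - excluded

# Toy example with 10 hypothetical patient IDs.
train_ids, valid_ids = split_cohort(range(10))
```

Every patient lands in exactly one cohort, so no patient contributes to both model development and validation.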

Fig. 1 Patient enrollment flowchart. Legends: Flowchart of study population inclusion. a Including patients with multiple coexisting diseases

Table 1 Characteristics of the study population in the training and validation cohorts

Clinical characteristics of overall enrolled AECOPD patients

The characteristics of the overall population are shown in Table 2. The mean age was 71.18 years, the majority of patients were male (3490; 76%), and 1273 (28%) developed respiratory failure during hospitalization. Heart failure was the most common comorbidity (1514; 41%), followed by hypertension (33%), coronary heart disease (28%), arrhythmia (18%), diabetes (12%), chronic liver disease (3.4%) and chronic kidney disease (2.5%). Laboratory findings at admission are detailed in Table 2. Regarding treatment, up to 90% of patients received oxygen therapy and antibiotics, more than half (56%) received systemic corticosteroids, 1158 (25%) were admitted to the intensive care unit (ICU), and 943 (20%) required mechanical ventilation (MV), of whom 783 (17%) required non-invasive mechanical ventilation (NIMV) and 160 (3.5%) required invasive mechanical ventilation (IMV).

Table 2 Baseline characteristics of the study population

Comparison between AECOPD patients who died and those who were discharged

The univariate comparison between patients who died and those who survived is shown in Table 2. The mean time from admission to death was 9.6 days. Patients who died tended to be older (74 vs. 71 years, p = 0.007), more often developed respiratory failure, shock and acute kidney injury during hospitalization (p < 0.001), and more often had coexisting arrhythmia, heart failure, hypertension and diabetes (p < 0.05). Their laboratory findings included lower lymphocyte and platelet counts (all p < 0.001), higher NLR (p < 0.001) and LMR (p = 0.011), prolonged activated partial thromboplastin time and prothrombin time (p < 0.005), and higher levels of lactate dehydrogenase, total bilirubin, alanine aminotransferase, aspartate aminotransferase, albumin, creatinine, blood urea nitrogen, creatine kinase and creatine kinase-MB (all p < 0.05). Unexpectedly, patients who died had lower eosinophil counts and eosinophil percentages (p < 0.001). Regarding treatment, patients who died were more often admitted to the ICU (70% vs. 24%, p < 0.001), more often required MV (81% vs. 19%, p < 0.001), including both NIMV (51% vs. 16%, p < 0.001) and IMV (31% vs. 3%, p < 0.001), and more often received systemic corticosteroids (p < 0.001).

Predictive factors of survival probability in AECOPD patients

Based on clinical relevance and previous studies, 47 variables from the training cohort (listed in Table 1), covering general characteristics, comorbidities, laboratory values and treatment, were selected as potential prognostic factors and entered into LASSO-Cox regression to screen for factors associated with the survival probability of AECOPD patients. Eight variables (coexisting arrhythmia, coexisting chronic kidney disease, oxygen therapy, IMV, systemic corticosteroids, antibiotics, hemoglobin and albumin) were retained at the optimal λ value of 0.07 (Supplementary Fig. S1). These 8 variables were then included in the multivariate Cox regression analysis, and their HRs (95% CI) are shown in a forest plot (Fig. 2). Coexisting arrhythmia, IMV usage and lower serum albumin values were significantly associated with lower survival probability of AECOPD patients.
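For readers unfamiliar with the screening step, LASSO-Cox regression can be written as a penalized partial likelihood problem. The notation below is an assumption introduced for illustration (it does not appear in the original text): x_i is the covariate vector of patient i, δ_i the death indicator, R(t_i) the set of patients still at risk at time t_i, and p the number of candidate variables (47 here). Software packages may scale the two terms differently.

```latex
\hat{\beta}(\lambda)
  = \arg\max_{\beta}
    \left\{ \ell(\beta) - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\},
\qquad
\ell(\beta)
  = \sum_{i:\,\delta_i = 1}
    \left[ x_i^{\top}\beta
           - \log \sum_{k \in R(t_i)} \exp\!\left( x_k^{\top}\beta \right) \right]
```

As λ increases, more coefficients are shrunk exactly to zero, so the variables with non-zero coefficients at the chosen λ are the ones carried forward to the multivariate Cox model.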

Fig. 2 Hazard ratios and 95% confidence intervals of 8 variables associated with AECOPD survival in the training cohort. Legends: Forest plot of the hazard ratios and 95% confidence intervals of 8 variables associated with AECOPD survival in the training cohort. N indicates the total number of patients in the training cohort. Coexisting arrhythmia, IMV usage and lower serum albumin values were significantly associated with lower survival probability of AECOPD patients

Nomogram establishment and validation

Coexisting arrhythmia, the need for IMV and serum albumin values were included in a model for predicting the 7-day, 14-day and 21-day survival probability of AECOPD patients. Figure 3 shows the nomogram of the model. To use it, each value of these factors is assigned a score on the points scale, and the total score is calculated by adding up these scores. As an example, consider a patient with arrhythmia who required IMV during hospitalization and had an admission serum albumin level of 30 g/L (Fig. 3, vertical red lines). The points for arrhythmia, IMV usage and serum albumin were 26, 39 and 70, respectively. The total of 135 points corresponds to 7-day, 14-day and 21-day survival probabilities of approximately 0.86, 0.70 and 0.57, respectively.
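The worked example above amounts to a simple sum, shown here as a short Python sketch. Only the point values and survival probabilities stated in the text are used; the variable and function names are hypothetical, and the full mapping from albumin values or total points to the axes of Fig. 3 is not reproduced.

```python
# Points read off the nomogram for the example patient described in the text.
example_points = {"arrhythmia": 26, "imv": 39, "albumin_30_g_per_l": 70}


def total_points(points_by_predictor):
    """Add the per-predictor points to obtain the Total Points value."""
    return sum(points_by_predictor.values())


total = total_points(example_points)

# Survival probabilities reported in the text for this total (read from the nomogram).
reported_survival = {"7-day": 0.86, "14-day": 0.70, "21-day": 0.57}
```

For any other patient, the per-predictor points and the corresponding survival probabilities have to be read from the nomogram itself.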

Fig. 3 Nomogram for predicting the survival probability of AECOPD patients based on the training cohort. Legends: The nomogram consists of 3 variables: arrhythmia, IMV and serum albumin values. To use the nomogram, the value of an individual patient is located on each variable axis. Lines and dots are drawn upward to determine the points received for each variable. The sum of these points is located on the Total Points axis. A line is drawn downward to the ‘7-day Survival Probability, 14-day Survival Probability, and 21-day Survival Probability’ axes to determine the survival probability of AECOPD patients. The unit of albumin is g/L

The discrimination and calibration of the nomogram in both the training and validation cohorts were evaluated by the C-index, AUC and calibration curves. The C-index of the nomogram was 0.816 in the training cohort and 0.814 in the validation cohort. The AUC in the training cohort was 0.825 for 7-day, 0.801 for 14-day and 0.825 for 21-day survival probability; in the validation cohort it was 0.796 for 7-day, 0.831 for 14-day and 0.841 for 21-day survival probability, indicating good discrimination (Fig. 4). The calibration curves showed excellent agreement between the nomogram-predicted and observed survival probability in both cohorts, indicating good calibration of the model (Fig. 5). DCA showed that the net clinical benefits achievable at different risk thresholds for 7-day, 14-day and 21-day survival were excellent in both the training and validation cohorts (Fig. 6).
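To illustrate what an AUC at a fixed time point measures, the Python sketch below uses a deliberately simplified approach: patients who died by the horizon are cases, patients known to be alive beyond it are controls, and patients censored earlier are left out. This is not the time-dependent ROC estimator used in the analysis, which was run in R, and proper estimators account for censoring rather than dropping those patients. The function name and toy data are hypothetical.

```python
def auc_at_horizon(times, events, risk_scores, horizon):
    """Naive AUC for death by `horizon`, skipping patients censored before it."""
    cases, controls = [], []
    for t, e, r in zip(times, events, risk_scores):
        if e and t <= horizon:
            cases.append(r)      # died on or before the horizon
        elif t > horizon:
            controls.append(r)   # known to be alive at the horizon
        # censored before the horizon: status unknown, skipped in this sketch
    if not cases or not controls:
        return float("nan")
    # Probability that a random case has a higher risk score than a random control.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy data in days; horizon of 7 days.
times = [3, 6, 10, 20, 25]
events = [1, 1, 0, 1, 0]
risk = [0.8, 0.3, 0.5, 0.4, 0.1]
auc_7_day = auc_at_horizon(times, events, risk, horizon=7)
```

Repeating the calculation at 14 and 21 days gives one AUC per horizon, which mirrors how the three values per cohort are reported.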

Fig. 4 ROC curve of the nomogram in the training and validation cohort. Legends: The ROC curve and AUC of the nomogram in the training (A) and validation (B) cohort of 7-day, 14-day and 21-day survival

In addition, we forced the inclusion of three factors known to be relevant to AECOPD prognosis: age, blood eosinophils and leukocytes. The predictive abilities of these newly included indicators were inferior to those of the original predictors. After adding them to the established model, the AUC in the training cohort was good, but the C-index and AUC decreased substantially in the validation cohort. The C-index of the nomogram based on age, arrhythmia, IMV, albumin, eosinophils and leukocytes was 0.719 in the training cohort and 0.708 in the validation cohort. The AUC in the training cohort was 0.871 for 7-day, 0.858 for 14-day and 0.851 for 21-day survival probability; in the validation cohort it was 0.779 for 7-day, 0.720 for 14-day and 0.788 for 21-day survival probability (Supplementary Table 1). Because age, eosinophils and leukocytes are nevertheless important for the prognosis of AECOPD patients, we also report this extended model: its predictive ability was evaluated with ROC curves (Supplementary Fig. S2), and a nomogram was established (Supplementary Fig. S3).

Discussion

Chronic obstructive pulmonary disease (COPD) is a disease state characterized by airflow limitation that is not fully reversible [17]. Patients with COPD have reduced lung function, which seriously affects their quality of life. AECOPD can lead to a further decline in lung function, accelerating disease progression and increasing the risk of death [4]. In addition, a review of eleven studies found that the estimated cost of an exacerbation varied widely, from 88 to 7757 US dollars [20]. Together, these findings underline the importance of identifying risk factors and predicting the outcomes of AECOPD patients in order to guide early intervention, improve outcomes and reduce the financial burden.

LASSO regression is a regularization method for regression problems that reduces model complexity, prevents overfitting and selects important variables. We used LASSO-Cox regression to screen for possible predictors. Subsequent multivariate Cox regression analysis showed that coexisting arrhythmia, IMV usage and lower serum albumin values were significantly associated with lower 7-day, 14-day and 21-day survival probability of AECOPD patients, and the model performed well as assessed by the C-index, AUC and calibration plots.

Coexisting diseases such as cardiovascular disease (CVD) are common in COPD patients [21, 22]. Consistent with previous studies, our univariate analysis found that patients who died more often had coexisting heart failure and hypertension. However, almost none of these studies included arrhythmia. One study found that COPD exacerbation is associated with a high prevalence of cardiac arrhythmias [23], and another found that patients with COPD are at significantly higher risk of refractory supraventricular arrhythmias. Nevertheless, few studies have investigated whether arrhythmias are associated with mortality among AECOPD patients, and their effect on outcomes has been neglected to some extent. In our study, arrhythmia was present in 18% of all patients and was found to be a risk factor for death in AECOPD. This calls for vigilance regarding arrhythmia in AECOPD patients, as early identification and intervention may help to improve prognosis.

Mechanical ventilation (MV) helps patients to overcome respiratory failure caused by underlying diseases and creates conditions for treating those diseases. Invasive mechanical ventilation (IMV) is the primary choice of treatment for 5.9–8.7% of AECOPD patients [24]. In our study, 943 (20%) patients received MV in total, and patients who died were 2.6-fold more likely to have received IMV than survivors. Patients who died were indeed in a worse condition, with more comorbidities and more frequent respiratory failure. Our findings suggest that patients who receive IMV are more likely to have a poorer prognosis than those who do not. The underlying poor condition of these patients should be improved early, and more attention should be paid to AECOPD patients receiving IMV to improve their prognosis.

Albumin is the most abundant protein in human plasma. It helps to maintain nutrition and osmotic pressure and serves as a biomarker of nutritional status. Malnutrition is a consequence of reduced nutritional intake and muscle loss. Previous studies have found that AECOPD patients who died had poorer nutritional status than survivors [25, 26]. Consistent with these studies, we found that patients who died had lower serum albumin levels, which also indicates poorer nutritional status. COPD is a chronic inflammatory lung disorder that is often combined with impaired digestion and absorption and high energy consumption, causing COPD patients to suffer from malnutrition, especially during acute exacerbations [27]. In AECOPD, respiratory symptoms worsen and the systemic inflammatory response is also aggravated, which contributes to a decrease in serum albumin levels [28, 29] and leads to poor patient outcomes. In addition, COPD is associated with CVD [29], and 1241 (27%) patients in our study had coexisting CVD. A variety of complex factors thus lead to poor nutrition in COPD patients. Clinically, more attention should be paid to the nutritional status of AECOPD patients, with the aims of improving nutritional status as early as possible, maintaining muscle mass and counteracting the systemic inflammatory response to improve outcomes.

The inflammatory response is aggravated in AECOPD, and values of inflammatory biomarkers (leukocytes, neutrophils and monocytes) are inevitably abnormal in these patients. Some studies have regarded NLR, PLR and LMR as indicators of the systemic inflammatory response and found that higher NLR, PLR and LMR were associated with outcomes of AECOPD patients [10, 11, 30], but these studies were limited by small sample sizes. In our univariate analysis, AECOPD patients who died had higher NLR and PLR, and lower LMR. NLR and LMR differed significantly between survivors and non-survivors; however, the association between LMR and outcome was contrary to that reported in previous studies. Overall, the inconsistent associations between outcomes of AECOPD patients and these indicators (PLR and LMR) need further study.

This study has some strengths. First, it is based on a large sample. Second, we used LASSO regression and multivariate Cox regression analysis to screen for possible predictors and risk factors, which helps to prevent overfitting. Third, our model uses 3 easily acquired predictors, which simplifies assessment and makes it valuable for clinical reference and use. Finally, we confirmed in a large sample that NLR and LMR were associated with poor outcomes of AECOPD patients.

This study has some limitations. Because this is a retrospective, single-center study, some biases were inevitable. Firstly, owing to the long time span covered by the data extracted from the database (2012.01.01-2022.12.31), data were missing for some patients admitted in the early years, and many serum indicators had more than 10% missing values. As multiple imputation was not feasible, we had to exclude the data of these patients. This may have affected the results, and we took it into account when analyzing and interpreting the findings. Future research should handle potential problems in data collection more carefully. Secondly, the low death rate raises concerns about the stability of the Cox regression model. To address this, we used cross-validation and resampling during model evaluation to assess the stability and generalizability of the model. Despite the limited number of outcome events, the model performed consistently well across evaluation metrics such as the C-index and AUC, suggesting that it can effectively discriminate between patients even with a low event rate. We also assessed the confidence intervals of the model to check the reliability and stability of the results. Lastly, some important records, such as smoking history and pulmonary function, which are closely related to outcomes of AECOPD patients, were not available.

Fig. 5 Calibration curve of the nomogram in the training and validation cohort. Legends: The calibration curve of the nomogram in the training cohort of 7-day (A), 14-day (C) and 21-day (E) survival, and validation cohort of 7-day (B), 14-day (D) and 21-day (F) survival. The overlap between solid and dashed lines in the line graph demonstrates the consistency between the nomogram-predicted 7-day, 14-day, and 21-day survival probabilities of AECOPD patients and the actual survival probabilities of AECOPD patients

Fig. 6 Decision curve analysis of the nomogram in the training and validation cohort. Legends: The DCA of the nomogram in the training cohort of 7-day (A), 14-day (B) and 21-day (C) survival, as well as in the validation cohort of 7-day (D), 14-day (E) and 21-day (F) survival. DCA depicted in the line graph illustrates the clinical net benefit achievable at various risk thresholds. The threshold range for DCA is determined based on the model’s sensitivity and specificity derived from the training and validation cohorts. Interventions are targeted towards patients within the threshold range to assess and manage risks effectively. The net benefit surpasses that of intervening for all patients or not intervening at all
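The net benefit plotted in a decision curve can be sketched in Python with the standard formula: true positives per patient minus false positives per patient weighted by the odds of the risk threshold. The code below is a minimal illustration for a binary outcome at a single time point; the function names and toy data are hypothetical, and the published curves were produced in R.

```python
def net_benefit(outcomes, predicted_risk, threshold):
    """Net benefit of intervening in patients whose predicted risk >= threshold."""
    n = len(outcomes)
    pairs = list(zip(outcomes, predicted_risk))
    tp = sum(1 for y, p in pairs if p >= threshold and y == 1)
    fp = sum(1 for y, p in pairs if p >= threshold and y == 0)
    # Odds of the threshold: how much a false positive costs relative to a true positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: intervene for every patient."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data: 1 = died by the time point, 0 = survived; risks are hypothetical.
outcomes = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
risks = [0.6, 0.1, 0.3, 0.4, 0.05, 0.2, 0.1, 0.05, 0.1, 0.02]
nb_model = net_benefit(outcomes, risks, threshold=0.25)
nb_all = net_benefit_treat_all(outcomes, threshold=0.25)
nb_none = 0.0  # intervening for no one has zero net benefit by definition
```

A model is considered useful at a given threshold when its net benefit exceeds both reference strategies, which is the comparison drawn in the figure legend.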

Conclusions

Coexisting arrhythmia and chronic kidney disease, lower hemoglobin and albumin values, and the need for oxygen therapy, systemic corticosteroids, antibiotics and IMV were associated with lower survival probability of AECOPD patients. We established a nomogram based on 3 predictors (coexisting arrhythmia, IMV usage and lower serum albumin values) to predict the survival probability of AECOPD patients, and the nomogram showed good performance.