Introduction

Stroke is a leading cause of death and long-term disability worldwide [1]. Data from the 2019 Global Burden of Disease (GBD) study [2] show that stroke remains the second leading cause of death (11.6% of deaths) and the third leading cause of disability (5.7% of total disability-adjusted life years) in the world. Hemorrhagic stroke (HS) accounts for 37.6% of all strokes and causes approximately 5.5 million deaths per year, about half of all stroke deaths. The risk of death is higher for HS than for ischemic stroke (IS) [3], with a 30-day mortality of 13-61% [4]. In recent years, a growing number of stroke patients have been admitted to the intensive care unit (ICU) for neurological monitoring or management of complications, and 10-30% of them are in critical condition [5]. Identifying and managing high-risk groups is therefore of great significance for optimizing the allocation of medical resources.

Predicting the occurrence of adverse outcomes is the prerequisite for risk stratification, and risk scores are helpful tools for such prediction. Many investigators have developed disease risk scoring systems. Traditional scoring systems commonly used in clinical practice include the acute physiology and chronic health evaluation II (APACHE II) [6], the sequential organ failure assessment (SOFA) [7], the Oxford acute severity of illness score (OASIS) [8], and the simplified acute physiology score II (SAPS II) [9], each of which assigns points to a set of variables according to its own scheme [10]. However, these traditional scores were designed for broad populations, and their performance in predicting the prognosis of specific diseases is not always satisfactory [11, 12], so their application in HS is limited. Many scholars have made efforts to construct predictive tools for HS. Ho and Smith et al. [13, 14] built prediction models of HS death in the ICU by logistic regression and stratified patients' risk by calculating risk scores. However, with the increasing number of clinical examinations and diagnostic items, clinical data are often multidimensional, highly correlated, and nonlinear [15], which limits the applicability of traditional clinical modeling methods such as logistic and Cox regression [16]. To compensate for the shortcomings of traditional analytical methods, machine learning algorithms have emerged in the era of big data [17]. Lin and Trevisi et al. [18, 19] employed common machine learning algorithms, such as support vector machine, random forest, and neural network, to predict poor functional outcomes in hospitalized HS patients. However, those studies considered only the probability of survival without incorporating the time dimension, so model predictions are often imprecise [19, 20].
Random survival forest (RSF) is an extension of the random forest algorithm to survival analysis, which can not only handle complex right-censored survival data but also analyze interactions between variables, and has been applied to pancreatic cancer [21], sepsis [22], and breast cancer [45], and Luo et al. showed a steep linear relationship between reduced blood creatine levels and increased risk of in-hospital and 1-year mortality in patients with intracranial hemorrhage when blood creatine values were < 1.9 mg/dL [46]. We found a 1.3-fold increase in the risk of in-hospital death in HS patients for each range of temperature change, which is consistent with previous studies [47]. Iglesias Rey et al. conducted a retrospective study of 887 patients with non-traumatic cerebral hemorrhage and found that patients with hypertensive cerebral hemorrhage had the highest body temperature and the greatest increase in body temperature within 24 h. Patients with hypertensive cerebral hemorrhage who developed hyperthermia after 3 months had a 5.3-fold increased risk of poor prognosis. Moreover, the amount of edema within 24 h was positively correlated with body temperature in patients with cerebral hemorrhage due to hypertension [48]. Anion gap reflects the acid-base balance in body fluids and plays an important role in the identification of metabolic acidosis [49]. Previous studies have shown that anion gap is an important short- and long-term prognostic marker in patients with IS [50]; however, its use in patients with HS has been less studied. Shen et al. found that, as the anion gap at admission increased, HS patients showed decreases in the mini-mental state examination, GCS, and other indicators of neurological and cognitive function [51]. A meta-analysis has shown that high sodium intake is positively associated with stroke risk, with a 23% increase in stroke risk for every 86 mmol/d increase in sodium intake [52]. Wang et al., who included 64,909 patients with non-traumatic HS in the United States, showed that patients with spontaneous cerebral hemorrhage and abnormal serum sodium had a 1.11-fold increased risk of 30-day readmission compared with patients with normal serum sodium [30].

This study has several strengths. We compared the predictive performance of RSF and Cox regression not only with each other but also with traditional clinical scoring systems. In addition, we identified the variables that had a strong influence on patients' deaths in the ICU and ranked them by importance, which may provide guidance for further practical applications. However, this study also has several limitations. First, the MIMIC-IV database is a single-center database, which may limit the applicability of the results to patients in other centers, so external validation with clinical data from multiple centers is desirable in the future. Second, owing to the limitations of the MIMIC-IV database, some important indicators such as bilirubin, lactate, and albumin could not be included in the analysis because of a high proportion of missing values. Finally, only demographic information, laboratory indicators, and comorbidity information were included in this study; important information such as medication and imaging tests was not included, which may have reduced the predictive performance of the models.

Conclusion

We constructed RSF and Cox models based on the survival data of patients with HS in the ICU. The results showed that RSF predicted 7-day and 28-day mortality better than Cox regression, with creatine, temperature, anion gap, and sodium ranking among the top 10 important variables in both models. RSF can provide new ideas for clinical decision-making in HS patients.