Introduction

Falls during hospital admission are serious adverse events [1]. Approximately 3% of all patients and 5.9–6.4% of patients 70 years and over fall during their hospital stay [2, 3]. Fall-related injuries occur in 26–42% of in-hospital falls [4, 5]. The length of stay and costs of patients with a fall-related injury were more than double those of non-fallers [6]. In patients (≥ 65 years) treated in a hospital for a fall-related problem, more than 25% experienced another fall within a year [7]. These recurrent falls (2 or more per year) are associated with serious injuries [8]. Prevention of in-hospital falls is a safety goal in many hospitals. The first step is to identify patients at risk of falling by screening all hospitalized patients. Subsequently, the Dutch falls guideline advises starting personalized multifactorial interventions based on an individual fall-risk assessment [9]. In multifactorial interventions, the treatment team starts two or more interventions [10]. A tool that accurately predicts fall risk is of interest to healthcare workers and hospital management. Specifically for older patients, complexities such as multimorbidity and cognitive decline challenge fall prevention strategies. Several tools for fall-risk prediction have been developed. For older inpatients, the accuracy of these tools leaves room for improvement and there is a need for better fall prediction tools [11,12,13].

The Johns Hopkins Fall Risk Assessment Tool (JHFRAT) for predicting in-hospital fall risk was published in 2005 [14]. A team developed a scoring model based on literature, tested this in potential patient scenarios and adjusted the model based on group consensus [14]. In 2007, the tool was revised based on literature and experiences [15]. However, the performance of the JHFRAT in older inpatients is currently largely unknown. To assess a prognostic tool, discrimination (how well JHFRAT discriminates between non-fallers and fallers) and calibration (agreement between the predicted probability of falling and the actual fall probability) are important aspects [16, 17]. Previously published studies (investigating JHFRAT performance, but not specifically in older patients), conducted in Korea, the US and Brazil, did not report calibration [18,19,20]. In external validation, the prognostic tool is evaluated in new data (different from the original study sample) to determine whether the performance of the tool is sufficient in this new set of patients [16].

Hospital populations and healthcare needs change over time. For example, more complex patients are admitted to academic medical centers. Specific periods such as seasonal influenza and COVID-19 lead to changing care demands [21,22,23,24]. Those changes could potentially impact JHFRAT scores, the percentage of completed JHFRAT and the actual occurrence of falls. Up to now, none of the published studies assessed potential time-related effects on the prognostic performance of JHFRAT.

Therefore, the objective of our study was to assess the performance of JHFRAT in a population of older Dutch hospitalized patients using a large electronic health records (EHR) retrospective cohort. We studied the association of JHFRAT with inpatient falls, and the performance in test accuracy, discrimination, and calibration. We also assessed the potential time-related effects including changes in population/healthcare needs due to seasonal influenza, COVID-19, spring, summer, fall or winter.

Methods

We followed the TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) reporting statement [25].

Ethics approval

The Medical Ethics Review Committee of our hospital reviewed the study plan (reference number W18_027#18.043) and decided that according to the Medical Research Involving Human Subjects Act (WMO) approval was not needed.

Study population and data collection

The setting was a university medical center with 1002 beds in the Netherlands (Amsterdam). Data were extracted from the EHR and included ≥ 24 h admissions between 2016 and 2021 of patients ≥ 70 years with ≥ 1 JHFRAT score. The first JHFRAT score was used in the analyses. Patients who fell during admission and only had a JHFRAT score after the fall were excluded. We extracted medical diagnoses, age, admission/discharge dates, gender, medication administrations, daily functioning, JHFRAT scores, problem list and free text.

JHFRAT

The fall-risk protocol in our hospital requires using JHFRAT within 24 h of hospital admission (patients ≥ 18 years), with the exception of the intensive care unit. The 2007 JHFRAT instrument contains 7 categories: age (0–3 points); fall history (0 or 5); mobility (0, 2, 4 or 6); cognition (0–7); elimination/bowel/urine (0, 2 or 4); patient care equipment (0–3); and medication (0, 3, 5 or 7) [15]. Fall history includes “A fall in the past 6 months”: yes (5 points) or no (0 points). For mobility, the assessment is done with three criteria: assisted (2 points), unsteady (2 points) and/or impaired (2 points). For cognition, assessment is conducted based on the following categories: altered awareness (1 point), impulsive behavior (2 points), and/or lacks understanding (4 points). For patient care equipment, assessment is done by counting 1, 2, or ≥ 3 lines, drainages or catheters. The highest possible score is 35 points. Patients with scores > 13 are rated as high risk of falling, scores of 6–13 as medium risk and scores < 6 as low risk. The interrater reliability of the Chinese translation of JHFRAT was 97.14% in 20 older inpatients [26]. A comprehensive study in 2018 used the intraclass correlation coefficient (ICC) to measure interrater agreement and found an ICC of 0.78 both at time 1 (1615 patients) and time 2 (1275 patients) [20].
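
The scoring rules above can be sketched in code. The following is a minimal illustration, not the hospital's EHR implementation: the class and field names are hypothetical, the point values and cut-offs are those listed in the preceding paragraph, and the age points are taken as an input because the age bands are not spelled out here.

```python
from dataclasses import dataclass


@dataclass
class JhfratItems:
    """Hypothetical container for one JHFRAT assessment."""
    age_points: int              # 0-3, assigned by the instrument's age bands
    fall_in_past_6_months: bool  # 5 points if True
    mobility_criteria_met: int   # how many of: assisted, unsteady, impaired (0-3)
    altered_awareness: bool      # 1 point
    impulsive_behavior: bool     # 2 points
    lacks_understanding: bool    # 4 points
    elimination_points: int      # 0, 2 or 4
    equipment_count: int         # number of lines, drainages or catheters
    medication_points: int       # 0, 3, 5 or 7


def total_score(items: JhfratItems) -> int:
    """Sum the seven category scores (maximum 35)."""
    fall_history = 5 if items.fall_in_past_6_months else 0
    mobility = 2 * items.mobility_criteria_met
    cognition = (1 * items.altered_awareness
                 + 2 * items.impulsive_behavior
                 + 4 * items.lacks_understanding)
    equipment = min(items.equipment_count, 3)  # 1, 2, or >= 3 items
    return (items.age_points + fall_history + mobility + cognition
            + items.elimination_points + equipment + items.medication_points)


def risk_category(score: int) -> str:
    """Map a total score to the risk groups used in this study."""
    if score > 13:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


# Illustrative patient: age score 2, recent fall, assisted mobility,
# one catheter, medication score 3.
example = JhfratItems(2, True, 1, False, False, False, 0, 1, 3)
print(total_score(example), risk_category(total_score(example)))
```

In this illustrative case the fall history alone contributes 5 of the points, which is nearly enough to move a patient out of the low-risk group.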

Falls

In-hospital falls (with date and time) were extracted from free-text data and the problem list. Similar to our previous study, we created a regular expression using Dutch terms and synonyms for falls to identify falls in the problem list [27]. The problem list is a list within the EHR, compiled by the attending medical doctors, containing diagnoses and other problems affecting the patient’s health. To identify falls in the free-text data, we used the program CTcue version 2.0.10 and made two search queries. Free-text data is unstructured data derived from what doctors, nurses and other healthcare professionals write in the clinical notes or other documents in the EHR. Patients identified by CTcue were manually reviewed by D.M., K.R., and D.S. to select admissions with a fall. In case of doubt, patients were also reviewed by B.D. We used the WHO falls definition: “A fall is an event which results in a person coming to rest inadvertently on the ground or floor or other lower level” [28]. The free-text searches and regular expressions are provided in the appendix of our previous study [27].
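
To illustrate the idea of a regular-expression search for falls, a minimal sketch follows. The pattern and the Dutch terms in it are hypothetical examples chosen for illustration; they are not the expression used in this study, which is provided in the appendix of the previous study [27].

```python
import re

# Hypothetical, deliberately short list of Dutch fall terms (an assumption):
# "gevallen" (fallen), "valpartij" (fall incident), "val" (fall),
# "ten val gekomen" (came to fall).
FALL_PATTERN = re.compile(
    r"\b(gevallen|valpartij|ten val gekomen|val)\b",
    re.IGNORECASE,
)


def mentions_fall(text: str) -> bool:
    """Return True if the text contains one of the illustrative fall terms."""
    return FALL_PATTERN.search(text) is not None


print(mentions_fall("Patiënt is vannacht gevallen naast het bed"))  # fall term present
print(mentions_fall("Controle na interval van 6 weken"))            # "interval" is not a match
```

A match only flags a candidate: negations ("niet gevallen") and ambiguous words still match, which is one reason a manual review step such as the one described above remains necessary.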

Statistical analysis

Prevalence of inpatient falls was expressed as the percentage of admissions with ≥ 1 fall during a hospital stay. We used univariable logistic regression to investigate the association of JHFRAT total score with inpatient falls (yes/no). As the total score is a continuous independent variable, we checked that it had a monotonically increasing relationship with falls. We also used univariable logistic regression to estimate the association between each individual JHFRAT subcategory (except for age, as all patients were ≥ 70 years with a score of 2) and inpatient falls. To adjust for patients who were admitted multiple times, we used generalized estimating equations in all logistic regression analyses.

We calculated the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of JHFRAT for two cut-off values: (1) > 13 (high risk) and (2) > 5 (medium or high risk). Our hospital uses the 2007 version of the JHFRAT, except that patients ≥ 80 years receive 2 rather than 3 points. Therefore, we also conducted a sensitivity analysis wherein patients ≥ 80 years were given an additional point. We applied the Wilcoxon rank sum test for differences in median JHFRAT scores between admissions with and without a fall. Subsequently, we used the area under the receiver operating characteristic curve (ROC_AUC) to quantify discrimination. Furthermore, we plotted the precision-recall curve, which shows the PPV for different sensitivity thresholds [29]. We also calculated the area under this curve (PR_AUC), which is a relevant but less commonly applied statistic to assess the performance of a prognostic model. For calibration of JHFRAT, we plotted the predicted probability of falling per admission against the actual probability of falls.
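
As an illustration of these measures, the sketch below computes test accuracy at a cut-off and the ROC_AUC from paired scores and outcomes. It is not the R code used for the analyses; the function names and the toy data are assumptions, and the AUC is computed through its rank interpretation (the probability that a randomly chosen admission with a fall has a higher score than one without, counting ties as one half).

```python
def accuracy_at_cutoff(scores, fell, cutoff):
    """Sensitivity, specificity, PPV and NPV for 'score > cutoff'."""
    tp = fp = fn = tn = 0
    for score, had_fall in zip(scores, fell):
        flagged = score > cutoff
        if flagged and had_fall:
            tp += 1
        elif flagged and not had_fall:
            fp += 1
        elif not flagged and had_fall:
            fn += 1
        else:
            tn += 1
    # Assumes every denominator is non-zero, which holds for the toy data.
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def roc_auc(scores, fell):
    """AUC as the share of (faller, non-faller) pairs ranked correctly."""
    fallers = [s for s, f in zip(scores, fell) if f]
    non_fallers = [s for s, f in zip(scores, fell) if not f]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in fallers
        for b in non_fallers
    )
    return wins / (len(fallers) * len(non_fallers))


# Toy data (invented for illustration, not study data).
toy_scores = [2, 4, 6, 8, 14, 16]
toy_fell = [False, False, True, False, False, True]

print(accuracy_at_cutoff(toy_scores, toy_fell, cutoff=5))
print(roc_auc(toy_scores, toy_fell))
```

The pairwise loop is quadratic in the number of admissions; for a dataset of the size used here a rank-based formula or a statistics library would be the more practical choice.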

To assess potential time-related effects, we calculated the ROC_AUC per 6 months (next to the ROC_AUC over 5 years). The ROC_AUCs per 6 months were plotted to investigate differences over time. To assess the potential effect of seasonal influenza, COVID-19 periods or of the four seasons in the Netherlands, we conducted six sensitivity analyses: (1) We compared the logistic model (JHFRAT and falls) to the logistic model (JHFRAT and falls) adjusted for seasonal influenza periods (“yes” or “no”), as published by the national institute for public health and the environment, using ANOVA [30]. (2) We repeated the same sensitivity analysis for admissions during the COVID-19 period. (3, 4, 5, and 6) We repeated the same sensitivity analyses for admissions during one of the four seasons (spring, summer, fall or winter). The months of the seasons were selected following the meteorological seasons used by the Royal Netherlands Meteorological Institute [31].
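
The per-period discrimination analysis can be sketched as follows. This is an illustration under assumptions rather than the study code: the record layout (admission date, JHFRAT score, fall yes/no) and the function names are hypothetical, and half-years are taken as January to June and July to December.

```python
from collections import defaultdict
from datetime import date


def roc_auc(pairs):
    """AUC from (score, fell) pairs; None if a group has no fallers or no non-fallers."""
    fallers = [s for s, f in pairs if f]
    non_fallers = [s for s, f in pairs if not f]
    if not fallers or not non_fallers:
        return None
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in fallers
        for b in non_fallers
    )
    return wins / (len(fallers) * len(non_fallers))


def half_year(day):
    """Label a date as (year, 1) for Jan-Jun or (year, 2) for Jul-Dec."""
    return (day.year, 1 if day.month <= 6 else 2)


def auc_per_half_year(records):
    """Group (admission_date, score, fell) records by half-year and compute the AUC of each."""
    groups = defaultdict(list)
    for admitted, score, fell in records:
        groups[half_year(admitted)].append((score, fell))
    return {period: roc_auc(groups[period]) for period in sorted(groups)}


# Toy records (invented for illustration, not study data).
toy_records = [
    (date(2016, 2, 1), 3, False),
    (date(2016, 3, 1), 9, True),
    (date(2016, 5, 1), 5, False),
    (date(2016, 8, 1), 7, True),
    (date(2016, 9, 1), 7, False),
    (date(2016, 11, 1), 2, False),
]
print(auc_per_half_year(toy_records))
```

With only a few dozen falls per half-year, each period's AUC is estimated from a small number of events, so some variation between periods is expected from chance alone.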

For all analyses, we used R (version 1.3.1093). A p-value of < 0.05 was considered significant.

Results

Characteristics

In the period of 2016–2021 (5 years), 21,406 hospital admissions (≥ 24 h) were included in the database. Of these admissions, 4143 were excluded (no (completed) JHFRAT or JHFRAT after fall). Percentages of admissions with a JHFRAT varied between 74 and 89% per month. Table 1 shows the characteristics of the 17,263 admissions (11,947 unique patients) with ≥ 1 JHFRAT and includes patients between 70 and 102 years of age. A fall occurred in 2.5% (427) of the admissions; 404 falls were found using free text and 68 using the problem list.

Table 1 Patient characteristics and JHFRAT scores

Median time to first fall was 6.2 days (IQR = 2.6–14.2) and median time to JHFRAT was about 3 h (IQR = 1–20 h). When comparing the total population (n = 21,406) with the included population (n = 17,263), more patients died in the total population (5% versus 3.4%), median length of stay was shorter in the total population (4.0 versus 4.8 days), and gender and age were similar.

Association of JHFRAT with inpatient falls

The JHFRAT total score varied from 2 to 29. The association of JHFRAT with inpatient falls was OR = 1.11 (1.03–1.20) for a 1-point JHFRAT increase. All subcategories of JHFRAT were significantly associated with falls (Table 2). Admissions with high fall-risk scores (> 13) had OR = 4.5 (4.2–4.9) and admissions with medium fall-risk scores OR = 2.4 (2.2–2.6) in comparison with admissions with low fall-risk scores.
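
To help interpret the per-point odds ratio, the short sketch below shows how it compounds over larger score differences. It assumes the log-odds are linear in the total score, as in the univariable logistic model; the function name and the 5-point example are illustrative choices, and the result is not an additional study finding.

```python
def odds_ratio_for_difference(or_per_point, points):
    """Odds ratio implied by a score difference, given the OR for one point."""
    return or_per_point ** points


# A 5-point higher JHFRAT score with OR = 1.11 per point.
print(round(odds_ratio_for_difference(1.11, 5), 2))
```

This value should not be compared directly with the odds ratios for the risk categories, which come from a separate model with the score grouped into low, medium and high.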

Table 2 Association of JHFRAT (sub)scores with inpatient falls using univariable regression analysis

Test accuracy

For high fall risk (> 13), JHFRAT had a sensitivity of 25%, specificity of 89%, PPV of 6% and NPV of 98% (Supplementary table 1). For medium fall risk (> 5), JHFRAT had a sensitivity of 73%, specificity of 51%, PPV of 4% and NPV of 99%. Our sensitivity analysis (using the 2007 JHFRAT as intended, with patients ≥ 80 years receiving 3 points) for JHFRAT > 5 gave a sensitivity of 74%, specificity of 49%, PPV of 4% and NPV of 99%.

Discrimination

The median JHFRAT score in the group with inpatient falls was 9 points, and in the group without falls 5 points (Table 1). These groups were significantly different (p < 0.001). Figure 1 shows the ROC curve and precision-recall curve. The ROC_AUC was 0.67 and PR_AUC was 0.049. Figure 1b shows the PPV for each sensitivity and three dots give Sn = 0.30, PPV = 0.05 (dot 1), Sn = 0.56, PPV = 0.04 (dot 2), Sn = 0.79, PPV = 0.03 (dot 3).

Fig. 1

a Receiver operating characteristic (ROC) curve. b Precision (PPV)-recall (sensitivity) curve with yellow dot 1 (Sn = 0.30, PPV = 0.05), dot 2 (Sn = 0.56, PPV = 0.04), dot 3 (Sn = 0.79, PPV = 0.03)

Calibration

Figure 2 shows the calibration plot with predicted probability versus the actual probability of inpatient falls. The (yellow) dotted line represents the perfect model performance. The black line represents the actual calibration plot. Over-prediction occurs when the (black) solid line is under the yellow-dotted line. Most admissions (16,919) had a predicted probability below 7.5%.
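
The comparison behind a calibration plot can be illustrated with a grouped version: admissions are sorted by predicted probability, split into equally sized groups, and the mean predicted probability of each group is set against the observed fall proportion. This is a simplified sketch with hypothetical names and toy data; it is an assumption that grouping is an acceptable stand-in here, since the curve in Fig. 2 may have been produced with a different (for example smoothed) method.

```python
import math


def calibration_table(predicted, fell, n_bins=10):
    """Return (mean predicted, observed proportion, group size) per group."""
    pairs = sorted(zip(predicted, fell))
    size = math.ceil(len(pairs) / n_bins)
    rows = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(1 for _, f in chunk if f) / len(chunk)
        rows.append((mean_predicted, observed, len(chunk)))
    return rows


# Toy data (invented for illustration, not study data).
toy_predicted = [0.01, 0.02, 0.03, 0.04, 0.10, 0.20, 0.30, 0.40]
toy_fell = [False, False, False, False, False, False, True, False]

for mean_predicted, observed, n in calibration_table(toy_predicted, toy_fell, n_bins=2):
    # Over-prediction: the model predicts more falls than were observed.
    label = "over-prediction" if mean_predicted > observed else "no over-prediction"
    print(f"predicted={mean_predicted:.3f} observed={observed:.3f} n={n} {label}")
```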

Fig. 2

Calibration plot “predicted probability versus the actual probability”

Potential time-related effects

The ROC_AUC (per 6 months) varied between 0.62 and 0.71 (Fig. 3). The model (JHFRAT and falls) with covariate seasonal influenza was significantly (p = 0.02) different from the model without covariate. The model with covariate COVID-19 was not significantly (p = 0.11) different from the model without covariate. The models with covariate spring (p = 0.10), summer (p = 0.17), fall (p = 0.14), or winter (p = 0.24) were not significantly different from the models without covariate.

Fig. 3

Area under the receiver operating characteristic curve (ROC_AUC) calculated per 6 months, plotted over time

Discussion

Our study examined the performance of JHFRAT in a large population of older hospitalized Dutch patients. The results show that JHFRAT and subcategories were significantly associated with in-hospital falls. The overall AUC of JHFRAT was low and calibration of JHFRAT showed over-prediction. Due to the study period of 5 years, we were able to assess the AUC over time, which varied between 0.62 and 0.71. Our results did show an effect of seasonal influenza periods on the association between JHFRAT and falls, but did not show an effect of COVID-19, spring, summer, fall or winter periods.

Strengths of our study were the use of a large EHR dataset with hospital admission data over 5 years. We used a comprehensive set of performance measures and were, therefore, able to report different aspects of JHFRAT performance. We found that the JHFRAT total score and six subcategories were associated with falls. The findings were similar for 4/6 subcategories in another study [32]. That study did not find an association between elimination problems or fall history and falls [32]. The differences with our study could be related to differences in the population (≥ 18 years vs ≥ 70 years), geography (Korea vs The Netherlands), study design (case–control 1:4 vs no case–control), falls detection system (adverse event reporting system vs detection through problem list and free text) or sample size (1050 patients vs 17,263 admissions). The positive association of JHFRAT subcategories with falls implies these categories are important predictors for in-hospital falls but this does not imply causality.

We found a sensitivity of 73% and a specificity of 51% for JHFRAT > 5 (medium/high fall risk) which lies within the broad ranges of sensitivity (46–87%) and specificity (28–71%) reported in previous studies [20, 32]. These studies used a case–control study design and included a smaller sample size (~ 1000, vs ~ 17,000) compared to our study. Ideally, the sensitivity of a fall-risk tool would be higher and more patients with a fall would be included in the medium/high-risk group without a substantial reduction in specificity.

The low PPV reported in our study is similar to the PPV of 4% found by Klinkenberg et al. [19]. That study, conducted in the US, also used a retrospective chart review, but with 1.5% falls (our study: 2.5% falls). The PPV reports the percentage of patients with a high fall risk who actually did fall during their hospital stay [33]. The PPVs reported by other studies were much higher (PPV = 34–55% for JHFRAT > 13 and PPV = 21–36% for JHFRAT > 5) compared to our PPVs of 6% and 4% [18, 20, 32, 34]. Those studies used a case–control design of 1:4 or 1:2 or reported a very high fall incidence of 20%. This explains the high PPV, as the PPV depends on the prevalence of the outcome [33]. The PPV in a case–control study cannot be used to estimate the PPV in practice. The very low PPV is problematic for JHFRAT use in clinical practice as it limits the efficiency of fall prevention strategies.
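
The dependence of the PPV on prevalence follows from Bayes' theorem and can be checked with a few lines of arithmetic. The sketch below uses the rounded sensitivity, specificity and fall prevalence reported in this study; applying the same sensitivity and specificity to a 20% prevalence (as in a 1:4 case–control sample) is a hypothetical illustration and does not reproduce the values of the other studies.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from test accuracy and outcome prevalence."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# JHFRAT > 5 at the observed fall prevalence of 2.5%.
print(round(ppv(0.73, 0.51, 0.025), 3))
# JHFRAT > 13 at the observed fall prevalence of 2.5%.
print(round(ppv(0.25, 0.89, 0.025), 3))
# Same accuracy as JHFRAT > 5, but with 20% fallers (1 case per 4 controls).
print(round(ppv(0.73, 0.51, 0.20), 3))
```

With unchanged sensitivity and specificity, raising the share of fallers from 2.5% to 20% lifts the PPV from about 4% to about 27%, which is within the range reported by the case–control studies.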

For interpretation of the AUC, a rough guide can be used with 0.6–0.7 indicating “poor”, 0.7–0.8 “fair”, 0.8–0.9 “good” and above 0.9 “excellent” discrimination between admissions with and without falls [35, 36]. Fall prediction models are not expected to have an excellent AUC. A fall is a result of the culmination of complex phenomena at the individual (micro), social (meso) and environmental (macro) levels, even in a more controlled environment like a hospital. Most of these predictors are unknown and very hard to collect at any of these levels. For example, we do not know all lifestyle factors, the interaction with the nurses or the condition of the patient's bed. Although the AUC is not likely to be very large, it is still useful for interpreting the extent to which a tool discriminates between fallers and non-fallers. Of the previous studies, one reported a lower AUC (0.58) and the other studies had a higher AUC (0.69, 0.70, and 0.71) compared to our AUC (0.67) [18,19,20, 32]. One of these studies conducted a sub-analysis for patients ≥ 65 years (with an unknown sample size of older patients; the sample size of patients ≥ 18 years was 1050) reporting a lower AUC (0.61) in this subpopulation [32]. Differences in the AUC might be explained by differences in the timing of performing the JHFRAT (first score during admission, last score before the fall/before discharge). For example, in the study of Hur et al., the AUC of the first fall risk score was 0.63 and the AUC of the last fall risk score was 0.70 [32]. We conducted an extra sensitivity analysis to assess the effect of JHFRAT timing. In total, 14,323 admissions had a JHFRAT within the first 24 h of admission. The AUC (0.66) in this group was slightly lower than the AUC (0.67) in all patients with a JHFRAT.
Other differences between our study and previously published studies were differences in sample size, age, fall incidence (1–20%), and study design (prospective and retrospective cohorts or case–control) [18,19,20, 32]. We found the lowest AUC (0.62) in the period from July to December 2018 with 1732 admissions and a fall prevalence of 3.2%. The highest AUC (0.71) was found in the period from July to December 2016 with 1712 admissions and a fall prevalence of 2.8%. Except for fall prevalence (both higher than the overall fall prevalence of 2.5%), we did not find differences in patient characteristics between the groups. The differences might be due to chance or to situations or characteristics not included in our dataset, for example factors at the previously mentioned micro, meso and/or macro levels.

Seasonal influenza did affect the association of JHFRAT and falls. The seasonal influenza periods had a higher fall prevalence (2.8%) and more patients with a medium/high fall risk (53%), compared to the non-seasonal influenza periods (2.3% and 48%). More research is needed to assess the potential effect of seasonal influenza on falls and on the performance of fall prediction tools.

We identified falls using not only the problem list but also free-text searches. The fall prevalence in our study period was 2.5%. This percentage is lower than the ~ 6% fall prevalence observed in previous studies [2, 3]. This might be because the mean age of the population in these studies was higher than in our population. It might also be because falls were not documented in the EHR (or only the more serious falls were), or because our free-text search did not detect all fall incidents. We found 95% of the falls using free text. This is a high percentage compared to the 34% in the study of Baus et al. and slightly lower than the 100% in the study of Toyabe, and it emphasizes the need to use multiple sources including free text to identify falls in a real-life setting [37, 38].

Limitations of our study are related to the retrospective study design. JHFRAT was used in practice and clinicians might have started interventions to reduce the risk of falls. This might have biased the results, as in the study of Klinkenberg et al. [19]. In our hospital, JHFRAT is mandatory and departments receive feedback on the percentage of patients with a JHFRAT. However, in our previous (qualitative) study in the same hospital, we found that interventions to prevent falls are mostly not personalized and not based on JHFRAT due to a lack of awareness and motivation [39]. According to the Dutch falls guideline, several fall-risk-decreasing interventions should be considered based on the individual fall risk, including a mobility program, medication review, nutrition and knowledge transfer [9]. In our previous study, we found that for most older patients the physiotherapist is consulted for a mobility program [39]. The mobility programs could have influenced the fall incidence in our study population. However, this consult was not based on the JHFRAT. Based on our previous study and the experience of the involved geriatrician with falls specialization (co-author NV) working in the same hospital as our study, we concluded that JHFRAT had little influence on fall-prevention actions during the study period. Participants of our previous study preferred electronic decision support for preventive intervention based on JHFRAT to improve the use of JHFRAT [39]. Other limitations include that we manually extracted the date and time of the falls because the program CTcue that was authorized for use at our hospital does not support data extraction. We did not know whether falls were with or without injuries. The study of Hnizdo et al., conducted in home health services, implies that JHFRAT might perform better in identifying patients with a high risk for injurious falls [40].
Furthermore, we did not know whether patients fell multiple times during their hospital stay. Reasons for admission, including falls, were not included in our dataset. Furthermore, we found that JHFRAT was not reported for all admitted patients. In our previous study, nurses stated this was because of a lack of time or other priorities [39]. To assess if our population (17,263) was (locally) representative, we conducted a sensitivity analysis on all admissions (21,285 admissions, excluding 121 admissions with a JHFRAT registered more than 1 h before admission). For this analysis, we imputed the missing JHFRAT scores by Multivariate Imputation by Chained Equations using the mice package in R (with three imputations). The AUC (0.67) in this sensitivity analysis was similar to the AUC in our main analysis indicating that the population of 17,263 was well-chosen.

The results of our study increase knowledge on the performance of JHFRAT in older patients. Currently, nurses manually screen all patients but they prefer that information already known would be automatically filled in [39]. Administrative and clinical data offer opportunities for automatic fall-risk assessment without manual data entry. Future research might focus on developing a prediction model using data already available in the EHR and/or requiring minimal additional data collection. Developed prediction models show promising initial results [41,42,43,44]. A model that re-uses EHR data would diminish the workload of nurses and make this work process more efficient.

JHFRAT was associated with in-hospital falls in a large Dutch observational EHR dataset of older patients. The discrimination between fallers and non-fallers was low, but stable over time. The calibration of JHFRAT showed over-prediction. Improvements in fall-risk assessment are necessary to improve the efficiency of the workflow for nurses. Future research can focus on developing fall-risk prognostic models using EHR data or on improving the JHFRAT by automatically filling in already-known information.