Background

High healthcare utilizers are a small group of patients who impose a disproportionately high burden on the healthcare system due to their elevated resource use, and often have unmet care needs or receive unnecessary care [1]. To design policies to address these issues, high healthcare utilization and its drivers have been studied extensively in recent years. The definition of high utilization has been heterogeneous. The choice of metric used to measure utilization often differs and depends on the disease or health service context. As distributions of healthcare cost and utilization incurred by patients are often skewed [2,3,4], the approach of defining high utilizers (HUs) as patients in the top percentiles of healthcare cost has been commonly adopted. Most studies use cost to identify HUs as it can be regarded as a measure of utilization intensity [5, 6]. It also gives a direct economic perspective (e.g. potential cost savings and impact on government funding) to the problem at hand [7]. The percentile threshold for cost used to identify HUs varies between studies, ranging from the top 5% of patients [7,8,9,10,11,12,13,14,15,16] to the top 20% [17], with top 10% being the most common definition used [1, 7, 11, 15, 18,19,20,21,22,23,24,25,26]. Cost is a good measure of utilization and can serve as a proxy for utilization across different resource types (e.g. inpatient admissions, outpatient visits and procedures). However, as cost would be heavily influenced by the number of inpatient bed days incurred by a patient, looking at cost alone may not provide a complete picture of utilization volume. Other metrics commonly used to identify HUs include outpatient visits to clinics [27], emergency attendances [28,29,30], and inpatient utilization such as readmissions within a certain period [31,32,33] or length of stay (LOS) [34,35,36]. There are few papers that examine multiple metrics simultaneously [23, 37,38,39]. Examining other metrics of utilization in tandem with cost will allow policymakers and clinicians to look at multiple dimensions of resource use [37, 40], and understand the different underlying drivers to get a more comprehensive understanding of healthcare utilization. Furthermore, segmentation of a patient population using multiple metrics would create smaller groups of patients with largely similar utilization patterns and characteristics, facilitating targeting and tailoring of interventions for effective use of resources [41].

With the increasing adoption of electronic medical record (EMR) systems in hospitals [42,43,44], comprehensive administrative cost and utilization data over multiple years are now more readily available. Researchers and health systems can use this information to segment the general patient population and address the diverse needs of each patient segment [41]. Segmentation will help identify homogenous patient subpopulations and provide knowledge on their characteristics, needs and trajectories over time. This knowledge would then support development and implementation of interventions targeted at each subpopulation, such that the interventions are more tailored to individual needs, and likely to be of greater impact [41, 45,46,47,48]. In the long run, this would also facilitate program evaluation and outcomes tracking for each group [48].

While examining high utilization in a cross-sectional manner allows us to understand the profiles of HUs, observing HUs longitudinally would provide valuable information on how utilization per patient accumulates over time, how patients transition between HU groups and how their utilization changes with these transitions. The definition of persistence of HU behavior differs widely between studies, with one definition being recurrence as a HU in the subsequent year [4, 9, 19, 49]. Identification of persistent users is pertinent because interventions on HUs have not been shown to be efficacious or cost-effective, potentially due to regression to the mean in the majority of patients [20, 50]. Hence, insights from these longitudinal analyses could inform how healthcare systems can better design and target interventions for HUs.
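The subsequent-year definition of persistence can be made concrete with a minimal Python sketch. It assumes that each patient already carries a HU group label for two consecutive observation years. The function name `persistence_by_group`, the dictionary layout, and the rule that a patient with no Year 2 record counts as non-persistent are illustrative assumptions, not part of any cited definition.

```python
from collections import defaultdict

NON_HU = "Non-HU"  # label assumed for patients below every threshold


def persistence_by_group(year1, year2):
    """Return, per Year 1 group, the share of patients who are HUs in Year 2.

    year1, year2: dicts mapping patient id -> group label.
    A patient missing from year2 is treated as a non-HU (an assumption).
    """
    totals = defaultdict(int)
    persisted = defaultdict(int)
    for patient_id, group in year1.items():
        totals[group] += 1
        # Persistence here means being a HU in *any* group in Year 2.
        if year2.get(patient_id, NON_HU) != NON_HU:
            persisted[group] += 1
    return {group: persisted[group] / totals[group] for group in totals}


if __name__ == "__main__":
    # Toy labels only; these are not study data.
    year1 = {1: "Cost", 2: "Cost", 3: "SOC", 4: NON_HU}
    year2 = {1: "Cost-SOC", 3: "SOC", 4: NON_HU}
    print(persistence_by_group(year1, year2))
```

Counting a move to a different HU group as persistence mirrors the "HU in any group" reading; a stricter same-group definition would compare the two labels directly.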

This study will demonstrate the use of cost and utilization metrics to segment a patient population into groups of high healthcare utilizers, based on 1 year’s patterns of hospital-based resource use in an Academic Medical Center (AMC) in Singapore. The groups’ socio-demographic characteristics, utilization patterns and medical history will be described for comparison. The comparisons will illustrate the benefits of using multiple metrics to identify different HU profiles, highlight the healthcare needs associated with each profile and their subsequent longitudinal behaviors. This work provides multifaceted insights on the characteristics of high healthcare utilizers, which will inform program and policy development, and the identification of the correct subgroups for more targeted interventions.

Methods

The data analyzed were from a hospital administrative database in an AMC in Singapore for the period of 2006 to 2013. Ethics approval was obtained from the review board of the healthcare cluster. Details of preparation and processing of the database are described elsewhere [1. Non-HUs constituted 83% of all patients and accounted for 25% of all costs during the first observed year. Few patients had inpatient utilization, and outpatient utilization was an average of 1 SOC visit or ED attendance. Cost HUs accounted for almost 16% of total costs despite constituting less than 4% of the cohort. The median cost for these HUs was S$16,591 and median inpatient utilization was 1 inpatient admission and 7 days of LOS. Similarly, most LOS HUs only had 1 inpatient admission, but their median bill was lower at S$9,073 and LOS was longer at 19 days. LOS-SOC HUs incurred bill sizes and inpatient utilization similar to those of LOS HUs, but had additional high SOC usage (LOS HU: median: 1 visit; LOS-SOC HU: 9 visits). SOC HUs, due to the large group size (6.7%), accounted for 8% of all cost despite having zero inpatient utilization and ED attendances on average. Cost-SOC HUs generally incurred more utilization in comparison to SOC HUs (median inpatient admissions: 1; LOS: 6 days; SOC visits: 12). Cost-LOS HUs incurred more cost and inpatient utilization than Cost HUs, at a median cost of S$31,762, and median inpatient utilization of 2 admissions and 27 days of LOS. The Cost-LOS-SOC patients incurred the highest utilization across all metrics (median cost: S$49,248; inpatient admissions: 3; LOS: 29 days; SOC visits: 13; ED attendances: 2).
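Statements such as "less than 4% of the cohort but almost 16% of total costs" compare each group's share of patients with its share of cost. The Python sketch below shows one way such shares could be tabulated; the function name `group_shares` and the `(group, cost)` record layout are assumptions made for illustration, and the numbers in the demo are toy values rather than study data.

```python
from collections import defaultdict


def group_shares(records):
    """Compute each group's share of patients and of total cost.

    records: iterable of (group_label, annual_cost) pairs, one per patient.
    Returns {group: (share_of_patients, share_of_total_cost)}.
    """
    counts = defaultdict(int)
    costs = defaultdict(float)
    for group, cost in records:
        counts[group] += 1
        costs[group] += cost
    n_patients = sum(counts.values())
    total_cost = sum(costs.values())
    return {
        group: (counts[group] / n_patients, costs[group] / total_cost)
        for group in counts
    }


if __name__ == "__main__":
    # Toy cohort: three low-cost patients and one high-cost patient.
    toy = [("Non-HU", 100), ("Non-HU", 100), ("Non-HU", 100), ("Cost", 700)]
    for group, (patient_share, cost_share) in group_shares(toy).items():
        print(f"{group}: {patient_share:.0%} of patients, {cost_share:.0%} of cost")
```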

Table 1 Cost and utilization patterns of high utilizer (HU) groups

The socio-demographic profiles of the HU groups are presented in Table 2. Overall, most Non-HU patients were aged below 40 and had low multi-morbidity (median age: 36; median CCMI: 0; median PPS: 2). The majority were male, were Chinese, sought at least some subsidised services, and lived in 3-room or larger public housing (male: 59.6%; Chinese: 58.0%; only unsubsidised treatment: 11.7%; 3-room and larger: 62.6%). Death within the observed year and persistence were low (death: 1.4%; persistence: 2.2%). Cost HUs were older in comparison and had a larger proportion of patients who sought only unsubsidised treatment, and a tenth of them died during the year (median age: 55; only unsubsidised treatment: 19.7%). LOS HUs were also older than Non-HUs, mostly female, and a larger proportion were Chinese (median age: 55; male: 39.2%; Chinese: 72.8%). Almost all sought at least some subsidised services (99.5%), and 9.2% of patients died during the year. LOS-SOC HUs were similar to LOS HUs in terms of race, ethnicity and SES, but were younger with a median age of 32, and all patients survived. SOC HUs were similar in demographic profile to Non-HUs, but had more female patients and the highest proportion of patients who sought only unsubsidised treatment (male: 37.3%; unsubsidised treatment: 31.0%). Persistence was observed in 15% of SOC HUs, which was substantially higher than in the Non-HUs. Cost-SOC patients were older than SOC HUs, had more males and exhibited more persistence in comparison (median age: 51; male: 52.7%; persistence: 25.2%). Cost-LOS patients were the oldest, had the highest multi-morbidity and the highest proportion of patients living in 1/2-room flats (median age: 66; median CCMI: 2; median PPS: 25; 1/2-room flat: 6.2%). They also had the highest death rate among all groups within the first year (27.1%). Cost-LOS-SOC HUs had the highest multi-morbidity, and a third of patients persisted as HUs into the second year (median CCMI: 2; median PPS: 29; persistence: 35.4%).

Table 2 Characteristics of Year 1 high utilizer (HU) groups

Table 3 lists the five most common conditions with which patients in each HU group were ever diagnosed in the first year. External injuries were common primary diagnoses in the Non-HU group. Cardiovascular disease was prevalent among the Cost HUs (Coronary atherosclerosis: 20.1%; Acute myocardial infarction: 19.6%). SOC HUs were commonly diagnosed with routine ambulatory conditions, predominantly pregnancy-related conditions (Normal pregnancy: 13.3%; complications: 4.4%), while Cost-SOC HUs were commonly diagnosed with complex ambulatory conditions such as cancer and female infertility (Cancer of breast: 7.4%; Female infertility: 5.0%). Most LOS and LOS-SOC HUs were diagnosed with at least one mental health condition, with mood disorders highly prevalent (LOS: 24.0%; LOS-SOC: 52.3%). For Cost-LOS and Cost-LOS-SOC HUs, common conditions were cardiovascular disease, acute cerebrovascular disease as well as pneumonia (Cardiovascular: Cost-LOS: 8.2%, Cost-LOS-SOC: 10.9%; Cerebrovascular: Cost-LOS: 17.5%; Cost-LOS-SOC: 8.3%; Pneumonia: Cost-LOS: 13.9%; Cost-LOS-SOC: 7.9%). Common diagnoses ranked by visit frequency revealed similar trends. The only exception was the Cost-LOS-SOC HUs, in whom the prevalent conditions were lymphoma, leukemia, colon cancer and renal failure (Additional file 1).

Table 3 Top 5 common conditions in Year 1 high utilizer (HU) groups

We then sought to identify factors associated with persisting as a HU into the subsequent year. Generally, persistent and non-persistent HUs differed in socio-demographic characteristics and prevalence of common HU conditions (Additional file 2). Of all patients, 16,052 (4.2%) patients were HU in a subsequent year. From Table 4, Model 0 revealed that HUs across all groups were more likely to persist as a HU in any group in the subsequent year, compared to Non-HUs. The weakest association was seen in the Cost HUs, while the strongest association was seen in the Cost-LOS-SOC HUs (Cost: OR = 2.73, 99% CI: 2.45–3.03; Cost-LOS-SOC: 31.59, 99% CI: 29.02–34.37), with generally stronger associations seen in groups with high SOC utilization (LOS-SOC: 16.72, 99% CI: 6.26–38.93; Cost-SOC: 16.30, 99% CI: 15.31–17.35). After adjusting Model 0 for all socio-demographic factors and multi-morbidity scores, housing type was removed from the model due to lack of statistical significance. The resulting model, Model 1, revealed that the trends in persistence among the HU groups had remained but decreased in strength across all groups except for LOS-SOC HUs (Model 0: OR: 16.72, 99% CI: 6.26–38.93; Model 1: 17.28, 99% CI: 6.39–40.94). The LRT revealed that Model 1 exhibited significantly better fit than Model 0 (p < 0.001). Adjusting Model 1 for the 26 common HU conditions and further refining the model to retain only statistically significant factors, only 13 conditions remained in Model 2. The same trend in varying tendencies in persistence among the HU groups was observed. A diagnosis of breast cancer, mood disorders, hypertension or female infertility was also associated with a higher likelihood of persistence (Cancer of breast: 1.28, 99% CI: 1.09–1.51; mood disorders: 1.45, 99% CI: 1.11–1.85; hypertension: 1.53, 99% CI: 1.36–1.71; female infertility: 1.72, 99% CI: 1.39–2.12). The LRT revealed that Model 2 exhibited significantly better fit than Model 1 (p < 0.001).
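For readers less familiar with the reported statistics, the Python sketch below computes an unadjusted odds ratio with a 99% confidence interval from a 2x2 table, using the standard log-odds (Woolf) approximation. This is a simplified stand-in, not the study's method: the reported estimates come from fitted logistic regression models, and the function name `odds_ratio_ci`, its arguments, and the counts in the demo are hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.99):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: HU group, persisted        b: HU group, did not persist
    c: reference group, persisted d: reference group, did not persist
    All four counts must be positive for the approximation to apply.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Two-sided critical value, about 2.576 for a 99% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    estimate, lower, upper = odds_ratio_ci(30, 70, 10, 90)
    print(f"OR = {estimate:.2f}, 99% CI: {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 corresponds to an association that is significant at the matching level; the adjusted models additionally hold socio-demographic factors and diagnoses fixed, which this table-based calculation cannot do.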

Table 4 Factors associated with persistence as a HU into a subsequent year

Discussion

HUs of healthcare services are a diverse group and can be further segmented into different subgroups based on utilization metrics available in most hospital administrative databases, such as cumulative cost, length of stay or outpatient visits. Segmentation by these utilization metrics revealed differences in socio-demographic characteristics, varying persistence in high utilization and distinct variations in disease profiles of patients. Our results showed that the HU groups exhibit differences in age and comorbidity. High-cost groups were generally older and of higher multi-morbidity in comparison to the low-cost groups, which is consistent with other studies associating older age and multi-morbidity with higher healthcare costs [1, 3, 71,72,73]. As few studies define high utilization using multiple metrics simultaneously [37, 40], our study adds meaningful insights into the characteristics of patients in different HU groups, such as the variation in extent of multi-morbidity between the groups. However, while socio-demographic factors have been shown to be associated with HUs in other populations [9, 14, 17, 74], this was not as apparent in our population as housing type, as a proxy of SES, does not appear to be a differentiating factor across the groups.

Our results also revealed that the HU groups have different disease profiles. The disease profile of the Cost-LOS and Cost-LOS-SOC HUs, together with the older average age, higher CCMI and substantial death rate, suggest a frail elderly archetype similar to clusters with advanced age and high prevalence of complex chronic conditions found in recent segmentation studies [39, 75, 76]. The finding of acute cardiac events as a one-off high-cost condition was consistent with other studies. Similarly, common resource-intensive conditions such as cancers and renal failure were observed to be among the most prevalent conditions in the Cost-SOC and Cost-LOS-SOC HUs when the number of visits per condition was taken into account [13, 19, 26]. The LOS and LOS-SOC HUs were found to be primarily patients with diagnosed mental illness. The underlying drivers of high utilization and required interventions for patients admitted to the psychiatric wards and patients admitted to the general wards would differ. For patients admitted for psychiatric conditions, interventions such as rapid psychiatric review upon admission could potentially reduce inpatient stay in the psychiatric wards [77, 78]. For the patients admitted for non-psychiatric diagnoses, the long stay may be driven by factors such as poor access to appropriate psychosocial care [37], suggesting that instead of cost reduction measures, patients may instead benefit from an integrated model of care to reduce the burden on acute inpatient care [79, 80].

A key finding is that most HUs in our patient cohort do not persist in their high utilization, precluding intervention after identification in the first observed year where they incur substantial resource use. This finding is consistent with other studies that have shown that even within HUs, there exists a small group of high-risk persistent users incurring a disproportionate amount of cost [7, 20, 73, 81, 82]. HUs were more likely to recur as a HU in any group in the subsequent year, and this phenomenon was more pronounced in groups with high SOC utilization in the first observed year. Our multivariable model suggested that patients incurring high resource usage in the SOC setting, such as for treatment of female infertility, should be prioritized for further longitudinal analyses to better understand their utilization trajectories, with the aim of developing programs with these specific characteristics in mind. In addition, current disease management processes for hypertension and mood disorders should also be flagged for further analyses and refinement to address the tendency for persistence in these patients. Future studies would also seek to examine persistence of HU behavior over a longer duration and the different trajectories of each subgroup to inform intervention design and targeting.

As utilization patterns may be driven by patients’ disease type, progression and management [73], interventions to reduce excess resource use are currently disease-specific and in context of the usual disease management process [83,84,85]. However, we found that certain conditions are prevalent across multiple HU groups, suggesting that the traditional disease-centred programs may be capturing a group of patients with the same diagnosis but with heterogeneity in utilization patterns and by extension, care needs. Such disease-centric programs may then be limited in effectiveness due to the inherent variation present in the patient populations treated and the hospital setting in which the disease is treated. For instance, diagnosis of acute cerebrovascular disease was found to be prevalent across the Cost, Cost-LOS and Cost-LOS-SOC HU groups. Care pathways have been commonly adopted for stroke management to improve patient care quality and outcomes [86]. While these programs may include outpatient treatment as part of the pathway, they generally focus on inpatient-related care during the acute phase. However, it is clear that there is a group of patients with cerebrovascular disease that have high outpatient needs, suggesting the need to look at a more holistic program that focuses not only on the inpatient aspect of stroke care, but extends to outpatient care as well for this group.

An effective program design should either accommodate the variation in the patient profiles, or target only a particular subgroup of patients. As patients at risk of high utilization often have high prevalence of multiple complex chronic conditions and not just one disease, new integrated models of care that are generic and disease agnostic, and that address the cross-cutting needs of a patient, may be more appropriate and effective in addressing high resource use across the different archetypes of HUs. Interventions such as case management, care planning and bundling of care have already been implemented in specific high-risk groups with complex needs such as older patients and patients with chronic diseases [87,88,89,90,91,92]. However, with increasing age, chronicity and complexity in the general population, applying this patient-centric approach across the different segments of the population will be better able to address the diverse health and social needs of each group [92, 93]. In parallel, the empirical approach in segmenting the patient population we have proposed would facilitate targeting of certain subgroups, by increasing within-group homogeneity in utilization profile and subsequently the relevance of any new interventions targeted at reducing high resource use.

Our findings highlight the importance of selecting the correct metrics in population segmentation. Selection can either be hypothesis-driven, with the intention to zoom in on a particular type of patient group, or pragmatically motivated by availability and access to information. Segmentation of patient populations is commonly achieved using clustering, but the segments have to be labelled post hoc given the characteristics of the identified clusters [39, 75, 76, 94, 95]. On the other hand, Cost, LOS and SOC data are convenient starting points for segmentation since they constitute the basic data collected for hospital databases and can be readily processed to generate intuitive and reproducible HU groups based on the 90th percentile of the cohort. While cost is a straightforward metric of resource use, broadening the definitions to include other metrics and further stratifying these HUs unveils the elevated resource use in other areas that would have been obscured. This representation of other non-high-cost HU groups highlights potential areas for improvement in current care processes which would have otherwise been missed, should the focus only be on high-cost groups.
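The percentile-based grouping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's code: the field names `cost`, `los` and `soc`, the nearest-rank percentile rule, and the choice to flag only values strictly above the threshold (so that tied values, such as many patients with zero bed days, are not flagged) are all hypothetical.

```python
METRICS = ("cost", "los", "soc")  # hypothetical field names
LABELS = {"cost": "Cost", "los": "LOS", "soc": "SOC"}


def nearest_rank_percentile(values, pct):
    """Smallest value with at least pct% of observations at or below it.

    pct must be an integer between 1 and 100.
    """
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def segment(patients, pct=90):
    """Label each patient by the metrics on which they exceed the cohort threshold.

    patients: list of dicts with keys "id", "cost", "los" and "soc".
    Returns {patient id: label}, e.g. "Cost-LOS", "SOC" or "Non-HU".
    """
    thresholds = {
        m: nearest_rank_percentile([p[m] for p in patients], pct) for m in METRICS
    }
    groups = {}
    for p in patients:
        # Strictly above the threshold counts as high use (an assumption).
        high = [LABELS[m] for m in METRICS if p[m] > thresholds[m]]
        groups[p["id"]] = "-".join(high) if high else "Non-HU"
    return groups


if __name__ == "__main__":
    # Toy cohort of ten patients; values are not study data.
    costs = [100, 200, 300, 400, 500, 600, 700, 800, 900, 5000]
    stays = [0, 0, 0, 0, 0, 0, 0, 1, 2, 40]
    visits = [1, 1, 1, 2, 2, 2, 3, 3, 20, 4]
    cohort = [
        {"id": i + 1, "cost": c, "los": l, "soc": s}
        for i, (c, l, s) in enumerate(zip(costs, stays, visits))
    ]
    print(segment(cohort))
```

Because the labels are built from fixed thresholds rather than learned clusters, rerunning the procedure on the same cohort yields the same groups, which is the reproducibility advantage noted above.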

Originating from one of the only two AMCs in Singapore, the data and analysis offer an important overview of HUs in an AMC in an Asian population. All patients above the age of 21 were included for analysis and information was collected at point of visit, minimising selection and information bias. Socio-demographic information was only available for the last visit in the system, and as changes to gender and ethnicity of the patients over the study period would have been minimal, only non-differential misclassification biases due to changes in housing type would be an inherent limitation in interpreting the information on SES of the patients. The reported healthcare costs in this study were estimated using patient bills as a proxy of cost, and do not reflect the true costs incurred by the hospital. However, as patient bills have been demonstrated to be positively correlated to costs across various studies [96,97,98], these billed charges are nonetheless a valid measure for the purpose of identifying high utilizers in our study. The use of observation years instead of calendar years allowed us to better account for resource use arising from disease progression over time. The associations seen between the first observed year HU groups and persistence of HU would only be generalizable to patients who survived the year. This study provides an extensive but incomplete comparison and description of HUs, as primary care data was not included, and we were not able to examine the implications of segmentation on primary care utilization. As the healthcare system in Singapore was reorganized in 2017, a group of polyclinics was integrated into the healthcare cluster and the inclusion of this primary care data for future work would complete the picture of HU groups in the cluster. Utilization of patients in other healthcare clusters was also not available, which would underestimate the total healthcare utilization accumulated by patients who seek care across multiple hospitals. A local study on three regional hospitals found that the rate of patients visiting all three hospitals was 8%, suggesting the need to take into account potential cross-utilization of patients in interpretation of our findings [99]. Generalizability of the characteristics of HUs to non-tertiary care settings would also be limited given that the study was based on an AMC. Taking into consideration the abovementioned limitations, this study nonetheless adds invaluable insight into the use of administrative data to segment a hospital-based patient population, and the profiles of patients with varying utilization patterns across the different hospital settings.

An extension of the segmentation approach illustrated in this study would be to segment a specific clinical subpopulation, examine the HU group distribution in this subpopulation, and compare these distributions across different clinical diagnoses. Further studies would also seek to expand on the persistence of HUs into subsequent years and distinguish the trajectories for each HU group. Effective identification and targeting of persistent users would maximise the use of resources channelled to these interventions, as patients who will revert to low resource use on their own over time will be omitted, and only patients who remain within the system and require the intervention will receive the program. These persistent users could be characterised and distinguished from the transient HUs, with the aim of informing program design to detect and target persistence in context of each group’s utilization patterns and disease profile. In addition, as we have examined the HU behaviour from the health system perspective in this study, a follow-up study examining patients with high out-of-pocket expenditure would be conducted to provide insight on high utilization from the patient’s perspective.

Conclusion

High utilizers are a heterogeneous group of patients and there is a need to move beyond a one-size-fits-all metric to measure high utilization. We demonstrated the use of healthcare cost, as well as LOS and SOC utilization as metrics to identify different HU groups in a cohort of patients followed for 1 year. Differences in socio-demographic characteristics, multi-morbidity and disease profile were detected between the HU groups. Persistence of HU behavior in our study was pronounced in groups with high SOC utilization, and this trend was evident even after accounting for socio-demographic and clinical characteristics. These groups with high SOC utilization would be prime candidates for in-depth analysis of longitudinal behavior to distinguish persistent HUs from transient HUs, track their transitions to different HU groups in subsequent years, and determine groups feasible for intervention. Intervention design tackling excess resource use should take into consideration the inherent variation in utilization patterns among the patients and address the specific needs of each subgroup when developing an effective and targeted program. Segmentation of a patient cohort using these utilization metrics will enable policy makers to better identify the diverse needs of patients, detect gaps in current care and focus their efforts in delivering care relevant and tailored to each segment.