Introduction

Medication errors are a common problem in health care and a frequent cause of mortality and morbidity [1,2,3]. Due to inconsistent definitions and classification systems, differences in populations studied and varying outcome measures, the reported prevalence of medication errors and adverse drug events (ADE) varies widely (from 2% to 94%) across different studies [1, 2, 4,5,6]. Given the high number of prescriptions in primary care, medication errors have the potential to cause considerable harm [7,8,9], contributing to substantial health and economic consequences, including an increased utilization of health care services and, in the worst case, patient death [10,11,12].

The use of digital health technologies can help overcome shortcomings at each stage of the medication management process [13]. Digital health technologies have the potential to reduce medication errors and adverse drug reactions (ADR), improve patient safety and thus contribute to higher quality and efficiency in health care [14, 15]. In particular, Clinical Decision Support Systems (CDSS) are used to improve medication safety by providing direct, medication-related advice to physicians, pharmacists or other participants involved in the medication process [16, 17]. Current research demonstrates the potential of CDSS to enhance health care processes [18,19,20,21,22,23]. In particular, CDSS that are integrated into the clinical workflow and include messages or alerts that are automatically presented during clinical decision making can have beneficial effects [24].

While a variety of studies have examined the effects of CDSS on medication safety, significant heterogeneity exists concerning the outcome measures used, leading to an ambiguous body of evidence [16, 25, 26] – particularly in primary care [27,28,29] and long-term care (LTC) [29,30,31]. According to Seidling and Bates [32], outcomes used by studies investigating the impact of digital health technologies on medication safety can be grouped into three categories: process-related, harm-related, and cost-related outcomes. These categories differ regarding their relevance for patient health [32]. In particular, harm-related outcomes are more directly relevant for patient health than process- or cost-related outcomes.

To date, no review has comprehensively summarized the outcome measures used in studies on medication safety-related CDSS effectiveness in primary care and LTC. Therefore, the primary objective of this systematic review is to summarize and categorize the outcome measures used in these studies. In doing so, we aim to contribute to a more standardized approach to the evaluation of CDSS and to facilitate future research in this field. A secondary aim is to compare the main empirical findings of these studies.

Methods

Our systematic review followed the guidelines outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) Statement [33] (see Supplementary Tables S1-S2, Additional File 1). This systematic review was registered with PROSPERO (CRD42023464746) [34].

Search strategy

We systematically searched PubMed, Embase, CINAHL, and the Cochrane Library for papers published before September 20th, 2023. The search strategy combined terms describing the type of intervention (digital decision support), the aim of these interventions (medication safety), and the targeted setting (outpatient/primary care and LTC). Relevant MeSH terms were considered (see Supplementary Table S1, Additional File 2). We developed the search strategy in accordance with published CDSS-related systematic reviews [25, 26, 28, 35]. Additional publications were identified manually via hand search and through forward and backward citation searching using the Spider Cite tool [36].
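To illustrate the structure of the search (not its actual content), the following Python sketch shows how synonym blocks for the three concepts (intervention, aim, and setting) can be combined into a single boolean query. All terms shown are placeholders; the full strategy is reported in Supplementary Table S1, Additional File 2.

```python
# Illustrative only: the terms below are placeholders, not the actual search
# strategy (the full strategy is given in Supplementary Table S1, Additional File 2).

intervention_terms = ['"clinical decision support"', '"computerized physician order entry"']
aim_terms = ['"medication error"', '"medication safety"', '"adverse drug event"']
setting_terms = ['"primary care"', 'outpatient', '"long-term care"', '"nursing home"']

def or_block(terms):
    """Join synonyms for one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"

# The three concept blocks (intervention, aim, setting) are intersected with AND.
query = " AND ".join(or_block(block) for block in (intervention_terms, aim_terms, setting_terms))
print(query)
```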

Eligibility criteria

We included English- and German-language full-text publications reporting data on interventional studies that evaluated CDSS intended to improve medication safety in the primary/outpatient and LTC settings. Only studies reporting medication-, patient- or cost-related outcomes were included. Studies reporting only outcomes related to health care providers' attitudes toward or acceptance of CDSS, as well as studies focusing only on performance or quality indicators of CDSS (e.g., sensitivity, specificity), were excluded. Studies were also excluded if the intervention was conducted in inpatient care, did not automatically engage in the medication process (e.g., via automated alerts), or included only a simple reminder function. Furthermore, studies were not eligible if they focused only on a single potentially problematic drug or on one specific indication. Finally, studies were excluded if they did not primarily aim to improve medication safety. There were no restrictions regarding the comparator of the intervention (see Supplementary Table S2, Additional File 2). Two investigators (DL and DGR) independently screened the search results and assessed the eligibility of potentially relevant studies against the predefined inclusion and exclusion criteria. Discrepancies (n = 131) were resolved by consensus; another investigator (BA) was consulted if consensus could not be reached.

Data extraction, categorization and synthesis

We extracted the following data from the included studies: study design, study period, sample and setting, type of intervention and comparator (Table 1), primary and secondary outcome measures (Table 2), outcome levels (Table 3), and main empirical findings (Table 4). Two investigators (DGR, JG) jointly performed the data extraction, which was verified by a third investigator (BA). We grouped the types of interventions and comparators into the following categories:

Table 1 Study characteristics
Table 2 Overview of extracted primary and secondary outcomes including outcome (sub-)categories and levels of operationalization
Table 3 Overview and frequency of outcome levels used by included studies per outcome category and subcategory (n = number of studies)
Table 4 Empirical findings of studies (primary outcomes)

Computerized physician order entry

Computerized Physician Order Entry (CPOE) is defined as any system that allows health care providers to “directly place orders for medications, tests or studies into an electronic system, which then transmits the order directly to the recipient responsible for carrying out the order (e.g. the pharmacy, laboratory, or radiology department)” [27].

Electronic prescribing

Electronic Prescribing (e-prescribing or eRx) can be seen as a special form of CPOE [79].

For example, Donovan et al. show that the implementation costs of hospital-based CDSS are rarely reported and that the methods used to measure and value such costs are often not well described [80]. Thus, intervention costs, as well as costs that may have occurred in other (health care) sectors, are often not considered in economic evaluations of CDSS [81]. Since the quality of the current health economic literature on health information technology in medication management is poor [81], future studies should follow established standards of health economic evaluations [78, 82, 83]. Additionally, since the economic impact of improved medication safety may occur at different levels, economic evaluations of CDSS should take into account not only the payers' perspective but also financial effects at the provider level.

To summarize: CDSS evaluations should include multiple outcomes from each of the three outcome categories [32, 76]. However, we found that none of the included studies conducted a comprehensive evaluation across all three outcome categories. Furthermore, two-thirds of studies did not consider any harm-related outcomes. Those studies that did use harm-related outcomes mostly measured ADE or other injuries; very few used morbidity or hospitalization. Although process-related outcomes were by far the most frequently used, this is mostly due to the large number of studies using error rates. In contrast, response rates and alert rates were used less commonly, making it difficult to fully investigate and interpret CDSS activity and use. Finally, only three studies used cost-related outcomes. This finding is consistent with the sparse and conflicting evidence regarding the financial impact and cost-effectiveness of CDSS [16, 81, 84]. The studies that used cost-related outcomes included only a small subset of direct costs and did not consider indirect costs.

Defining outcome measures

We have seen that the included studies differ in the outcome categories they use. However, studies also differ in their definition and operationalization of outcomes even within categories (and subcategories).

While mortality and hospitalization are easily measured, standardized outcomes, other harm-related outcomes (such as injuries) may be defined and operationalized in various ways, limiting the comparability of harm-related results between studies. Cost-related outcomes were only considered in three studies, which used substantially different (and therefore non-comparable) approaches.

Differences in outcome definition and operationalization between studies were most pronounced for process-related outcomes. First, these outcomes measured the occurrence of a number of different types of errors, responses, and alerts. For example, an error rate may refer to the number of potentially inappropriate medications (PIM) or the number of drug-drug interactions (DDI). Second, these outcomes can be defined at different levels, including the patient, encounter, prescription or alert level. For example, an error rate may refer to the number of errors per prescription or the number of errors per patient-month, as illustrated below. These differences in outcome definitions are in line with the literature: a review by Rinke et al. [85] also found differences in outcome definition and operationalization in evaluations of interventions to reduce paediatric medication errors.
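As a minimal numerical sketch (with entirely hypothetical data), the following Python snippet shows how the same set of detected errors yields different, non-comparable rates depending on whether the denominator is prescriptions, patients, or patient-months.

```python
# Hypothetical data: five prescriptions for two patients; has_error flags a
# detected PIM or DDI. All numbers are purely illustrative.
prescriptions = [
    {"patient_id": 1, "has_error": True},
    {"patient_id": 1, "has_error": False},
    {"patient_id": 2, "has_error": False},
    {"patient_id": 2, "has_error": True},
    {"patient_id": 2, "has_error": True},
]
patient_months = {1: 6, 2: 12}  # observation time per patient, in months

n_errors = sum(p["has_error"] for p in prescriptions)  # 3 errors in total

# The same three errors produce very different "error rates":
errors_per_prescription = n_errors / len(prescriptions)                         # 3 / 5  = 0.60
errors_per_patient = n_errors / len(patient_months)                             # 3 / 2  = 1.50
errors_per_100_patient_months = 100 * n_errors / sum(patient_months.values())   # ~16.7

print(errors_per_prescription, errors_per_patient, errors_per_100_patient_months)
```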

Due to these differences in outcome definition, comparing results between studies can be difficult or even impossible [85], even if studies use the same outcome categories. Therefore, future research should work toward consensus definitions for key outcomes. This could increase the efficiency of evidence synthesis and reduce the risk of duplicated research efforts, thereby accelerating the improvement of care [86]. When agreed-upon definitions are unavailable, researchers can increase the comparability of their results by reporting multiple outcome definitions.

Importantly, this does not imply that all CDSS evaluation research should use a one-size-fits-all approach. Different health care systems, care settings, study populations, or CDSS types may give rise to different research questions, which will likely require different outcomes and definitions. For example, an evaluation of a novel CDSS introduced in an LTC setting with a history of inappropriate medications may use an error rate based on PIM or potentially inappropriate prescribing (PIP), while an evaluation of an existing primary care CDSS that has recently been upgraded to generate dosage alerts may instead measure the rate of dosage errors. However, studies with similar research questions concerning similar settings and populations should still strive to use comparable outcome definitions whenever possible.

Finally, researchers should carefully consider at which level they define their outcomes. For many types of error rates, the prescription level may be most appropriate. For example, the number of errors per prescription (or per encounter) reflects the total opportunities for error more accurately than the number of errors per patient or per patient-month [85]. Similarly, it may be more appropriate to define response rates at the alert level rather than the prescription level. As discussed above, the most appropriate outcome definition will depend on the context and the specific research question.

Reducing the risk of bias

Even if the included studies had used a wider variety of outcomes from all outcome categories, with agreed-upon definitions and standardized operationalizations for each outcome, many studies would still have exhibited a risk of bias due to their study design and other methodological problems. In particular, most studies used cross-sectional designs without a sufficient follow-up period, many were not randomized or not controlled, and most controlled studies did not demonstrate study group comparability. Finally, many studies did not specify a primary outcome, and only 12 studies reported power calculations.

To reduce the risk of bias, future research should rely on well-designed (cluster) randomized controlled trials (RCTs) with a sufficient follow-up period; study group comparability should be assessed and reported. Whenever possible, studies should be longitudinal rather than cross-sectional. Finally, studies should explicitly specify a clear (preferably harm-related) primary outcome and should perform and report sample size and power calculations for this outcome.
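As an illustration of the last recommendation, the following Python sketch applies a standard two-proportion sample size formula and inflates it by the usual design effect for cluster randomization; the assumed error rates, cluster size, and intraclass correlation coefficient (ICC) are purely hypothetical and would need to be justified for a concrete study.

```python
# Illustrative sample size sketch for a cluster RCT comparing error rates in an
# intervention and a control arm. All inputs are hypothetical.
from scipy.stats import norm

def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for comparing two proportions
    under individual randomization (normal approximation)."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2

def cluster_adjusted(n, cluster_size, icc):
    """Inflate by the design effect 1 + (m - 1) * ICC for cluster randomization."""
    return n * (1 + (cluster_size - 1) * icc)

n_individual = n_per_arm(p1=0.10, p2=0.07)   # e.g., error rate of 10% vs. 7% of prescriptions
n_cluster = cluster_adjusted(n_individual, cluster_size=50, icc=0.02)
print(round(n_individual), round(n_cluster))  # prescriptions per arm, before/after adjustment
```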

Empirical findings

Only 20 out of 32 included studies explicitly specified a clear primary outcome and, of these, only five studies used harm-related primary outcomes. While half of all studies with primary outcomes demonstrated a significant intervention effect, most studies finding significant effects did so for process-related primary outcomes. This result is in line with current research demonstrating significant intervention effects when using process-related outcomes [18,19,20,21,22]. In contrast, only one study found a significant intervention effect for a harm-related primary outcome. Overall, our results agree with prior reviews finding that the effectiveness of CDSS for medication safety in primary care [27,28,29] and LTC settings [29,30,31] remains inconsistent and future research on the harm-related effects of medication-related CDSS is needed.

To generate stronger evidence on the effectiveness of CDSS, future studies should follow the methodological recommendations outlined above. Furthermore, additional research should take place in LTC settings, as this setting was underrepresented in the included studies. Finally, insights from research using process-related outcomes to study CDSS activity should be used to improve the design and functionality of future CDSS. While uptake levels are rarely reported in CDSS evaluations, available evidence indicates that uptake is low [87]. In addition to alert fatigue, high override rates are an increasingly important problem for CDSS interventions [88, 89]. If these overrides are inappropriate, they can lead to medication errors, patient harm and increased costs [90]. Comprehensive CDSS evaluations using a variety of outcomes and outcome categories are therefore needed to identify and remove barriers to user acceptance of CDSS.

Limitations

Compared to a recent review [26], we expanded our scope by including the LTC setting and focusing primarily on methodological aspects and the outcomes used in CDSS evaluations. However, our systematic review still has several limitations. First, relevant studies not indexed in the searched databases might be missing from this review, although we followed an extensive search strategy, including hand search and automated citation tools alongside the search of multiple databases. Second, due to the methodological heterogeneity of the included studies, we only compared whether or not studies found a significant effect for their primary outcome and did not compare levels of significance or effect sizes. We also did not consider outcomes related to user acceptance of CDSS. Finally, a scoping review may also have been an appropriate method for addressing our primary (methodological) aim, although the lines between these types of reviews are often blurred [91]. However, because of our secondary (empirical) aim and because we performed a risk of bias assessment, we decided to conduct a full systematic review according to the PRISMA guidelines rather than the PRISMA Extension for Scoping Reviews [92].

The included studies vary in terms of the applied interventions and comparators. Some studies compared the CDSS intervention to non-automated IT systems, while others used handwritten or paper-based prescription forms as the comparator. Consequently, the interventions and comparators are not directly comparable across studies, which may also contribute to the observed differences in outcome measures and operationalizations. For example, comparing CDSS to other IT systems rather than to handwritten prescriptions may allow alert rates or response rates to be calculated for both the intervention and control groups.

Furthermore, since 75% of the studies were from North America, the generalizability of the results to other regions may be limited. Finally, the included studies' high risk of bias (particularly for PPS and N-RCT studies), their lack of clearly specified primary outcomes and their weak reporting of sample sizes need to be considered when drawing conclusions from the study results. Despite these limitations, our results give rise to a number of key recommendations for future studies researching the effect of CDSS on medication safety, summarized in Table 5.

Table 5 Recommendations for research on medication safety-related CDSS effectiveness

Conclusions

Our primary aim in this review was to summarize and categorize the outcome measures used in CDSS evaluation studies. Furthermore, we assessed the methodological quality of these studies and compared their key findings.

Although a variety of studies have evaluated the effectiveness of CDSS, we found that these studies face a number of (methodological) problems that limit the generalizability of their results. In particular, no studies used a comprehensive set of harm-related, process-related and cost-related outcomes. Definitions and operationalizations of outcomes varied widely between studies, complicating comparisons and limiting the possibility of evidence synthesis. Furthermore, a number of studies were not controlled, lacked randomization or did not demonstrate the comparability of study groups. Only 63% of studies explicitly specified a primary outcome. Of these, half found a significant intervention effect.

Overall, evidence on CDSS effectiveness is mixed and evidence synthesis remains difficult due to methodological concerns and inconsistent outcome definitions. Additional high-quality studies using a wider array of harm-, process- and cost-related outcomes are needed to close this evidence gap and increase the availability of effective CDSS in primary care and LTC.