Background

The implementation of innovations and evidence-based practices (EBPs; hereafter referred to collectively as interventions) is generally a long and complex process. Multiple phases have been identified and used to characterize complex implementation processes [1,2,3,4,5,6]. A common thread across these characterizations is a culminating phase wherein the intervention is “sustained” and integrated as routine practice. This final phase has been treated both as part of the implementation process and as an implementation outcome. For example, the Exploration, Preparation, Implementation, Sustainment (EPIS) framework identifies four phases, with sustainment being the final phase [7].

Like many other dissemination and implementation (D&I) concepts, the range of terminology related to sustainment and sustainability has challenged the field. The most frequent terms include sustain, sustainment or sustainability, maintain or maintenance, continue or continued, and long-term or follow-up implementation [8, 9]. However, a multitude of other terms have also been used, including adhere [9], penetration [10, 11], niche saturation [11], institutionalization [12], routinization [13], normalization [14], integration [15, 16], and community ownership [17]. These terms largely relate to the integration of an intervention into routine practice; however, their operationalizations differ. For example, within Proctor’s implementation outcome taxonomy, penetration is defined as “the integration of a practice with a service setting and its subsystems” [10], which aligns with the concepts of niche saturation and institutionalization. However, penetration is typically measured by the number of recipients of the intervention (similar to the RE-AIM concept of reach) [18] or the number of providers delivering or using the intervention (similar to the RE-AIM concept of adoption) [18].

Similarly, a number of definitions [8, 9, 19,20,21,22] and frameworks [13, 23,24,25] of sustainment exist. The common and key definitions [8, 9, 24,25,26,27,28,29,30] have been methodically developed, and there is growing consensus on and application of these conceptualizations. Of note is the definition by Moore et al.: “after a defined period of time, a program, clinical intervention, and/or implementation strategies continue to be delivered and/or individual behavior change (i.e., clinician, patient) is maintained; the program and individual behavior change may evolve or adapt while continuing to produce benefits for individuals/systems” [21]. Shelton et al. offer a further definition along these lines: “Sustainability has been defined as the continued use of program components at sufficient intensity for the sustained achievement of desirable program goals and population outcomes” [27].

Sustainment and sustainability definitions within the measurement literature are more disparate. The Society for Implementation Research Collaboration (SIRC) instrument review project (IRP) defines sustainability as “the extent to which a newly implemented treatment is maintained or institutionalized within a service setting’s ongoing, stable operations” [31]. The Dissemination and Implementation Grid-Enabled Measures database initiative (GEM-D&I) supplements this definition with “the existence of structures and processes which allow a program to leverage resources to most effectively implement evidence-based policies and activities over time” [32]. Other distinct definitions include “sustainability as a process or ongoing stage” [33]; “the extent to which an evidence-based intervention can deliver its intended benefits over an extended period of time after external support from the donor agency is terminated” [11]; “maintenance of treatment goals after discharge” [20]; “sustainability of those intervention outcomes or long-term effects” [34]; and “a suggested ‘adaptation phase’ that integrates and institutionalizes interventions within local organizational and cultural contexts” [25].

As indicated by multiple review papers on sustainability and the recent reconceptualization of sustainment in frameworks [8, 21, 25, 27, 35], there has been a general shift away from thinking of sustainment as an “end game” and away from static conceptualizations of sustainment. Sustainment is a dynamic outcome that changes over time in response to evolving contexts, needs, and evidence, and adaptation is central to its operationalization. In light of this shift, there has also been movement away from treating sustainability as synonymous with “institutionalization,” toward the view that integration into routine practice must occur alongside adaptation to increase fit and allow for continuous improvement.

There has also been debate on the distinction between sustainment and sustainability. One recent paper describes sustainment as “sustained use” and sustainability as “sustained benefits” [36]. Another commonly used description of the distinction was developed by Chambers [25], who refers to sustainability as the “characteristics of the intervention that may lend itself to be used over time” and to sustainment as an “outcome—was the intervention sustained over time,” further noting that “sustainment is an outcome of a sustainability effort” [37]. This distinction is important: planning for sustainment should occur throughout an implementation effort by attending to the influences on sustainment, collectively deemed sustainability. Palinkas and colleagues [38] also sought to conceptualize sustainability to encourage its further examination and measurement within implementation efforts. They emphasize the elements of continuity and funding, and they conceptualize sustainability as a process with determinants and an outcome (i.e., sustainment). While several common elements of sustainability determinants and outcomes emerged (e.g., infrastructure, community buy-in, funding), they found that what distinguishes a determinant from an outcome varies, with outcomes often defined as those elements that continue once funding and/or available support has ended.

For the purpose of this review, we conceptualize sustainment as an outcome indicating that the intervention was continued over time. Based on our conceptualization and synthesis of the literature, the specific components of sustainment are: (1) the input(s) (e.g., intervention, program, implementation strategy) continue to be delivered, through (2) routinization and institutionalization of the input(s), while adapting and evolving, with (3) ongoing capacity across levels (e.g., organizational, community, and systems change) to support the delivery, so that (4) the outputs on the part of the health provider and/or health consumer (e.g., individual behavior change, clinical benefits, value, impact) are maintained.

Sustainment has been comparatively less studied than other phases of the implementation process [39]. There are multiple explanations for the paucity of work on this topic [8, 24, 27, 28]. First, the time-limited nature of research grant funding cycles greatly impedes the ability to examine sustainment, especially the effects of long-term sustainment. The nature of current funding mechanisms is such that the time and funding allocated often necessitate a primary focus on one implementation phase, most frequently implementation itself, and do not allow for complete assessment of the degree to which newly implemented practices are sustained in a way that leads to positive outcomes for patients [40]. As such, sustainment is often not considered or is deemed beyond the scope of a given implementation project. Further, projects with a primary or secondary focus on sustainment are often limited to short-term examination (e.g., a few months to 1 or 2 years following implementation) or are separate projects not linked to the prior implementation processes.

A second issue with investigating sustainment is methodological difficulty. Traditional study designs typically focus on more proximal or time-limited implementation outcomes such as fidelity, reach, or engagement [10], and some designs are inherently limited in the length of prospective follow-up. Designs such as interrupted time series and roll-out designs may provide the opportunity to examine sustainment more efficiently than more common randomized designs [41]. It is important that these alternative study designs treat sustainment as the outcome that matters most for improving public health impact, for return on investment, and for improving the efficiency and effectiveness of implementation efforts [42]. Methodological difficulties also arise because planning for sustainment is complicated by the unpredictability of contextual changes (e.g., new or amended legislation, leadership turnover) and of new learning that may occur.

Third, despite multiple reviews of the concept of sustainment, measurement of sustainment has received minimal attention, and there appears to be a lack of psychometrically sound and pragmatic sustainment measures [43]. Unlike other phases across the implementation spectrum, sustainment does not have a well-defined time period (i.e., the sustainment period can last indefinitely). Relatedly, sustainment is ecologically complex, posing challenges to longitudinal measurement. Because research examining intervention sustainment is still accumulating, it is not yet known which factors at each ecological level are essential to capture the dynamic process of sustainment.

To this end, this review synthesizes the literature on intervention sustainment measures and evaluates the qualities of each measure. We review and extend the current work on sustainment definitions, frameworks, and measures. This work highlights the strengths and gaps in existing sustainment measures to inform recommendations for the development of pragmatic, valid, and reliable measures of sustainment.

Methods

Known repositories of implementation measures, the Society for Implementation Research Collaboration (SIRC) instrument review project (IRP) [31, 43] and the Dissemination and Implementation Grid-Enabled Measures database initiative (GEM-D&I) [44, 45], were searched for measures of sustainment (or similar terms, e.g., sustainability, maintenance). The SIRC-IRP aims to advance implementation science through measure development and evaluation. The IRP centers on the implementation outcomes framework put forth by Proctor and colleagues [10] and constructs outlined in the Consolidated Framework for Implementation Research (CFIR; Damschroder et al. [46]). The GEM-D&I is a project co-developed and introduced by the Cancer Research Network Cancer Communication Research Center at Kaiser Permanente Colorado and the National Cancer Institute’s (NCI) Division of Cancer Control & Population Sciences. The GEM-D&I database aims to create a growing and evolving community of users and a shared resource of standardized D&I measures that can produce comparable datasets and facilitate collaboration and comparison across disciplines, projects, content areas, and regions. The GEM-D&I currently has 132 measures inventoried by construct. For this review, we examined measures inventoried under sustainability.

In addition, we conducted a narrative review of studies purporting to measure sustainment to add to our catalogue of existing measures. Reference sections of literature reviews and of definitions of sustainment and sustainability were screened, and measures judged to be frequently used, comprehensive, and/or validated were extracted for analysis.

Data extracted from the measures included a description of the measure, the measure’s development process, respondent type(s), psychometric properties, timeframe examined, number of items and factors, scoring, and other considerations. The generalizability of each measure and any other limitations or concerns were also noted. For measures that were quantitative scales (as opposed to open-ended questions or telephone surveys) and that included ongoing delivery/use of an EBP and other dimensions of sustainment as an outcome, we also extracted the constructs of the scales. Extraction was performed inductively by two team members (JCM and AG). The extracted constructs were then collated under the components of our sustainment definition. Extraction continued until thematic saturation was reached, and results were confirmed by a third team member (KSD). Team members and authors included national and international implementation scientists, including several with expertise in pragmatic measurement and measure development.

Results

Eleven measures targeting or including sustainment were identified across the GEM-D&I, the SIRC-IRP, and the other sources described above. Three of the measures were found in the GEM-D&I database, three in the SIRC-IRP (one measure appeared in both databases), and six came from other sources. We briefly describe each of the measures and summarize them in Table 1.

Table 1 Summary of sustainment measures

The GEM-D&I had six measures indexed as targeting sustainability, three of which met criteria for inclusion in our review: the Reach Effectiveness Adoption Implementation Maintenance (RE-AIM) framework [18, 35, 47], the Level of Institutionalization Scale (LoIn) [12], and the Program Sustainability Assessment Tool (PSAT) [52]. The three other measures linked to sustainability in the overall GEM database were excluded for the following reasons: the COMMIT questionnaire [55], as it is specific to the COMMIT study; the Hospital Elder Life Program Sustainability Measure [56], as it is a qualitative interview guide rather than a quantitative measurement tool; and the Community-Based Participatory Research (CBPR) Model-Sustainability Measure [57], as it prospectively measures the likelihood of sustainability.

The SIRC-IRP had thirteen measures rated as measuring sustainability. Three measures were included in our review: the Evidence Based Practice Sustaining Telephone Survey [48], the Program Sustainability Assessment Tool [52], and the Program Sustainability Index [54]. The remaining ten were excluded: the School-wide Universal Behavior Sustainability Index-School Teams [58] and the Amodeo Counselor Maintenance Measure [59], as they were specific to a single intervention; the Change Process Capability Questionnaire [60], as it measures an organization’s capability for successful implementation rather than an outcome; the Eisen Provider Knowledge and Attitudes Survey [61], as it measures prospective intentions; the Knowledge Exchange Outcomes Tool [62], as it does not measure constructs of sustainment; the Organization Checklist [63], as it is unpublished; the Prevention Program Assessment [Maintenance Scales] [64] and the Sustainability and Spread Activities Questionnaire [65], as items were not available for review; the Sustainment Cost Survey [66], as it measures costs only; and the General Organizational checklist.

Clinical Sustainability Assessment Tool (CSAT) [53]

The CSAT measure was adapted from the PSAT based upon a literature review and expert-informed concept mapping. The resulting tool is a 35-item self-assessment that clinical staff and stakeholders complete to evaluate the sustainability capacity of a practice. The developers define clinical sustainability as “the ability of an organization to maintain structured clinical care practices over time and to evolve and adapt these practices in response to new information” [53]. Whereas the PSAT is intended to assess a wide range of public health programs, the CSAT is intended to assess the sustainability capacity of a clinical practice. The CSAT assesses seven domains: engaged staff and leadership, engaged stakeholders, organizational readiness, workflow integration, implementation and training, monitoring and evaluation, and outcomes and effectiveness. Respondents use a 7-point scale to rate the extent to which their practice is supported by these domains of processes and structures, which are hypothesized to increase the likelihood of sustainability. Information regarding the psychometric functioning of the CSAT is not yet available; the developers are in the “process of validating the tool” [53].

Program Sustainability Index (PSI) [54]

The PSI was developed to assess the attributes of sustainability among community programs. The measure was informed by a model of community-based program sustainability consisting of three cascading, linked domains: program sustainability elements (e.g., leadership, collaboration), middle-range program results (e.g., needs met, effective sustainability planning), and the ultimate result of program sustainability. Development was also informed by the results of mixed methods studies examining elements of community program sustainability. Comprising seven subscales, the PSI assesses both outer and inner context factors, including leadership competence, effective collaboration, understanding community needs and involvement, program evaluation, strategic funding, staff integration, and program responsivity. Two items from the Leadership Competence and Strategic Funding subscales mention a timeframe of at least 2 years for assessment by the PSI. The PSI is administered through qualitative interviews or as a web-based survey to multilevel administrators, and it has been used in child welfare settings, Veterans Affairs medical centers, and hospitals. Informed by factor analytic examination of measure structure and fit, the final measure comprises 29 items rated on a 4-point Likert scale. Cronbach’s alpha for the subscales ranged from .76 to .88, indicating acceptable to good internal consistency. Limitations include the knowledge of both inner and outer contexts required to complete the measure and its focus on sustainability rather than sustainment.
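For reference, Cronbach’s alpha, the internal consistency statistic reported for the PSI, compares the item variances with the variance of the total score for a scale of $k$ items:

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right)$$

where $\sigma^{2}_{Y_i}$ is the variance of item $i$ and $\sigma^{2}_{X}$ is the variance of the total score $X = Y_1 + \dots + Y_k$. By common convention, values of approximately .70 or above are read as acceptable and .80 or above as good, so the PSI’s range of .76 to .88 spans acceptable to good.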

Part 2: Sustainment measure constructs

Six measures were quantitative scales measuring sustainment and therefore had their constructs extracted and reviewed. Thematic extraction of the measures’ constructs inductively yielded 13 construct domains (see Table 2). In addition to continued delivery, three constructs relate to sustainability processes and benefits of the intervention: (i) monitoring (including fidelity and benefits), (ii) adaptation and improvement, and (iii) reflection and team involvement. Two constructs relate to the integration of the intervention: (i) institutionalization and (ii) routinization. Three constructs relate to the outer context (as defined by EPIS [5, 7]): (i) external support and communication, (ii) partnership, and (iii) financial resources and funding; outer context constructs were contained in five of the six measures. Three constructs relate to the inner context (as defined by EPIS [5, 7]): (i) leadership support and organizational climate, including innovation fit with the goals and direction of the organization, (ii) organizational capacity in terms of sufficient resources and workforce, and (iii) training; inner context, organizational constructs were contained in all measures. Finally, one construct encompassed the individual-level factors of staff support, behaviors, and attitudes toward the EBP/innovation; this inner context, individual-level construct was found in five of the six measures reviewed.

Table 2 Constructs of sustainment and sustainability measures

Discussion

To extend the evidence base on pragmatic sustainment measurement, we conducted a multistep narrative review of sustainment measures. Using two large implementation science measure databases (the GEM-D&I and the SIRC-IRP) and a supplemental narrative review of the literature, we identified 11 sustainment measures that met our review criteria of being frequently used, assessing sustainability/sustainment, and being judged comprehensive and/or validated.

The results of this review underscore the recently acknowledged need for measures that are both psychometrically strong and pragmatic [73]. Of the eleven measures meeting criteria, three were deemed time-intensive (e.g., containing 40 or more items) and/or complex to complete (e.g., suited to interview-style data collection). A related issue was that some measures required stakeholders at multiple levels, or knowledge spanning multiple levels, to complete. In general, the items in most measures were not well suited for frontline providers; instead, existing measures were best suited for, or required, high-level administrators or executives with knowledge of EPIS or CFIR outer context factors such as community partnerships, funding arrangements, contracting, or financial resources. The sustainment of interventions, and especially of discrete EBPs, relies on the inner context of health organizations and the service providers delivering care [74, 75]; however, the available sustainment measures primarily assess outer context influences. While critical for implementation, questions related to the outer context may be challenging for stakeholders delivering care to answer and can only be completed by a small subset of stakeholders (e.g., government administrators, organizational executives) [8]. Pragmatic measurement of sustainment should include measures from different perspectives, including measures that may be completed by inner context stakeholders such as direct service providers and supervisors, whose voices have not been fully represented.

The methodology for developing the reviewed measures varied widely. While some measures were based on sustainability frameworks (e.g., the PSAT was based on the capacity for sustainability framework [76]), others were less explicit about the items’ theoretical origins. Along with the range of sustainability definitions, there is a range of sustainability frameworks (e.g., the Dynamic Sustainability Framework [25]) and implementation frameworks that include sustainability (e.g., EPIS [5, 7], CFIR [46], PRISM [77]). Ideally, measures of sustainment and sustainability should map onto these frameworks [27]. Furthermore, there are a number of reviews of the factors influencing sustainability [8, 27, 78, 79], and it was encouraging to see our findings, extracted as the sustainment measure constructs (Table 2), broadly align with these reviews. Of note, while some constructs regarding sustainability processes, such as monitoring and adaptation and improvement, did appear in the reviewed measures, others, including planning, technical assistance, and navigating competing demands, did not arise. In addition, the structural characteristics of the organization and the characteristics of the intervention, implementer(s), and population were not explicitly included.

It is important to consider the length of time an intervention has continued to be delivered/used when measuring sustainment. Our results suggest that half of the examined measures did not specify the time period for measuring sustainment. Among the remaining measures, the time period varied from a 6-month follow-up (the original RE-AIM Framework Dimension Items Checklist) [47] to over 2 years (the Evidence Based Practice Sustaining Telephone Survey [48] and the PSI [54]). For one measure, the SIC [50], the sustainment period was variable and project specific, informed by the timing of implementation initiation and completion. Guidance on, or inclusion of, the timeframe of sustainment should be modeled on recommendations that measurement occur at least 12 months, and ideally 2 or more years, after implementation [8]. However, it is also possible to measure sustainment earlier in the implementation process to capture its dynamic nature and to serve as a formative tool for tailoring implementation strategies and planning for sustainment. This aligns with the extended RE-AIM framework’s focus on “evolvability” across the life cycle of implementation, with the goal of contributing to sustainable and equitable health impact rather than treating sustainment as a set end point [35].

While the majority of measures were validated questionnaires, one measure from the GEM-D&I (the RE-AIM Maintenance Measure) has an open-ended structure to examine the individual- and organizational-level long-term effects of a program on outcomes. Combining such a qualitatively oriented measure with a brief, validated questionnaire may be a method to consider for research or practice-based projects with greater resources. Further, of the 11 measures identified, none was deemed applicable across settings while remaining tailorable to particular EBPs. Such a measure is greatly needed given the growing number of efforts involving the implementation of multiple EBPs concurrently or sequentially [74, 80, 81]. While continued delivery and adaptations over time may require an intervention-specific sustainment measure, we feel a generic sustainment measure that captures the broad constructs of sustainment would assist in creating generalizable evidence within implementation science. Sustainment measure development may be aided by the timely questions posed in the extension of the RE-AIM framework [35] that focus on health equity and dynamic context (e.g., Are the determinants of sustainment the same across low-resource and high-resource settings?).

There are challenges in rating the pragmatic qualities of implementation science measures. We applied the guidance from the recommended criteria for pragmatic measures by Glasgow and Riley [73] and the Psychometric And Pragmatic Evidence Rating Scale (PAPERS) [82]. However, we found that the former was better suited to evaluating patient/client-directed measures, while the latter required data that were not available to us to complete ratings (e.g., the level and types of stakeholder involvement in measure development).

The review highlighted that there currently appears to be no pragmatic, psychometrically strong measure of sustainment that can be easily completed by inner context providers. For good reason, the reviewed measures contain both inner and outer context factors, and the majority require knowledge of external or bridging factors such as communication and partnerships with community members and politicians, and securing funding. The requirement of multilevel knowledge creates issues for respondents, and we feel separate scales completed by stakeholders at different levels within the context have a place within implementation science. The review of the constructs provides guidance on the key constructs for inclusion in a pragmatic inner context measure of sustainment (Table 2). In summary, we believe a measure of inner context sustainment provides an important perspective in measuring sustainment. Such a measure could be used in combination with intervention-specific assessment of core component continuation (e.g., sustained fidelity) and adaptation, measures of intervention outcomes (e.g., patient or provider benefits), and measures of outer context sustainment (e.g., funding stability).

Limitations

Our review was not intended to be a systematic or scoping review of all sustainment measures. Rather, our purpose was to identify the currently available and accessible measures of sustainment for implementation researchers and practitioners. In doing so, we have identified several research considerations to advance pragmatic and psychometrically robust measurement of sustainment.

Conclusion

There is a lack of pragmatic and psychometrically sound measures that can be completed by implementation stakeholders within inner context settings (e.g., frontline providers, supervisors). Further, only a limited number of measures specifically address sustainment as opposed to the related construct of sustainability. Among these, current measures of sustainment are specific to particular settings or interventions, focus on outer context factors, and may be complicated for stakeholders to complete because of the outer context knowledge required. Our next step is to address this need with a pragmatic measure of sustainment for inner context influencers that focuses on the integration and support perceived at the provider level.