Introduction

The concept of resilience has evolved from studies of the natural environment, in order to better understand development and operation of human-made systems [1]. Resilience has become popular recently and multiple fields of practice use the term. The management literature (see [2]) refers to the resilience of individuals, groups, and organizations. The psychology and sociology literature (see [3]) uses resilience to refer to an individual’s ability to cope with traumatic events. Political science (see [4]) describes resilience as the ability of groups and communities to withstand challenges that range from natural to human-made disasters. Civil engineering (see [5]) views resilience as the ability of structures and infrastructure to withstand challenges such as earthquakes. Ecosystem science (see [6]) uses resilience to describe the effect of challenges to natural habitat stability. The safety literature describes resilience as “the intrinsic ability of a system to adjust its functioning prior to, during, or following changes and disturbances, so that it can sustain required operations under both expected and unexpected conditions” [7] and preparation makes it possible “to manage the unexpected” [8]. This paper describes resilience, the practice of resilience engineering (RE), and their role in pediatric care from the safety perspective.

PICU care setting

The pediatric intensive care unit (PICU) can be accurately described as a “socio-technical” system [9] that brings together facilities, equipment, supplies, and people from care providers to support staff and family members in widely variable configurations. Over time, changes occur to system elements, conditions, goals, capabilities, and more. A system such as the PICU has to adapt to those challenges in order to remain reliable, efficient, and safe. Whether it can adapt and how to best adapt both rely on understanding the system and its context. This requires that PICU clinicians and supervisors understand their unit’s abilities, as well as influences on and challenges to its performance.

Methods have been designed to study socio-technical systems such as the PICU. The Naturalistic Decision Making (NDM) approach to research uses a range of methods in order to understand individual and team cognitive performance in actual (not laboratory) settings and learn how they make decisions. Three criteria have been used to describe research that counts as NDM study: It focuses on expertise, takes place in field settings, and reflects the conditions that complicate our lives (e.g., uncertainty). NDM research sympathetically inquires how individuals or teams and organizations make certain types of decisions in natural settings, asking “How, or why, do they do that?” The answers reveal new information, can be used to develop new knowledge, and can lead to the development of new theories about human behavior in complex work domains [10].

Many traits of the pediatric critical care setting reflect Weick’s [11] criteria for study using NDM methods. Time is pressured, which requires interactions among many elements to be synchronized within short periods. Stakes are high, because critical care outcomes can have a significant effect on morbidity and mortality. The skill and judgment needed to perform pediatric care mean that practitioners are experienced. Information is inadequate because no single person has all of the knowledge that is needed. Goals are ill defined and sometimes in conflict as a result of the multiple agendas that various staff members pursue. Care activity occurs at the sharp (clinician) end rather than the blunt (management) end, which makes it a setting that has a rich context. Conditions are constantly dynamic, resulting from emergencies, cancelations, patients who are unprepared or absent, and variation in demand (such as the type and volume of needed care), supply (including qualified staff), and the kinds of procedures that are needed. Team coordination is essential because multiple roles and specialties, from nurses to technicians, physicians, and clerical staff, must come together to plan and interact to perform services. Technical aspects also must come together at specific times in a certain state of readiness, including staff, facilities, equipment, procedures, and patients and their families. The growth of this complex, nonlinear, interdependent, emergent system requires a different approach to understanding it than traditional methods such as root cause analysis.

Resilience engineering and PICU safety

Hollnagel [12] directs attention away from what he refers to as “Safety I” habits such as blame-and-train and error detection/mitigation. Instead, his “Safety II” approach involves appreciating what goes well in spite of challenges, and building on success to protect against future challenges. Others [13] suggest that patient safety difficulties may lie “more in the lack of an agreed upon, commonly understood set of core competencies (knowledge, skills, and attitudes)” and recommend training to ensure that each person who performs in a patient safety role is a “professional trained in the competencies of patient safety and quality to lead the patient safety agenda within organizations and systems.”

Scanlon and Karsh [14] contend that it is people who create resilience in systems, by acting as the one element able to sense problems and to anticipate and put solutions in place to avert or mitigate them. A setting such as a PICU relies on the expertise of clinicians and staff to know what options might be feasible when providing a service that is as complex as pediatric intensive care. Making multiple options available before they are needed makes it more likely they will be available when they are needed [15]. This implies that experienced staff must be available, so that they can have a ready sense of what initiatives would be most effective to engage unforeseen challenges when they occur. Insights such as these can help to appreciate the risks facing an understaffed unit with a large, high-risk patient census that relies on a lone agency nurse on the night shift.

Methods

Methods to study human performance in field settings are part of the NDM approach referred to as cognitive systems engineering (CSE) [16]. As an example of using CSE methods, Nemeth et al. [17•] explain how observation, structured and semi-structured interviews, and artifact analysis led to understanding Burn ICU cognitive work and work practices.

Hollnagel has developed two methods that are specific to resilience engineering (RE): the Resilience Analysis Grid (RAG) and the Functional Resonance Analysis Method (FRAM). The RAG’s purpose is “to provide a well-defined characterization (or profile) of a system that can be used to manage the system and specifically to develop its potential for resilient performance.” Hollnagel [18] invites the user to consider a series of questions for each of four system abilities: respond, monitor, learn, and anticipate. Horsley et al. [19••] reported on the results of using the RAG as a “team resilience” approach to improve the quality of care in a Critical Care Complex. They found that using the method increased the team’s ability to anticipate (e.g., handover discussions of unanticipated constraints/demands), monitor (e.g., being explicit about thresholds for action in an acute event), respond (e.g., creating a team structure that allows all team members to help solve the problem), and learn (e.g., debriefing of all clinical events, even when they went well). One advantage of using the RAG is that it is comparatively simple. This makes it accessible for those who do not have the time to learn a more complex method. However, that simplicity also requires those who use it to already understand what RE is and how to study systems.
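To make the idea of a RAG "profile" concrete, the following is a minimal sketch of how ratings for the four abilities might be summarized. The 0 to 5 rating scale, the averaging step, the function name rag_profile, and the example ratings are all assumptions made for illustration; they are not taken from Hollnagel [18] or from Horsley et al. [19••].

```python
from statistics import mean

# The four system abilities named in the RAG.
ABILITIES = ("respond", "monitor", "learn", "anticipate")


def rag_profile(ratings):
    """Summarize question ratings into one score per ability.

    `ratings` maps each ability to a list of numeric ratings, one per
    question. The 0-5 scale and the use of a simple mean are assumptions
    for this sketch, not a prescribed RAG scoring rule.
    """
    missing = [a for a in ABILITIES if not ratings.get(a)]
    if missing:
        raise ValueError(f"no ratings for: {', '.join(missing)}")
    return {a: round(mean(ratings[a]), 2) for a in ABILITIES}


# Invented ratings for a hypothetical unit (illustration only).
example_ratings = {
    "respond": [4, 3, 4],
    "monitor": [2, 3, 2],
    "learn": [3, 3, 4],
    "anticipate": [1, 2, 2],
}

profile = rag_profile(example_ratings)
weakest = min(profile, key=profile.get)  # ability with the lowest score
```

A profile of this kind is only a starting point for discussion. Repeating the same questions over time would show whether an ability is strengthening or eroding.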

The FRAM [20] can be used to analyze how something happens and to build a model of elements and their interactions that represents the functions needed for an activity to occur. FRAM evaluation of how the components interact describes how each depends on the others so that things do (most of the time) go well or (occasionally) do not. It also shows their interactions according to six aspects: input, output, preconditions, resources, time, and control. The result can be used to understand what a system does and where variability may affect outcomes, and to decide how to manage it. A FRAM model can be used to study situations in the past (How did it happen?), present (How is work done?), and future (How to do it?). For example, a FRAM can assess how a surgical procedure is performed, including functions from “initiate surgical procedure” to “complete surgical procedure.” In one instance, output varied among all nine functions, leading to the unwanted outcome of a surgical sponge being left in the patient’s abdomen. Rafie et al. [21] report their use of FRAM as part of a quality improvement (QI) program to better detect and treat pediatric sepsis. Their team developed a pediatric sepsis model, identified the time, control, precondition, and resource aspects, and applied FRAM models to six cases. Results of their effort revealed sources of variability, making it possible to target and improve specific aspects of how they delivered pediatric sepsis care.
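The structure of a FRAM model can be illustrated with a small sketch in which each function carries the six aspects and a coupling exists wherever one function's output matches an aspect of another. The class FramFunction, the helper couplings, and the four sepsis-related functions are hypothetical and invented for illustration; they are not the model that Rafie et al. [21] developed.

```python
from dataclasses import dataclass, field

# The five aspects that an upstream output can couple to.
# Output itself is the sixth aspect.
DOWNSTREAM_ASPECTS = ("input", "precondition", "resource", "time", "control")


@dataclass
class FramFunction:
    """One function in a FRAM model, described by its six aspects."""
    name: str
    output: set = field(default_factory=set)
    input: set = field(default_factory=set)
    precondition: set = field(default_factory=set)
    resource: set = field(default_factory=set)
    time: set = field(default_factory=set)
    control: set = field(default_factory=set)


def couplings(functions):
    """List (upstream, label, downstream, aspect) for every matching label."""
    links = []
    for up in functions:
        for down in functions:
            if up is down:
                continue
            for aspect in DOWNSTREAM_ASPECTS:
                for label in sorted(up.output & getattr(down, aspect)):
                    links.append((up.name, label, down.name, aspect))
    return links


# Hypothetical functions and labels (illustration only).
model = [
    FramFunction("Recognize possible sepsis", output={"sepsis alert"}),
    FramFunction("Establish IV access", input={"sepsis alert"},
                 output={"IV access"}),
    FramFunction("Order antibiotics", input={"sepsis alert"},
                 output={"antibiotic order"}),
    FramFunction("Administer antibiotics", input={"antibiotic order"},
                 precondition={"IV access"},
                 time={"target treatment window"}),
]

links = couplings(model)
for upstream, label, downstream, aspect in links:
    print(f"{upstream} --[{label}]--> {downstream} ({aspect})")
```

In this toy model, variability in the output of "Establish IV access" would propagate to "Administer antibiotics" through its precondition, which is the kind of dependency a FRAM analysis is meant to expose.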

Case study: PICU

Cook et al. [22] described the phenomenon of “bumping” patients from the ICU as a way to make beds available for sicker patients. While the practice was widely acknowledged by clinicians, no data had been collected to demonstrate how it occurred. The author (CN) collected data on a 13-bed pediatric ICU and 20-bed step-down unit to determine whether the practice occurred and, if it did, how. Each day over a month, he visited the PICU, noting which beds were occupied, which patients were on IV medications, vasopressors, or a ventilator, and where discharged patients had been transferred. Figure 1 shows a section of the results from the month-long study.

Fig. 1

Portion of 1-month PICU bed management study. Copyright © 2004 Cognitive Technologies Laboratory. Used by permission.

Two-letter codes indicate patient source (e.g., OR = operating room) or destination (e.g., FL = patient floor). Symbols used for each bed indicate patient acuity. A number shows how many intravenous medications the patient is receiving. A thick border indicates a cardiac patient, who is likely receiving vasopressors. Rounded corners indicate the patient is on a ventilator. Patients on a ventilator and receiving many IV medications would be considered to be the sickest on the unit, while those with no ventilator or cardiac indication and a low IV number would be considered to be in the best relative health. The complete diagram shows the entire month-long span of PICU and step-down unit occupancy and transfers into and out of the unit. Patients who become healthier do so one by one. If clinicians were making transfer decisions based on individual patients’ condition, their transfers would not be clustered. If Cook’s “bumping” premise were correct, the diagram should show these healthier patients being transferred in clusters in order to make way for sicker patients. This clustering would likely occur early in each week, when patients would be expected to arrive in the PICU after undergoing surgery. The diagram does show clusters of transfers, particularly early in the week. This is evidence that PICU management anticipated change in demand volume, monitored patients who could be transferred, and responded to the increased demand by making deliberate transfers.
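The reasoning about clustered transfers can be expressed as a simple check on a log of transfer dates. The sketch below is not the analysis used in the study, which relied on reading the diagram. The function names, the threshold of three transfers per day, and the dates are all invented for illustration and are not study data.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")


def cluster_days(transfer_dates, threshold=3):
    """Return the dates on which at least `threshold` transfers occurred.

    The threshold is an assumption for this sketch, not a value from the study.
    """
    counts = Counter(transfer_dates)
    return sorted(day for day, n in counts.items() if n >= threshold)


def transfers_by_weekday(transfer_dates):
    """Count transfers per weekday, omitting weekdays with none."""
    counts = Counter(WEEKDAYS[d.weekday()] for d in transfer_dates)
    return {name: counts[name] for name in WEEKDAYS if counts[name]}


# Invented transfer log: one entry per patient transferred out of the unit.
invented_transfers = (
    [date(2024, 1, 1)] * 3      # Monday
    + [date(2024, 1, 3)]        # Wednesday
    + [date(2024, 1, 5)]        # Friday
    + [date(2024, 1, 8)] * 4    # Monday
    + [date(2024, 1, 11)]       # Thursday
)

clusters = cluster_days(invented_transfers)
by_weekday = transfers_by_weekday(invented_transfers)
```

If transfers depended only on each patient's recovery, counts would be expected to spread across the week. A concentration on particular days, as in this invented log, is the pattern that the bumping premise predicts.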

Case study: hand-offs

Pediatric care relies on effective collaboration among many clinical specialties, including how hand-offs are used to convey information about PICU patients. Understanding how hand-offs occur makes it more likely that such exchanges will be of best benefit to pediatric patients under widely varying circumstances. Nemeth et al. [23, 24] studied 12 between-shift hand-offs at the research site between pediatric fellows who were responsible for PICU management. Accepted wisdom in healthcare information technology (IT) at the time assumed patient hand-offs consisted of a static set of data that could simply be transferred. Instead, we found that the fellows used their expertise to tailor their exchanges of information and insights in response to a number of conditions.

Hand-offs are used to address both what is, and is not, known about a patient’s condition and to assess expectations for the oncoming shift. PICU hand-off communication skill depends on a clinician’s ability to set priorities about what information is relevant and to effectively transfer their insights. Uncertainty about patient condition influences hand-off content and form. The ability to accomplish a relevant hand-off efficiently affects care provision at multiple levels and across specialties, from intensivists to nurses and technicians. As important as they are to effective care, sign-outs are not taught but are instead learned on the job. Both care providers and patients are likely to benefit from more formalized attention to how sign-outs are conducted.

Using content and form, the study described how fellows conducted their hand-offs on three different days on the same unit. Hand-off content indicates the relative length of time spent discussing each patient, and any side discussions such as socializing or unit resources. Hand-off form indicates how the conversation switched from a monolog (typically by the off-going fellow) with interjections (typically by the on-coming fellow) to colloquy (dialog shared between the off-going and on-coming fellows). The Saturday discussion about a patient (bed 5) lasted the longest due to uncertainty about that patient’s condition. The fellows concluded Wednesday’s exchange quickly when they noticed grand rounds were about to start. These exchanges showed that the fellows who conducted them were highly sensitive to context and used compact gestures and references and stylized expressions. They allocated time spent discussing individual patients according to the perceived severity and stability of each patient’s condition. They also adjusted the time spent on hand-offs based on workload factors such as impending rounds or procedures. This example shows how fellows created resilient responses to variability in patient condition and workload.

Discussion

Complex systems such as the PICU consist of many interdependent elements that are difficult to accurately understand.

While pediatric care relies on norms, the care settings, care type and volume, care providers, and external influences all conspire to make each PICU unique. Circumstances in the unit change, sometimes in ways that are not evident. Care providers adapt in many ways in order to ensure their goals are met. The ways that circumstances change and the decisions clinicians and support staff make to adapt to them create outcomes that are usually successful and on some occasions not. Close attention to work as it is actually done can reveal much more than the way it is expected to be performed. The examples described above show how pediatric care clinicians adapt. These well-grounded studies of everyday operations make “the limitations and opportunities of the system design, organizational structures, procedures, and training” evident and make it possible to understand what enables a system to adapt [25].

The practical implication for pediatric care providers is: understand your pediatric care setting. Understanding can come from reflective practice and methodical research. What Schön [26] terms “reflection-in-action” is consistent sensitivity to what one does and the results it produces that “can surface and criticize the tacit understandings that have grown up around the repetitive experiences of a specialized practice, and can make new sense of the situations of uncertainty or uniqueness which he may allow [him/herself] to practice.” Cook [27] suggests that an awareness of resilient performance when it occurs can protect a system from compromising its ability to adapt (becoming “brittle”). “Pre-accident hints that a system is becoming more brittle are often discounted or rationalized because of the benefits, e.g., speed in conducting tasks, that brittleness provides. The resulting tension between the desire for resilience and the gains derived from brittleness is the basis for resilience engineering.” He goes on to provide ten signs of resilience in action:

  • Recognizing altered situations

  • Anticipating possible trajectories

  • Assessing consequences, probabilities, and significances

  • Creating and deploying buffers and reserves

  • Hedging against high-loss outcomes

  • Mobilizing and directing resources

  • Sacrificing lower level goals

  • Switching tactics in escalating settings

  • Balancing recovery and rescue

  • Restoring capacity

Using the NDM and RE methods that this paper has described can also make it possible to learn what goes well and what circumstances may conspire to put the care setting at risk. Human performance and the real world of work in high-hazard settings are difficult to learn about [28]. That is the reason to study them using methods that are suited to a socio-technical system, such as those this paper has mentioned.

Summary

A well-grounded understanding of how care is actually provided will make it much more likely that pediatric clinicians and support staff can perform resiliently through successful anticipation and adaptation to unexpected demands.