1 Introduction

Uncertainty in health and altruistic behavior by doctors are two fundamental aspects of the organization of health care (Arrow 1963). Individuals are uncertain about the incidence of and recovery from diseases and illnesses, and about the availability of treatments and their effectiveness for health improvements. Naturally, physicians have greater knowledge about the alternative treatments and their consequences for recovery. This information asymmetry between patients and physicians is a traditional principal–agent problem. Physicians will to a large extent, if not fully, influence the quantity and quality of the health care provided, and their effort is influenced by their preferences for own income as well as altruism toward the patient, which are together affected by the prevailing payment system (see, e.g., McGuire 2000, for a general overview of payment systems). Due to budget constraints, a central issue in the provision of health care is allocation. Many countries have implemented explicit prioritization rules based on general ethical principles, such as equal access, cost and clinical effectiveness, and severity-graded valuation of health gains. Physicians are then incentivized to provide medical treatment that is in each patient’s best interest given these rules. As physicians are the gatekeepers to the health care system, it is important for policy makers to understand how physicians trade off own income and altruism, and how their behavior corresponds to general principles for priority setting under different payment systems and in the presence of uncertainty about health outcomes. Our objective is to investigate whether and to what extent a physician’s altruistic behavior depends on the payment system and on uncertainty in health outcomes. To this end, we conduct an experiment building on the novel framework introduced by Hennig-Schmidt et al. (2011).
We contribute to filling important gaps in the literature concerning the heterogeneity of physician altruism and the effects of introducing risk and ambiguity in health outcomes for patients.Footnote 1

Existing evidence concerning the heterogeneity of physician altruism is relatively scarce.Footnote 2 Hennig-Schmidt et al. (2011) found that altruism was an important component of physicians’ choices, and Godager and Wiesen (2013) and Brosig-Koch et al. (2016, 2017) showed that the level of altruism varied across both physicians and patients. We extend these results both conceptually and empirically by categorizing physicians according to how well their treatment decisions align with three different ethical principles for priority setting: severity of illness, capacity to benefit, and ex post equality. We find that many physicians are altruistic toward their patients and that the degree of altruism varies across patients with different medical needs. Interestingly, the type classification of physicians based on ethical principles is unaffected by payment system; the common categorization is that physician altruism is guided by severity of illness, both under capitation and fee-for-service. We replicate the previous finding that patients are undertreated in capitation payment systems and overtreated in fee-for-service systems, and we show that this result is independent of whether patient’s health benefit is deterministic, risky or ambiguous. This is our main contribution and a robustness result for the particular methodology introduced by Hennig-Schmidt et al. (2011) indicating that results from previous studies relying solely on deterministic patient’s health benefits can be extended to risky and ambiguous situations. In more detailed analyses, we find substantial individual heterogeneity in physicians’ responses to the introduction of risk and ambiguity in patient health. This heterogeneity partly depends on payment system because there is a significant correlation between physicians’ behavior and their generic risk and ambiguity preferences under capitation but not under fee-for-service.

2 Background

Medical costs are increasing over time, largely driven by an aging population and rapid technological growth in the health care sector (e.g., Newhouse 1992; Cutler 2002). This poses a challenge for health care systems around the world and raises important questions regarding how to organize the health care system, with efficiency and priority setting often being the most important issues to consider (Culyer and Wagstaff 1993; Zweifel et al. 2009). A central element is, of course, that physicians provide medical treatment that is in patients’ best interest, given existing rules and regulations for the health care sector. Thus, an important question is how to design physician payment systems to achieve the best possible outcome for patients. The optimal design of these payment systems will naturally depend on how physicians trade off own income and altruism when making their decisions (e.g., Woodward and Warren-Boulton 1984; Ellis and McGuire 1986; Chalkley and Malcomson 1998; Ma and Riordan 2002; Galizzi et al. 2015).

One common way to understand physicians’ behavior is based on Ellis and McGuire (1986), who model physicians as motivated by a combination of altruism and a preference for own income. In this context, ‘altruism’ (toward the patient) captures the extent to which a physician values the patient’s health. Under capitation, the physician receives a fixed payment related to the people she is assumed to be responsible for, and then makes an effort to improve their health. Thus, the emphasis is on patient health and the altruism of physicians, who must give up own income to improve the health of the patients. In contrast, under fee-for-service, physicians are paid for each unit of health care provided. Thus, these two systems engage the physician’s preference for own income differently. In the extreme case of a physician motivated exclusively by own income maximization, the two systems result in completely different amounts of treatment. Under capitation, there will not be any treatment and hence the patient will be undertreated, while under fee-for-service the patient will receive the maximum treatment, or treatment up to the point where the marginal disutility of effort equals the marginal utility of the income generated by that unit of effort for the physician. In a fee-for-service system, this could result in harmful treatment levels, which relates to the concept of supplier-induced demand (e.g., Fuchs 1978; Dranove and Wehner 1994; Gruber and Owings 1996). In the other extreme case, where the physician is motivated only by altruism, both payment systems result in optimal provision of health care for the patient. More likely, however, we have intermediate cases where the physician derives utility both from own income and from altruism, as discussed by, for example, Arrow (1963). The relative strength of physicians’ preference for own income in relation to their altruism influences which of the two remuneration systems is preferred from a societal perspective.
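One simple way to write down this trade-off, offered here as an illustrative assumption rather than as the exact specification in Ellis and McGuire (1986), is an additive utility function in which \( \pi \left( q \right) \) denotes the physician’s income from providing quantity \( q \), \( B\left( q \right) \) denotes the patient’s health benefit, and \( \alpha \ge 0 \) is a hypothetical altruism weight:

\[ U\left( q \right) = \pi \left( q \right) + \alpha B\left( q \right) \]

With \( \alpha = 0 \), the physician simply maximizes own income, which corresponds to the first extreme case above. As \( \alpha \) grows large, the chosen quantity approaches the level that maximizes \( B\left( q \right) \) under either payment system. Intermediate values of \( \alpha \) correspond to the intermediate cases.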

Despite the increased reliance on monetary incentives in health care, little is known empirically about the effects on physicians’ decision making at the patient level. A key reason is the difficulty of finding exogenous variation in physician payment systems using naturally occurring data. Field experiments are very challenging to conduct, not least from an ethical perspective since the physical health of the participants would be affected (without their consent), and consequently only a few studies exist.Footnote 3

An emerging literature in health economics has used laboratory experiments to facilitate detailed investigation of the causal impact of payment system on physician behavior. Hennig-Schmidt et al. (2011) introduced a novel framework where subjects take on the role of physicians and decide on the provision of medical care for different types of patients, who are identical in all respects other than the degree to which a given level of medical treatment affects their health. One of the key benefits of using a lab experiment is that there exists a theoretically known and unique optimal quantity of medical care that maximizes the health benefit in each patient, and furthermore that it facilitates ceteris paribus variation in physician payment systems in a controlled environment. A robust finding in this experimental literature is that physicians provide more medical care under fee-for-service than under capitation, to the extent that patients are overtreated in fee-for-service and undertreated in capitation (Hennig-Schmidt et al. 2011; Brosig-Koch et al. 2013; Keser et al. 2014; Hennig-Schmidt and Wiesen 2014; Brosig-Koch et al. 2015, 2017). Moreover, Brosig-Koch et al. (2013) and Brosig-Koch et al. (2015, 2017) investigate whether physicians provide better medical treatment under capitation or fee-for-service, and capitation seems to be the marginally better system in this respect. Using the data in Hennig-Schmidt et al. (2011), Godager and Wiesen (2013) estimate the relative weight attached to own income and altruism. They find that most subjects assign a weight to both components, but there is large heterogeneity in the weights assigned. An interesting possibility concerning physician altruism is that it might also vary across patients with different illness conditions both in terms of severity and benefit from treatments, as discussed by Brosig-Koch et al. (2017).

Physicians might be more altruistic toward patients with more severe illnesses, or they might be more altruistic toward patients with a high capacity to benefit from treatment. These are two different principles for allocation of health care based on medical need: The capacity to benefit principle implies a goal to maximize the overall level of health in the population whereas the severity of illness principle implies that health maximization should be weighted by medical condition (e.g., Dolan and Olsen 2001; Dolan et al. 2005). In practice, capacity to benefit has been the overarching goal in for example the UK, using maximization of quality-adjusted life years (QALYs) as an explicit criterion, whereas severity of illness has been important in Norwegian, Swedish, and German health policy (Shah 2009). How physicians’ behavior corresponds to different principles for priority setting, therefore, has implications for how well the quality and quantity of health care provided correspond with the existing rules and guidelines established for the health care sector. Different principles for allocation according to medical need yield different behavior, which can be traced back to different rules for priority setting (Culyer and Wagstaff 1993; Olsen 1997; Cookson and Dolan 2000). Based on the taxonomy in Williams and Cookson (2000), we can identify three general principles for provision according to medical needs (two of which have already been mentioned). First, the severity of illness principle gives precedence to the patient with the worst no-treatment profile, i.e., the one with the worst outlook in the absence of treatment. Justifications for this principle can be made on grounds of equality (e.g., Williams 1962) and on a “rule of rescue” (Cookson and Dolan 2000). Second, a capacity to benefit principle places the greatest weight on patients who will benefit the most from treatment, justified on grounds of maximizing the overall level of population health. 
Finally, one could aim to equalize ex post health, justified by for example the “fair innings” argument, which stresses everyone’s right to a similarly normal span of life years in good health (Williams 1997).

For a systematic and detailed investigation of the influence of patients’ medical needs on physician decision making, an experimental approach is suitable since it facilitates controlled ceteris paribus variation in patient characteristics and medical conditions, something that would be extremely difficult to accomplish using naturally occurring data (see, e.g., Falk and Heckman 2009 for a general discussion on causality and experimental and survey methods). Based on an extension of Hennig-Schmidt et al. (2011), we introduce a type classification of physicians in terms of conditional altruism, capturing how well their treatment decisions in the experiment align with the different principles for priority setting discussed above.

Uncertainty is prevalent in almost every situation involving clinical decision making, since it is difficult to pin down the exact probability of a successful treatment outcome due to, for example, the complexity of many illnesses and expert disagreement regarding the effect of many treatments (for a discussion, see, e.g., Berger et al. 2013). The health outcome for the patient can differ across conditions and interventions even when the expected health gain for a single patient is quite similar. In some rare cases, the outcome for the patient is deterministic, i.e., it is known for sure, but in other cases there is a risk in the outcome with known probabilities, for example, based on a large sample of previous treatment outcomes. In other cases, however, there is ambiguity in treatment outcomes, meaning that the probability of a success of a treatment is unknown. Previous experimental research has shown that individuals make different choices under risk than under ambiguity, and that they tend to prefer situations characterized by risk rather than ambiguity even when one takes their subjective beliefs about probabilities into account (Ellsberg 1961; for a review see, e.g., Camerer and Weber 1992). Previous studies that investigated risk taking on behalf of others using neutral or non-medical frames have indicated mixed results compared to risk taking for oneself (e.g., Andersson et al. 2016; Chakravarty et al. 2011; Heinrich and Mayrhofer 2018; Vieider et al. 2016), while similar comparison for decisions under ambiguity is scarce (König-Kersting and Trautmann 2016). Arrieta et al. (2017) investigated attitudes toward risk taking for others in different medical contexts and found that subjects on average were risk averse but also that the magnitude of risk aversion varied across different contexts. 
Since previous experimental studies on physicians’ decision making abstract away from risk and ambiguity in the effect of medical treatment on patient’s health, we introduce these features in our experiment as a robustness check for the particular methodology developed by Hennig-Schmidt et al. (2011). This allows us to investigate if uncertainty could affect physician behavior and altruism toward the patient.

3 Experiment

The experimental design builds on the framework introduced by Hennig-Schmidt et al. (2011) and extended by Brosig-Koch et al. (2016). We stay close to the latter version. Subjects in the role of physicians decide on the quantity of medical treatment \( q \in \left\{ {0,1, \ldots , 10} \right\} \) for different types of patients. Physicians incur (convex) costs when treating a patient and they also receive a payment, either a fixed sum up front (capitation) or a sum that varies with the level of medical treatment they provide (fee-for-service).

Physicians met patients with five different health profiles (A–E), each characterized by a health benefit function mapping the amount of medical treatment (q) into health benefit. Figure 1 displays the benefit functions for the health profiles used in the experiment. For instance, if the physician gives a patient with health profile A five units of medical treatment (q = 5), the resulting patient’s health benefit is eight Taler (the experimental currency used, later translated into euros at a 1:1 conversion rate). For each patient, there is a unique interior level of medical treatment where the health benefit is maximized, and in either payment system the physician is faced with a trade-off between patient health and own payoff maximization. Under capitation, physician profit (\( \pi \)) is given by \( \pi \left( q \right) = R - c\left( q \right) \), where R is the capitation payment set to \( R = 10 \) and \( c\left( q \right) = q^{2} /10 \) is the convex cost function. Under fee-for-service, physician profit is instead given by \( \pi \left( q \right) = 2q - q^{2} /10 \), where \( 2q \) is the fee-for-service payment. Experiment parameters are, thus, set such that a selfish physician would choose q = 0 under capitation and q = 10 under fee-for-service. A physician who is also motivated by altruism would choose closer to the level of treatment that is optimal for the patient, and if sufficiently altruistic would provide precisely at the optimum. To facilitate comparison across capitation and fee-for-service, the absolute values of marginal profits and the maximum profits are equivalent under both systems. Moreover, the benefit functions are symmetric, which is a simplification of reality but ensures that the marginal effects (for the patient) of undertreatment and overtreatment are identical (Brosig-Koch et al. 2017).
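The profit functions above can be tabulated directly. The following is a minimal sketch for illustration only (it is not the software used in the experiment, and the function and variable names are our own labels). It checks that a purely profit-maximizing physician chooses q = 0 under capitation and q = 10 under fee-for-service, and that maximum profits coincide.

```python
# Sketch of the physician profit functions; names are illustrative labels.
Q_LEVELS = range(0, 11)  # q in {0, 1, ..., 10}
R = 10                   # capitation payment


def cost(q):
    """Convex treatment cost c(q) = q^2 / 10."""
    return q ** 2 / 10


def profit_cap(q):
    """Capitation: pi(q) = R - c(q)."""
    return R - cost(q)


def profit_ffs(q):
    """Fee-for-service: pi(q) = 2q - c(q)."""
    return 2 * q - cost(q)


# Quantity chosen by a physician who cares only about own profit.
selfish_cap = max(Q_LEVELS, key=profit_cap)
selfish_ffs = max(Q_LEVELS, key=profit_ffs)

for q in Q_LEVELS:
    print(q, profit_cap(q), profit_ffs(q))
print("profit-maximizing q:", selfish_cap, "(CAP),", selfish_ffs, "(FFS)")
```

The two schedules are mirror images of each other: profit under fee-for-service at 10 - q equals profit under capitation at q, which is one way to see why the absolute marginal profits and the maximum profits match across systems.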

Fig. 1 Patient benefit functions (medical treatment (q) on the horizontal axes and health benefit in monetary terms on the vertical axes). Note: the health benefit (on the vertical axes) for patients is expressed in Taler, the experimental currency that was translated into euros at the end of the experiment

There were no real patients present in the lab, but the amounts resulting from physicians’ decisions were transferred to a charity funding medical treatments. Thus, the health benefits in the experiment were expressed in monetary terms, denominated in Taler. These features were key parts of Hennig-Schmidt et al. (2011) and have since been adopted by subsequent studies building on that design. The donation is crucial since it means that decisions taken for (abstract) patients in the experiment have consequences for real patients outside the lab. To assure participants that the donations would in fact be made after the experiment, we informed them that receipts of each transfer would be published on the experimental laboratory’s blackboard on a given date.Footnote 4

To meet our research goals, we extend the design in Brosig-Koch et al. (2016) along two dimensions: (i) allowing for explicit tests of principles for priority setting and (ii) uncertainty. First, our design enables us to investigate physician behavior and altruism toward the patients based on systematic pairwise comparison of different types of benefit functions. Properties of the benefit functions capture how patients differ from each other in terms of medical need, i.e., with respect to severity of health problem (intercept), capacity to benefit from medical treatment (slope), and optimal level of medical treatment. Health profiles C and E are carefully designed to facilitate pairwise comparison in this respect, and we develop a classification of physicians based on how well their treatment decisions for C vs. E align with different principles for priority setting.Footnote 5 These profiles are similar in that the health benefit is maximized at the same level of medical treatment (q = 5) and that the functions have the same intercept, but the slope of the benefit function is steeper for E than for C. The basis for this classification is summarized in Table 1. We can see in the table that the equality principle requires that treatment to C is closer to the level where the health benefit is maximized, since compared to E the benefit function of C is flatter. In contrast, by the capacity to benefit principle, E patients should be provided treatment that is closer to the maximizer of the benefit function, since the marginal effect of medical treatment is higher for E than for C.
Finally, since C and E have the same no-treatment profile, we classify physicians by the severity of illness principle if they provide C and E with equal amounts of medical treatment, but at the same time not at the profit-maximizing level (purely selfish) or at the level that is optimal for the patient (purely altruistic).
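The classification logic described in this paragraph can be sketched as a small decision rule. This is an illustration under stated assumptions, not a reproduction of Table 1: the function name, the label strings, and the "unclassified" fallback for unequal quantities that lie equally far from the optimum are our own additions.

```python
PATIENT_OPTIMUM = 5                 # benefit-maximizing q for profiles C and E
SELFISH_Q = {"CAP": 0, "FFS": 10}   # profit-maximizing q under each payment system


def classify_physician(q_c, q_e, system):
    """Assign a type label from the quantities chosen for a C and an E patient."""
    dist_c = abs(q_c - PATIENT_OPTIMUM)
    dist_e = abs(q_e - PATIENT_OPTIMUM)
    if q_c == q_e:
        if q_c == SELFISH_Q[system]:
            return "purely selfish"
        if q_c == PATIENT_OPTIMUM:
            return "purely altruistic"
        # Equal treatment that is neither profit-maximizing nor patient-optimal.
        return "severity of illness"
    if dist_c < dist_e:
        return "equality"             # C is treated closer to the optimum
    if dist_e < dist_c:
        return "capacity to benefit"  # E is treated closer to the optimum
    return "unclassified"             # assumption: not covered by the text


print(classify_physician(3, 3, "CAP"))
print(classify_physician(4, 2, "CAP"))
print(classify_physician(8, 6, "FFS"))
```

Distances are measured in absolute terms so that the same rule applies under capitation, where deviations are typically below the optimum, and under fee-for-service, where they are typically above it.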

Table 1 Classification of physicians based on priority principles

The second novelty of our design is that we allow for risk and ambiguity in the outcome of medical treatment. Previous studies only considered deterministic outcomes. We introduced risk as follows: for each deterministic health benefit function, a given level of medical treatment (q) yields a good outcome for the patient in terms of health benefit with probability 0.5 and a bad outcome for the patient with probability 0.5. The good outcome was always one Taler above the corresponding outcome in the deterministic case, and the bad outcome was always one Taler below the corresponding deterministic outcome.Footnote 6 Ambiguity was introduced in a similar manner, but in contrast to risk the probabilities of the outcomes were unknown. Risk was represented by an opaque bag containing five black and five white balls, and ambiguity by an opaque bag containing ten black and white balls in unknown proportions. At the end of the experiment, a draw from each of these bags determined how risk and ambiguity were resolved at the level of the patient, i.e., whether the patient’s benefit resulting from a physician’s treatment decision would be the good outcome or the bad outcome (see below for more details about the experimental procedures).Footnote 7 Generally speaking, given the health benefit functions and assuming risk aversion, we expect physicians in the experiment to provide, on average, medical treatment closer to patients’ optimal treatment levels when the treatment outcome is risky or ambiguous than when it is certain.Footnote 8 The reason is that physicians cannot offset risk for the patient by choosing a safer treatment, but they can offset the downside risk for each level of medical treatment by providing a little closer to the patient optimum than they thought was ideal when patient’s health was deterministic.
Of course, this effect depends on how physicians weigh their own profits and patient benefits; the effect is likely strongest for intermediate levels of altruism. Physicians who care mostly about own profits will probably provide close to the profit-maximizing level regardless of risk and ambiguity in patient’s health. Similarly, physicians who care mostly about patient’s health will always provide at treatment levels that are close to optimal for the patient.

We ran two conditions that differed only in the payment system: capitation in the CAP condition and fee-for-service in the FFS condition. In each condition, physicians first made five treatment decisions in the case of deterministic patient health, with each patient belonging to a different health profile A–E; then followed another five treatment decisions when patient health was risky; and the last five treatment decisions concerned ambiguous patient health.Footnote 9 Table 2 shows the timeline for the experiment investigating the behavior of physicians and describes each condition in more detail. It is important to ensure that subjects understand the experiment and do not learn during the course of the experiment. To make subjects equally familiar with the decisions in all treatments, we explained all three treatments before the subjects made any decisions. In addition, we included extensive comprehension questions for the different treatments, where the subjects were asked questions about costs and benefits if certain quantities of medical treatment were provided.Footnote 10 We made the comprehension questions deliberately tricky by asking open-ended questions, and the subjects could only pass a set of questions if all were answered correctly. If at least one answer was incorrect, they were not informed which question(s) had been answered incorrectly. A subject could not begin the experiment before all comprehension questions were answered correctly, and all subjects passed several sets of comprehension questions. Thus, when the experiment began, the subjects had information about all treatments and had passed several extensive comprehension tests. Given our instructions including the comprehension questions, we do not think that the order of treatments should influence the decisions.
Thus, we kept the order of the treatments fixed (deterministic, risky, and ambiguous), which was the same order in which we introduced them in the instructions.Footnote 11

Table 2 Conditions and timeline for the experiment investigating physician behavior

The experiment was conducted at MELESSA, University of Munich. It was computerized with z-Tree (Fischbacher 2007) and participants were recruited using ORSEE (Greiner 2015). The choice of subject pool needs to be carefully considered in relation to the research question. Galizzi and Wiesen (2018) conclude their discussion of subject pools (p. 10) as follows: “Taken together, the experimental studies that systematically account for potential differences in the subject pools indicate that the direction of a treatment effect does not differ between (medical or non-medical) students and medical professionals. Importantly, however, the intensity of a behavioral effect might vary across subject pools”. Our key research question is whether treatment decisions differ depending on whether the health outcome is deterministic, risky, or ambiguous, i.e., we are primarily interested in the direction of treatment effects on behavior rather than their intensity per se. To sum up, our subject pool is suited to investigating our research question.Footnote 12 Subjects were randomized into cubicle workstations and given plenty of time to read the instructions (see Online Resource) and ask questions (in private) before the experiment began. They had to answer several control questions, and the experiment did not start until everyone had answered all questions correctly.

In the experiment that investigates the behavior of physicians, subjects made 15 decisions in total, on how much medical treatment to provide different patients. At the end of the experiment, one of these decisions was randomly drawn and paid out to subjects. Risk was represented by an opaque bag consisting of five black and five white balls, and ambiguity by an opaque bag consisting of ten black and white balls with an unknown composition. At the end of the experiment, one randomly selected individual drew a ball from each bag (and another individual announced the outcome). Before the drawing, all subjects had to guess (in private) the color of the ball drawn from each bag and a correct guess resulted in a good outcome and an incorrect guess resulted in a bad outcome for the corresponding decision of that particular individual. Subjects knew that they could verify the results from the public drawing (and check the composition of each bag) at the end of the experiment.
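The resolution procedure can be sketched in a few lines. This is an illustrative simulation only: the draws in the lab were physical, the function names are hypothetical, and drawing the composition of the ambiguity bag uniformly at random is an assumption made purely so that the sketch runs (the actual composition was unknown to subjects).

```python
import random


def resolve_benefit(deterministic_benefit, guess, drawn):
    """Good outcome (+1 Taler) if the guess matches the drawn ball, else bad (-1)."""
    if guess == drawn:
        return deterministic_benefit + 1
    return deterministic_benefit - 1


def make_risk_bag():
    """Five black and five white balls (known composition)."""
    return ["black"] * 5 + ["white"] * 5


def make_ambiguity_bag(rng):
    """Ten balls with a hidden black/white split (uniform split is an assumption)."""
    n_black = rng.randint(0, 10)
    return ["black"] * n_black + ["white"] * (10 - n_black)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
risk_draw = rng.choice(make_risk_bag())
ambiguity_draw = rng.choice(make_ambiguity_bag(rng))

benefit = 8  # profile A at q = 5 in the deterministic case
print(resolve_benefit(benefit, "black", risk_draw))
print(resolve_benefit(benefit, "white", ambiguity_draw))

# With the risk bag, the expected benefit equals the deterministic benefit.
expected = sum(resolve_benefit(benefit, "black", ball) for ball in make_risk_bag()) / 10
print(expected)
```

Because the good and bad outcomes lie one Taler above and below the deterministic benefit, the risky version leaves the expected patient benefit unchanged at every treatment level and only adds spread around it.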

In a second part of the experiment, subjects’ individual risk and ambiguity preferences were elicited in a generic setting. We followed Sutter et al. (2013) and used incentivized choice lists where subjects decided between a fixed prospect and a sure amount of money. The fixed prospect consisted of a 50–50 lottery where the outcome was either 0 or 5 Taler.Footnote 13 Risk was represented by an opaque bag containing five black and five white balls. We also tested for ambiguity preferences using a choice list where the outcome of the ambiguous prospect was either 0 or 5 Taler and ambiguity was represented by an opaque bag containing ten black and white balls in unknown proportions, exactly as in the first part of the experiment where subjects made decisions in the role of physicians. One of the choices was randomly selected and played out for real. The money won was added to the subjects’ total earnings in the experiment. In addition, subjects answered a few general questions related to health care and medical decision making and a questionnaire on socio-economic characteristics. In total, 64 undergraduate students participated in the FFS condition and 66 in CAP. We evaluate the randomization into the two conditions using non-parametric tests. The differences in age, gender, years at the university, and individual risk and ambiguity preferences are insignificant at conventional levels of significance, indicating that the randomization was successful.
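To illustrate how a choice list of this kind can be turned into a preference measure, the sketch below computes a certainty equivalent from the row at which a subject switches from the prospect to the sure amount. The grid of sure amounts, the midpoint rule, and all names are assumptions made for illustration; they are not taken from the experimental instructions.

```python
LOTTERY_OUTCOMES = (0, 5)                   # Taler; 50-50 in the risk list
EXPECTED_VALUE = sum(LOTTERY_OUTCOMES) / 2  # 2.5 Taler

# Hypothetical sure amounts, one per row of the list (assumption for this sketch).
SURE_AMOUNTS = [0.5 * k for k in range(1, 11)]  # 0.5, 1.0, ..., 5.0


def certainty_equivalent(choices):
    """Return the midpoint of the two sure amounts around the switching row.

    `choices` has one entry per row, "prospect" or "sure". Returns None if the
    subject does not start with the prospect or does not switch exactly once.
    """
    switches = [i for i in range(1, len(choices)) if choices[i] != choices[i - 1]]
    if choices[0] != "prospect" or len(switches) != 1:
        return None
    row = switches[0]
    return (SURE_AMOUNTS[row - 1] + SURE_AMOUNTS[row]) / 2


def risk_attitude(ce):
    """Compare the certainty equivalent with the lottery's expected value."""
    if ce < EXPECTED_VALUE:
        return "risk averse"
    if ce > EXPECTED_VALUE:
        return "risk seeking"
    return "risk neutral"


example = ["prospect"] * 3 + ["sure"] * 7  # switches at the fourth row (2.0 Taler)
ce = certainty_equivalent(example)
print(ce, risk_attitude(ce))
```

The same switching-point logic could be applied to the ambiguity list, although the comparison with an expected value then no longer applies because the probabilities are unknown.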

4 Results

We begin the analysis by investigating the average quantities of medical treatment chosen for each patient and the optimal quantity required by each patient (Fig. 2). We can see a strong effect of payment system. For example, patient 1, for whom optimal treatment is 7 units, is on average provided with 8.03 units of medical treatment in FFS and 2.73 units of medical treatment in CAP. The pattern is qualitatively the same for the other patients. These findings are in line with results from previous experiments. Interestingly, the pattern prevails under risk and ambiguity. For example, patients with health profile A (patients 1, 6, and 11) are provided close to 8 units of medical care in FFS irrespective of whether patient health is deterministic, risky, or ambiguous. Taken together, physicians on average do not change the level of medical treatment for risky and ambiguous health outcomes compared with the case of treating the same types of patients with deterministic health outcomes.

Fig. 2 Average quantity of medical treatment per patient. FFS condition with fee-for-service payment (N = 64), CAP condition with capitation payment (N = 66). Error bars represent 95% confidence intervals calculated from independent sample t tests at the level of each patient (x-axis)

Table 3 shows the effect of payment system on the average deviation from optimal medical treatment. On average, patients are overtreated by 2.48 units in FFS and undertreated by 2.46 units in CAP (Mann–Whitney U test; P < 0.01, N = 130). Since benefit functions are symmetric, the absolute deviation from optimal treatment is what influences the realized health benefit for the patient. We can see in the table that in this respect there is no significant difference across the two payment systems (P = 0.82). Overall, patients are treated equally well (or badly) in either payment system. The results are similar when disaggregated by health profile (A and B vs. C, D, and E) (see Table 9 in the Appendix). The final three columns of Table 3 show the results disaggregated by deterministic, risky, and ambiguous health benefits. We can see that there is a small increase in the deviation from optimal treatment when the outcome in patient health is risky or ambiguous instead of deterministic, but the differences are generally insignificant, except in FFS when health benefits are ambiguous.Footnote 14 Taken together, the introduction of risk and ambiguity in the effect of medical treatment on patient health has little effect on physicians’ average provision behavior in the experiment.
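The distinction between signed and absolute deviations can be made concrete with a small sketch. The data below are invented purely for illustration and are not the experimental data; the helper computes only the Mann–Whitney U statistic from its pairwise definition, not the associated P value.

```python
def mann_whitney_u(xs, ys):
    """U statistic for xs: number of (x, y) pairs with x > y, ties counting 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


# Hypothetical (chosen q, optimal q) pairs, invented for illustration only.
ffs = [(7, 5), (8, 5), (8, 7), (5, 3)]
cap = [(3, 5), (2, 5), (6, 7), (1, 3)]

# Signed deviation: positive means overtreatment, negative means undertreatment.
ffs_signed = [q - q_opt for q, q_opt in ffs]
cap_signed = [q - q_opt for q, q_opt in cap]

# Absolute deviation: what matters for the patient when benefits are symmetric.
ffs_abs = [abs(d) for d in ffs_signed]
cap_abs = [abs(d) for d in cap_signed]

u_signed = mann_whitney_u(ffs_signed, cap_signed)
u_abs = mann_whitney_u(ffs_abs, cap_abs)
print(u_signed, u_abs)
```

In this toy example the signed deviations are completely separated across systems (U equals the number of pairs, 16), whereas the absolute deviations are indistinguishable (U equals half the number of pairs, 8). This mirrors the qualitative pattern reported above: opposite directions of mistreatment, but similar distances from the optimum.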

Table 3 Effect of payment system on medical treatment

Result 1 The overall pattern of provision behavior is unaffected by the introduction of risk and ambiguity in the effect of medical treatment on patient’s health.

The effects of risk and ambiguity are investigated in more detail using subjects’ generic risk and ambiguity preferences, which were elicited in a later (second) part of the experiment. Table 4 compares deterministic and risky health benefits: medical treatment measured as the absolute deviation from optimal treatment is regressed on a dummy for the five patients with risky benefit functions, on the measure of individual risk preference, on the interaction between these two variables, and on several control variables (including age, gender, and health status). Standard errors in the regressions are clustered on individuals. In FFS (columns 1 and 2), there is a small effect (significant at the 10% level) of the introduction of risk for patients, such that physicians on average increase their overtreatment by 0.105 units of medical care; yet there is no effect of physicians’ risk preferences. In contrast, in CAP, physicians’ risk preferences are correlated with both how well they treat their patients on average (column 3) and how they change their behavior when patient health is risky rather than deterministic (column 4). As indicated by the positive and significant interaction term, introducing risk in the effect of medical treatment on patient’s health makes physicians who are more risk averse provide medical treatment closer to patients’ optimal treatment levels. This effect is significantly stronger in CAP than in FFS (Appendix Table 12).
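The regression described in this paragraph can be sketched as follows. The notation is introduced here only for illustration and does not appear in the tables: physician \( i \) provides quantity \( q_{ij} \) to patient \( j \), \( q_{j}^{*} \) is the patient-optimal quantity, \( \text{Risky}_{j} \) is the dummy for patients with risky benefit functions, \( \text{RiskPref}_{i} \) is the elicited individual risk preference, and \( X_{i} \) collects the control variables.

\[ \left| q_{ij} - q_{j}^{*} \right| = \beta_{0} + \beta_{1} \text{Risky}_{j} + \beta_{2} \text{RiskPref}_{i} + \beta_{3} \left( \text{Risky}_{j} \times \text{RiskPref}_{i} \right) + X_{i}^{\prime} \gamma + \varepsilon_{ij} \]

In this sketch, \( \beta_{1} \) captures the average change in the deviation when risk is introduced, \( \beta_{2} \) the association between risk preference and the deviation under deterministic benefits, and \( \beta_{3} \) how the response to risk varies with risk preference. The error term \( \varepsilon_{ij} \) is allowed to be correlated within physician, in line with the clustering on individuals.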

Table 4 Generic risk preference and medical treatment (linear regression)

Table 5 shows the results from the corresponding analysis of ambiguity in the effect of medical treatment on patient health. We can see an overall and marginally significant effect of ambiguity on physicians’ provision behavior in FFS (columns 1 and 2), similar to the effect of risk investigated in Table 4. In CAP, the effect is correlated with physicians’ generic ambiguity preferences (column 4) and this effect is similar to the one observed when patient health is risky.

Table 5 Generic ambiguity preference and medical treatment (linear regression)

Result 2 Physicians’ generic risk and ambiguity preferences affect their reactions to risk and ambiguity in the effect of medical treatment on patient health under CAP. In particular, more risk-averse physicians provide treatment closer to patients’ optimal treatment levels following the introduction of risk in patient health; and more ambiguity-averse physicians provide treatment closer to patient optimum following the introduction of ambiguity in patient health.

Next, we turn to the analysis of altruism and conditional altruism of physicians. Table 6 displays the proportion of subjects who made a selfish decision at least once and the proportion who were selfish in all their decisions. To keep the presentation simple, we restrict attention to the case of deterministic benefit functions, given the small differences compared to risky and ambiguous benefits. In FFS, 20.3% of physicians always made a selfish decision and 31.3% did so at least once. By comparison, in CAP, 10.6% were always selfish and 27.3% were selfish at least once. It is interesting that physicians who were always selfish when deciding for C–E patients were also selfish in their decisions for patients with profiles A and B, even though it is indirectly less costly (in terms of foregone profit) to provide A (in FFS) and B (in CAP) with optimal treatment.Footnote 15 The table also shows a similar analysis for the proportion of physicians who made optimal decisions for their patients: 46.9% in FFS and 53.0% in CAP decided optimally at least once, and 10.9% in FFS and 3.0% in CAP provided optimal treatment for all their patients. This indicates that many physicians are altruistic but also that the degree of altruism varies across different types of patients and payment systems. This is further explored in Table 7, where physicians’ treatment of patients from health profiles C and D is compared. The comparison is interesting since we know that C patients are in equal or greater medical need than D patients. While C and D patients’ capacity to benefit from medical care is identical, C patients are worse off without treatment (severity of illness principle) and they require treatment closer to the level where their health benefit is maximized to equalize health ex post (equality principle). We can see in the table that C patients indeed receive better medical treatment than D patients do, and that the difference is most pronounced under fee-for-service.

Table 6 Proportion of selfish and altruistic physicians when patient’s health is deterministic, by treatment and health profile
Table 7 Provision of medical care to patients from health profiles C and D

Result 3 Many physicians are altruistic toward their patients. The degree of altruism depends on the medical need of the patient: patients in greater need of medical care receive better treatment, i.e., they are given treatment closer to the level where their health benefit is maximized.

The type classification of physicians, which captures how their behavior and altruism vary across patients with different medical needs, rests on a comparison of the medical treatment provided to patients with health profiles C and E. We can identify three broad categories of physicians: (1) purely selfish physicians who always choose so as to maximize own income, (2) purely altruistic physicians who always choose optimally for the patient, and (3) physicians who choose a mix between full altruism and own income maximization. Table 8 shows the distribution of physicians over these three broad categories, based on their decisions when health outcomes are deterministic. Interestingly, there are more purely selfish physicians but also more purely altruistic physicians in FFS than in CAP, which thus has a greater share of physicians motivated both by own income and altruism. The difference is not statistically significant, however (Chi-square test; P = 0.13, N = 139).
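The three broad categories above can be sketched as a simple decision rule. This is an illustrative sketch rather than the procedure actually used in the analysis: the function `classify`, its arguments, and the example quantities are all hypothetical.

```python
def classify(chosen, patient_optimal, income_max):
    """Classify one physician from parallel lists of treatment decisions.

    chosen          -- quantity the physician provided to each patient
    patient_optimal -- quantity that maximizes each patient's health benefit
    income_max      -- quantity that maximizes the physician's own income
    """
    always_optimal = all(q == o for q, o in zip(chosen, patient_optimal))
    always_selfish = all(q == m for q, m in zip(chosen, income_max))
    # If both hold, the two motives cannot be separated; this sketch
    # reports such a physician as purely altruistic.
    if always_optimal:
        return "purely altruistic"
    if always_selfish:
        return "purely selfish"
    return "mixed"


# Hypothetical setting: three patients with optimal quantities 3, 5, and 7,
# and an income-maximizing quantity of 10 for every patient.
optimal = [3, 5, 7]
income_max = [10, 10, 10]

print(classify([10, 10, 10], optimal, income_max))  # purely selfish
print(classify([3, 5, 7], optimal, income_max))     # purely altruistic
print(classify([5, 7, 10], optimal, income_max))    # mixed
```

The mixed category is deliberately a residual: any physician who deviates from both benchmarks at least once falls into it, which is why a finer classification by priority-setting principle is needed within that group.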

Table 8 Distribution of physician types

We compare treatment decisions for patients with health profiles C and E to categorize subjects based on how their treatment decisions align with different principles for priority setting (explained in Table 1). The distribution of physician types based on this classification is shown in the bottom panel of Table 8. Among those who choose a mix between full altruism and own income maximization, the altruistic part is mostly described by concerns for severity of illness (38.5% and 47.1% of FFS and CAP subjects, respectively). This principle implies priority to patients with worse no-treatment profiles, even if their capacity to benefit from medical treatment is comparatively low. The capacity to benefit principle follows quite closely (35.9% of subjects in FFS and 39.2% in CAP), and only a minority of subjects behave in accordance with the principle of ex post equality. There are also some differences across the two payment systems, in particular that the distribution is more balanced in FFS than in CAP, but the difference is not significant using a Chi-square test.
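A Chi-square test of independence on a payment-system-by-type table can be sketched as follows. The counts are invented for illustration and are not the counts behind Table 8; the function names are hypothetical. The closed-form p-value used here is valid only for two degrees of freedom, which is what a 2 × 3 table gives.

```python
import math


def chi_square_independence(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df2(stat):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2)


# Made-up counts. Rows: FFS, CAP. Columns: three physician types.
counts = [
    [10, 20, 30],
    [20, 20, 20],
]

stat, df = chi_square_independence(counts)
p = p_value_df2(stat)
print(round(stat, 2), df, round(p, 3))  # 5.33 2 0.069
```

For tables of other dimensions, the p-value would have to come from a general chi-square distribution function rather than this two-degrees-of-freedom shortcut.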


Result 4 There are more purely altruistic but also more purely selfish physicians in FFS than in CAP, but the more detailed distribution of physician types according to how well their treatment decisions correspond to the different principles for priority setting is not affected by payment system. In both CAP and FFS, altruistic behavior toward the patients is mostly guided by severity of illness.

5 Discussion and conclusion

We use a lab experiment to investigate the altruistic behavior of physicians and whether this behavior is affected by payment system, and risk and ambiguity in health outcome. The experiment shows that many physicians behave altruistically toward their patients but also that the degree of altruism varies across patients with different medical needs. This variation across patients is best described by concerns for severity of illness. There are more selfish physicians but also more purely altruistic physicians under fee-for-service than under capitation. Interestingly, the type classification of physicians based on ethical principle is unaffected by payment system. Moreover, we replicate the strong effect of payment system on physicians’ provision behavior found by previous studies using a similar design, and show that this result is robust to the introduction of risk and ambiguity in patient health. There is, however, substantial variation across individuals, and this effect partly depends on payment system. Under capitation, greater generic risk aversion means that subjects on average make larger adjustments toward patients’ optimal treatment levels when deciding for patients with risky health benefits. There is a similar effect for ambiguity aversion, but neither of these effects is present under fee-for-service payments. Although the overall differences between treatments were not statistically significant, a natural extension is to study the same research question in a natural setting to examine the differences in intensity.Footnote 16

Medical need is a key aspect in selecting socially preferred allocations of scarce medical resources in both normative analyses and in health policies. In assessing how physician behavior and altruism correspond to different principles for priority setting based on medical need, we can take the capacity to benefit principle as the benchmark, since it is directly connected to the marginal health benefit of the patient.Footnote 17 It is interesting to note that a majority of physicians in the experiment do not behave in accordance with this principle, but rather seem to factor in other aspects of medical need as well, such as the severity of illness. This is in line with the public view in many countries that health improvements are more valuable for people with more severe illnesses (e.g., Dolan et al. 2005; Shah 2009; Nord and Johansen 2014). This is also reflected in health policy. For example, the NICE Citizens Council in the UK has concluded that severity of illness should be taken into account when making decisions, alongside the already established criteria based on cost and clinical effectiveness (NICE 2008).Footnote 18 Similarly, in the early 1990s, the State of Oregon increased the coverage of the publicly funded Medicaid health program. Prioritization based strictly on cost effectiveness was emphasized to contain costs, but the public reactions were negative and the policy was later adjusted so that other factors, including the severity of illness, were also allowed to influence prioritization (Ham 1998; Tinghög 2011).

For health policy, the main takeaway message from our experimental findings is that capitation and fee-for-service payment systems seem to affect neither the distribution of medical care nor the overall physician response to risk and ambiguity in patient health for reasonably large probabilities. However, our results, like the results from most experiments, are based on a narrow set of parameters. For example, many medical decisions involve a small probability of a very adverse outcome, and empirical studies on decision making under uncertainty indicate that subjects overestimate losses with small probabilities (e.g., Abdellaoui 2000). Moreover, real-life decisions might also be sensitive to contextual factors, and case-based studies would be needed to investigate this. We leave these important extensions of our work for future research. One aspect in favor of the capitation system is that it resulted in fewer cases of physicians only interested in own income. This payment system also generated a greater proportion of physicians motivated by a mixture of own income and altruism corresponding to one of the three principles for priority setting introduced by Culyer and Wagstaff (1993), and these physicians were generally responsive to patients’ medical need in their treatment decisions. However, physicians in the capitation system were also more heterogeneous in how they responded to the introduction of risk and ambiguity in patient health, where their responses seemed to be modulated by their own risk and ambiguity preferences. We reasoned beforehand that introducing risk in patient health should push treatment levels toward the patient optimum if physicians were risk averse, and this is consistent with the effect that we see for risk-averse physicians under capitation, but not under fee-for-service.
One interpretation of this difference between payment systems could be that monetary incentives under fee-for-service are more salient to physicians, thus leaving less room for decision making modulated by individual risk preference and other factors. The heterogeneity within each payment system regarding how physicians respond to patient risk and how their behavior corresponds to general principles for priority setting calls for careful consideration when designing payment systems. It is also important for future research to compare actual behavior with different principles for priority setting, and of course to fine-tune and test different definitions of these principles (e.g., Williams and Cookson 2000). An important aspect in this respect could be to develop clearer clinical guidelines, something that has been emphasized, e.g., in the UK. Another suggestion is to test mixed or performance-based payment systems (e.g., Brosig-Koch et al. 2013, 2017), which could be further investigated in future research.