Background

Pressures on public sector budgets lead to concerns about the efficient allocation of public sector spending [1], along with searches for mechanisms to assess the ‘value for money’ of options considered [2]. Decisions on budget allocation between or across departments are generally made at a central political (Cabinet) level, but the factors, weights, and quantification of Cabinet level decisions are not readily publicly available. Possibly as a consequence of that, over recent decades, academic research effort has focussed on mechanisms to achieve a more efficient allocation of funds within specific departments or spending programmes [3,4,5,6,7].

In countries with a publicly funded universal healthcare system, there has been substantial discussion and academic debate about the efficiency of funding within the healthcare sector [8,9,10]. The opportunity cost (OC) of spending within health has been a focus of both policy and research. Much of the literature is dedicated to identifying “thresholds” above or below which the OC of spending is either acceptable or not. The methods used are, however, based on a core assumption that budgets are fixed [3, 11, 12]. Consequently, in healthcare, new projects, at least in theory, ‘compete’ for funding within a restricted budget for health, so that only those which are the most cost-effective should be funded [13,14,15].

In more pragmatic terms, countries around the world apply a form of health technology assessment (HTA) based on the consideration of an explicit or implicit cost-effectiveness threshold to recommend if new treatments should be funded for patients. The most common benchmarking metric used by HTA agencies is ‘cost per QALY’, where QALY stands for quality-adjusted life year. QALYs measure years of life, adjusted for quality of life linked to the health status during those years [16]. In England, for example, the National Institute for Health and Care Excellence (NICE) sets, for most technologies, £30,000 as the maximum cost to be incurred by the NHS to obtain an additional ‘unit of healthy life’, measured in terms of QALYs [17].

Historically, the measurement of quality of life has also been a research focus in a number of non-health sectors where health is an important metric in the outcome space, such as transport or the environment [18]. However, that research neither evolved into the development of a sector-specific quality-of-life related measure, nor facilitated the introduction of (already developed) QALYs as a generic outcome measure for those sectors. In non-health sectors, cost–benefit analysis is the most commonly used method to compare alternative interventions ‘competing’ for funding [19]. The assessment of benefits usually involves a monetary valuation of the outcomes produced by these interventions. In this context, the thresholds attain a different interpretation: they reflect the within-sector value provided to that outcome (typically addressed as societal ‘willingness to pay’ or WTP). WTP measures are generally used to rank projects according to their value for society (‘consumption value’ of health), and they are not usually sufficient as a rule for rationing: decision-makers still need to establish a policy threshold to ration and make the acceptance/rejection decisions [20]. For instance, health outcomes in England’s Department for Transport (DfT) are frequently assessed against benchmarks such as the value of a life or the value of a prevented fatality, used as WTP thresholds. However, not all projects for which the WTP (based on the value-of-a-life threshold) exceeds cost (i.e., for which the net present value estimated through the cost–benefit analysis is positive) are finally implemented. In reality, the actual rule used to decide on the implementation of the policy—i.e., the minimum net present value above which a policy will be implemented—is not made explicit in any form or shape.

Consequently, it is likely that the health-related outcomes of different interventions are measured and valued differently across public sector areas. We could think, for instance, of a policy that generates an amount of life gained which is considered as good value for money by one department but is rejected by another department that uses a different benchmarking threshold. For instance, the DfT in England is willing to pay up to £70K for a QALY gained [19], whereas the NICE threshold indicates that £20,000-£30,000 is sufficient to generate one QALY. These inconsistencies are a fundamental issue when considering trade-offs against a broader public sector spending budget. Recently, a number of authors have highlighted the importance of shifting the allocation framework and research towards an ‘all-encompassing’ multi-sector approach [2, 21,22,23]. This could begin to ameliorate these shortcomings.

While there is vast literature on the social value of health gains in specific sectors [8, 24,25,26], few researchers have sought to compare how health gains are appraised across public sectors [3, 27, 28]. Similarly, we identified in the literature a large number of papers on the empirical, monetary valuation of health from the OC perspective [29, 30], but these papers do not offer a pragmatic view of how much money is linked to an additional unit of comparable benefit (e.g. a QALY, a life year, a life) in different public sector departments or programmes. Finally, there is little research on identifying the (typically implicit) policy threshold that relates to the value of health that is considered when accepting or rejecting a policy with a direct impact on health. Note that a policy threshold may differ from the most accurate empirical estimate of OC (supply-side threshold) and from the social value of health (demand-side threshold). Explicit policy thresholds used in HTA arena are the clear exception, in the sense that, for some countries, the rationing rule is clearly described in policy documents (e.g., the £20,000-£30,000 NICE threshold); and yet, empirical retrospective analysis suggests that the actual policy threshold may considerably deviate from the ones described in the documents [9].

The aim of our paper is to assess if there are differences in the value given to health across country-specific departments; quantify these differences and identify the reasons underlying them; and explore the potential effects on policy decision making. For this purpose, we identify and compare estimates of the value of life and health used to inform resource allocation within three government departments: health, transport, and environment. We focus on these three departments for two reasons: improving health (or avoiding health losses) is an important part of their objectives; and, historically, the measurement of quality of life has been a research focus in these sectors [18]. Our search did not aim to focus on a particular type of threshold, but to identify all the valuations of health explicitly or implicitly used in government-related publications to represent the government’s value for money threshold, regardless of the approach followed to determine it (a discussion of the different approaches can be found elsewhere [31, 32]).

As suggested by several authors, in absence of any other prioritisation criteria, both the absence of a threshold and the use of thresholds that are too high, too low, or too many, risks an unfair and inefficient allocation of resources [13, 31]. In this sense, our research is intended to raise a number of important issues on assessing the value of public expenditure across different sectors and so better inform the policy process. We focussed on a group of countries that showed commitment to implement formal health economic evaluation early on [33, 34]; and/or introduced mandatory or recommended guidance on the conduct of health economic evaluation that has been revised subsequently, indicating an active use of economic evaluation [34]. The selected countries are: Australia, Canada, Japan, New Zealand, the Netherlands, and the United Kingdom (UK).

This paper is structured as follows. The next section explores whether it is theoretically meaningful to compare government expenditure across different public sector departments, in terms of the value of each departmental outcome. Section "Methods" introduces the measures for valuing health that are considered in this paper and provides the methods used to map across measures. The document search methodology and validation exercise are also detailed in this section. Section "Results" provides the results of the document analysis through inter-country comparison. Section "Discussion" provides a discussion of results and limitations, and section "Conclusions" concludes.

A framework for comparing the value of public sector outcomes

In this section, we present a framework to describe how governments allocate resources across different public services. We begin with a model framed in the welfare economics literature, which provides the foundation for most approaches to allocative efficiency [35]. We take a Pareto optimality perspective and a utilitarian standpoint, which measures social welfare as the sum of individuals’ utility. In particular, our model builds on the resource-allocation model suggested by Meltzer and Smith [15] and Martin et al. [14]; it categorises the various policy outcomes into principal attributes (or outcome types), differentiating between health (H) and non-health (A) attributes. In relation to health attributes, we use the optimisation model frequently found in the health economics literature [3, 36,37,38].

The model is illustrated in Fig. 1. The government seeks to allocate its budget, M comprising \(\left\{{m}_{1},{m}_{2},\dots ,{m}_{J}\right\}\), across the departments j = {1,…,J}, in such a way as to maximise the total welfare of society \(W\left(H,A\right)\). Every government department invests \({m}_{j}\) to generate \({h}_{j}\) health outcomes and \({a}_{j}\) outcomes in non-health attributes. For consistency, we assume that all the outcomes \({h}_{j}\) and \({a}_{j}\) can be expressed in terms of monetary benefits.

Fig. 1
figure 1

Elements of our theoretical model of resource allocation

In this simple model, the optimal allocation of budgets is achieved in a scenario where a marginal increase (e.g., one dollar or one pound) in the budget of any department will have the same total effect on the social welfare function, through health and non-health attributes. Details on the optimisation problem can be found in Appendix 1. We define υ = \(\frac{\partial {\varvec{W}}}{\partial {\varvec{H}}}/\frac{\partial {\varvec{W}}}{\partial {\varvec{A}}}\) as the social value of health relative to the non-health attribute A. In a W maximising equilibrium, the following relationship for any two departments i and j needs to hold:

$$ \mathrm{\upsilon}\left(\frac{\partial {\text h}_{\text i}}{\partial {\text m}_{\text i}}-\frac{\partial {\text h}_{\text j}}{\partial {\text m}_{\text j}}\right)=\frac{\partial {\text a}_{\text j}}{\partial {\text m}_{\text j}}-\frac{\partial {\text a}_{\text i}}{\partial {\text m}_{\text i}}\text{ for each i, j} \, \text{in 1, 2,} { \ldots, \text{J}}$$

For illustrative purposes, consider two government departments: health and social care (i = HSC) and transport (j =  T); and two attributes: improved health (H) and improved standards of living (A). In this context, υ signals the relative value that improved health has for the society, in relation to improved standards of living. Note that this is consistent with the interpretation of a threshold as the societal value for health. In contrast, the health benefit generated by a marginal change in the HSC budget, \(\frac{\partial {h}_{HSC}}{\partial {m}_{HSC}}\), is more aligned with the interpretation of thresholds in the sense of OC: they are reflecting the (inverse of the) ‘price’ in monetary units of purchasing one additional unit of health. We could observe that resources allocated to the transport sector can have a positive impact on health—yet may not be not as cost-effective in terms of health generation as investments in HSC sector \(\left(\frac{\partial {h}_{HSC}}{\partial {m}_{HSC}}>\frac{\partial {h}_{T}}{\partial {m}_{T}}\right)\); however, the transport sector can possibly make a more efficient contribution to the improvement of non-health standards of living than HSC sector \(\left(\frac{\partial {a}_{T}}{\partial {m}_{T}}>\frac{\partial {a}_{HSC}}{\partial {m}_{HSC}}\right)\). The optimal allocation of resources between both sectors depends on the relative position of differences of benefits in both attributes, weighted by the social value that society puts on health in relation to standards of living: \(\mathrm{\upsilon}\left(\frac{\partial {h}_{HSC}}{\partial {m}_{HSC}}-\frac{\partial {h}_{T}}{\partial {m}_{T}}\right)=\frac{\partial {a}_{T}}{\partial {m}_{T}}-\frac{\partial {a}_{HSC}}{\partial {m}_{HSC}}\).

This paper identifies estimates of the different inputs that enter the equation above: societal values (WTP thresholds) which would determine the relative exchange rate (υ); and the decision-rules thresholds, which we assume that are based on the OC approach. In other words, we aim to identify the figures that relate to the monetary expenditure behind producing health at different public sector departments; and we do not explore the way that the monetary figure has been fixed, or the use that is made of it—either from supply-side or demand-side approaches, or even simply by using a convenience number based on past expenditure.

Methods

Measuring the value of life and health

The estimation of ‘value of life and health’ measures requires instruments aimed to capture the value of reducing the risk of death (mortality risks), increasing life expectancy, or improving health-related quality of life [27, 39].

There are several different approaches to representing the value of life and health. The value of a statistical life (VSL; also known as the value of a prevented fatality or VPF), is one of the most applied methods. Two approaches can be used to estimate a VSL, which are based on revealed preferences (e.g., the wage-risk approach, also referred to as the wage differential or labour market method) [27], or stated preferences (e.g., WTP or contingent valuation methods). The WTP approach involves asking a sample of participants to each state their WTP for a small risk reduction, which is then translated into an overall VSL estimate. Thus, as noted by Mason et al. [40], a WTP-based VSL can be defined as the aggregate WTP across a large group of individuals for small risk reductions. Importantly therefore, a WTP-based VSL does not represent the amount that an individual is willing to pay to save a life.

Another measure, useful for analyses of programmes or interventions that result in a small number of years of life being saved, is the value of a life year (VOLY; or statistical life year SLY). VOLYs can be derived directly (e.g., using a WTP approach), or indirectly from an existing VSL estimate [41]. The latter essentially involves dividing a population-level VSL by average life expectancy [40, 42].

In some cases, particularly in the health setting, it is important to consider both morbidity and mortality. That is, quality of life gains should be considered alongside survival gains. The quality-adjusted life year (QALY) achieves this by combining health state utilities (where 0 is equivalent to dead, and 1 is equivalent to full health) with survival gains, such that a year in full health corresponds to one QALY.

A clear unifying framework that demonstrates the conceptual link between VSLs, VOLYs, and willingness to pay for quality-adjusted life years (WTP-QALYs) has been provided in a recent study conducted on behalf of the Health and Safety Executive in the United Kingdom [39]. Future research may be able to utilise this framework to obtain value of life estimates that are better aligned and consistent. In lieu of such aligned estimates, comparisons between measures are inevitably imperfect.

In this paper, we will use the expression ‘VOQ’ to refer to every threshold benchmarking used when life and health are measured in terms of QALYs (VOQ = ‘value of QALY’). Both demand-side thresholds (referred to as ‘WTP-QALY’) and supply-side thresholds will be captured under that label. This is because our aim is not to distinguish between different types of thresholds; but to identify those explicitly or implicitly addressed in government-related publications, which represent the government’s value for money threshold in the context of health gains. Similarly, for simplification, we will use the expression ‘value of life’ or ‘VoL’ to encompass all types of estimates of the value of health and life (i.e., VSLs, VOLYs, and VOQs). This equivalence in the nomenclature is detailed in Table 1.

Table 1 Terminology for value of life estimates

In order to compare estimates of the VoL (Table 1) between governmental departments, it is necessary to convert them into equivalent metrics. Given the primary focus on health, we set out to compare figures in terms of VOQs. As VoL in transport and environmental settings is often presented as the value of a statistical life (VSL) or the value of a life year (VOLY) we use the formula from Abelson [27] to convert these from VSLs to VOLY estimates. The latter were subsequently converted to VOQ estimates, which required country-specific discount rates and life expectancy estimates, as detailed in Appendix 2.

We anticipated high sensitivity of the estimates to the various assumptions; therefore, we conduct sensitivity analyses to test the robustness of our findings. This includes varying the expected remaining life expectancy in the VSL/VOLY conversions (i.e., using ages other than 40) and varying the conversion rate between VOLYs and VOQs (with an upper limit of 1:1 based on the rationale outlined above).Footnote 1

Document analysis

The aim of the document search and analysis was to identify whether VoL thresholds exist within the health, social care, transport, and environment departments of the six selected countries and to determine which thresholds are used to inform resource allocation decisions. In the case of Health Departments, we aimed to identify whether there was any variation in the VoL estimates used within the department, for example, thresholds used in HTA evaluation compared to public health. For the other departments, we sought to identify the main VoL indicators, and we did not seek to explore if other thresholds were used in sub-programmes, or the use of these thresholds in impact assessment documents.

The document search gathered evidence from technical reports, guidelines, and tools published directly by government departments in each of these countries indicating methods for conducting impact assessments (e.g. Green Book in the UK [19, 43]), where necessary other published literature was explored.

Sources were identified using two methods. Firstly, a targeted review of government and departmental websites was conducted, seeking to identify official guidelines and tools published for use in impact assessments or cost–benefit analysis and so on. We identified the most recent estimate available. Secondly, EconLit database was searched, using variations of the terms “Value of life” and “Government/Department” and “UK/Australia/New Zealand/…”. Only documents in English were included. Papers were deemed relevant if they reported an explicit or implicit value of life or threshold in any department under consideration. In both cases, documents were prioritised if they stated an explicit VoL threshold. For countries where no explicit threshold exists or could be identified, implicit thresholds e.g., thresholds elicited in academic papers and cited as relevant in official documents were accepted as the best proxy available. Further details of the document search and data gathering can be found elsewhere [44].

To make comparisons across departments, we estimated VoL in transport and environment departments as a proportion of the VoL in the health setting. These proportions could then be compared across countries to provide a clear indication of whether the same trends are found between countries and without having to consider variations in exchange rates or the need for purchasing power parity.

Results

An estimate of value of life or health was suitably identified for all departments and for all countries. Generally, it was not possible to distinguish between guidance for health and for social care. Therefore, we focused solely on health.

Most of the thresholds identified in the search related to demand-side thresholds or policy thresholds. The Department of Health and Social Care (DHSC) in the UK provided the only exception, as it appears to have recently adopted an explicit supply-side threshold of £15,000 of additional spending per QALY gain for projects excluding HTA.Footnote 2 This figure, used in impact assessments [45, 46], is based on an estimate from Claxton et al. [30]. The adoption of £15,000 as an explicit supply-side threshold was also discussed at the review of Cost-Effectiveness Methodology for Immunisation Programmes & Procurements [47]; but the adoption of this figure was finally rejected by the government [48]. Our search did not identify any explicit, evidence-based supply-side thresholds elsewhere. Therefore, this section primarily focuses on our findings on demand-side and policy thresholds.

To maximise comparability, values have been adjusted to 2019 prices wherever possible. All prices are shown in the local currency (i.e., GBP, EUR, CAD, JPY, AUD & NZD). Tables 2, 3, 4, 5, 6, 7 show the estimates identified for the UK, the Netherlands, Canada, Japan, Australia, and New Zealand.

Table 2 Value of life in the UK
Table 3 Value of life in the Netherlands
Table 4 Value of life in Canada
Table 5 Value of life in Japan
Table 6 Value of life in Australia
Table 7 Value of life in New Zealand

Table 8 illustrates that in 13 out of 15 comparisons the values in the health setting are far lower than those from the transport and environment settings, with the value of life used in health often < 50% of the value used in transport (in six out of seven comparisons) and environment (in five out of eight comparisons). There are only two examples where the value in transport or environment does not exceed the value used in health. The first is the smaller of the two Defra estimates in the UK, which is intended for capturing the acute mortality impact of air pollution. If a broader estimate had been used, such as one from the Green Book, the value would be at least double the upper health value of £30,000 (see Table 2). The other example is the smaller of the two academic estimates in Japan, also in an environment setting, which was the lowest value from that particular study.

Table 8 Inter-country comparison

Figure 2 presents our findings graphically. Using each country’s value of life estimate from the health setting as the baseline, we show transport and environment estimates relative to health. The shaded areas indicate the range of the estimates. The largest difference between value of life estimates in the health setting compared to transport and environment is seen in New Zealand, but there are sizeable differences in all countries considered.

Fig. 2
figure 2

‘Value of life’ benchmarks by government department. Health = baseline (100)

The sensitivity analysis found that the observed trend that health is valued less in the health setting relative to transport and environment setting appears to be robust to the assumptions made when converting between measures. Further details of the sensitivity analysis can be found in Appendix 3.

Discussion

Our research has two main contributions to the literature on resource allocation. First, we provide a simple model on resource allocation that is consistent with two interpretations of the threshold: as the societal value for health (the relative value that improved health has for the society, in relation to other non-health attributes), and the opportunity cost (in the sense of the ‘price’ in monetary units of purchasing one additional unit of health). This theoretical framework supports that the optimal allocation of resources between both sectors depends on the relative position of differences of benefits between health and other attributes, weighted by the social value that society puts on health in relation to other attributes. Second, we examine the policy-related documents in three public sector areas, identify the main VoL estimates used within each department, and make comparisons between them.

From a theoretical perspective, as we have set out in the framework in section "A framework for comparing the value of public sector outcomes", if we assume that the ultimate goal of governments when distributing resources is to make optimal decisions for their citizens, then the methods used for economic evaluations of any government investment must be consistent, and resources should be allocated efficiently across and within departmental portfolios. It is important to recognise that a measure for use within health systems prioritisation (which is what the QALY provides) does not necessarily translate to overall resource allocation. The current cost per QALY threshold is used in a number of countries to determine if new technologies should be introduced into the HSC sector given the current budget constraint. On that basis, the threshold needs to be set at a level which reflects the OC in terms of health gain within the existing health budget. This, of course, raises questions of whether the health budget is optimal and how a government should determine the OC of putting additional resources into the health system or environment or transport, or any other department.

Our analysis also supports the fact that identifying and describing the VoL from disparate public sector activities, in a manner that facilitates comparison, is theoretically meaningful. First, benefits will be assessed around a set of attributes or domains that are relevant to society—‘health’ being one of them. This assumption is supported by a large number of papers that seek to identify country-specific public sector outcome attributes. For pragmatic reasons, ‘health and life’ benefits from different areas of public spending are usually described and measured in a variety of different units (as QALYs, VOLYs, or VSL), so it is typically very challenging to compare the value and OC of new investments. However, under a set of assumptions, it is plausible to estimate equivalences between these outcome measures, as long as a posterior sensitivity analysis proves that results are robust to changes in the assumptions supporting the map**.

Fully variable budgets are probably the most argued condition for the social welfare model suggested by this report. The alternative, in which investments are made within an exogenously determined, annually ‘fixed’ budget, risks decisions being made that are suboptimal in the present and in the future. To move to an optimal distribution, criteria for approval should attract additional resources as long as the total increase in the welfare resulting from improvement in the ‘best’ attributes outperform the welfare lost resulting in a lower improvement in the ‘worst’ attributes relative to the other departments. Economic evaluations should take into account all costs and benefits that are important to society [49, 50] to facilitate the comparison of attributes across sectors; yet, societal preferences in defining the value of the attributes (in terms of trade-offs) are needed to correct the imbalances across sectors [22]. Finally, it is important to notice that when considering efficient allocation across sectors, we need an exchange rate based on societal values; and as it is an exchange rate (as illustrated in our theoretical model), it has to be based on the relative value that improved health has for the society, in relation to other (non-health) attributes.

More pragmatically, there is evidence in the countries studied that the VoL criteria used by those in the health sector are systematically lower than that those used for health gain achieved in comparable sectors. In some countries, this is considerably lower than the VoL provided in other non-health departments. In the vast majority of cases, VoL in the health setting is far lower than VoL in the transport and environment settings.

Empirically, we need to correlate some of these differences in VoL with the fact that transport and environment departments (as well as HSC departments except for HTA-based decision-making) usually use cost–benefit analysis as a form of economic evaluation to inform decision making. This advice translates into not using the threshold as a rationing rule. WTP measures are used to rank projects according to their value for society; but even if so, decision-makers need to establish a policy threshold related to the maximum value given to life and health, in order to ration and make the acceptance/rejection decisions in policies for transport, environment, and non-HTA health-related policies. Guidebooks identified in this paper do not explicitly state any policy threshold or rule to be used for that. Decisions made across departments such as transport or environment are thus generally based on an unspecified rationing rule which, in the absence of any other prioritisation criteria, risks an inefficient allocation of resources in the sense of overpaying for some new programmes. Whilst WTP measures thus overstate the actual policy threshold used for adopting new interventions in non-health departments it remains likely that the effective threshold for health gain is above the OC threshold used in health departments.

Limits

Our study has a number of limitations. The first limitation relates to our ability to identify reliable estimates via a document search and analysis. We sought to identify the value of health estimates that are currently used in practice in analyses across multiple departments in multiple countries. The ideal sources for this information are governmental guidelines for the implementation of cost–benefit analyses, impact assessments or similar, such as the Green Book in the UK. Not all countries have an equivalent document, and, in some cases, such as in the Netherlands, guidelines existed, but explicit values of health were not provided. This meant that we often had to rely on other documentation that typically did not clearly state that a certain value is recommended or required to be used in analyses.

The second relates to our analysis. Whilst converting between VSLs and VOLYs is fairly well established, this is not the case for converting between VOLYs and QALYs. We identified a ratio in the literature and used this for our main analyses. However, it is not clear whether this is entirely appropriate. We tried to mitigate any impact of this by conducting a sensitivity analysis whereby the two measures were equivalent, as this would provide the smallest possible QALY values on the basis that a VOLY would not be valued more highly than a QALY.

A final limitation of the study is the lack of information found about the rationing (supply-side) rules used when non-health departments decide which projects should be implemented. These figures would allow a more accurate comparison of the value of life and health across public sectors, by considering supply- and decision- side rules separately.

Future research

An exploration of the Impact Assessment reports could bring light to the unobserved policy threshold used for decision-making in non-health departments. Future research could also elicit the societal preferences for different attributes (and an exchange rate based on societal values subsequently), following an attribute framework suggested elsewhere [22]. With that information in hand, the optimal resource allocation suggested in this research could help quantify the potential imbalance in the value of health across different parts of the public sector.

Conclusions

Comparing government expenditure across different public sector departments, in terms of the value of each department outcome, is not only possible but also desirable. In that respect, it is essential to identify the relevant social attributes and to quantify the value of the attributes for each public sector. We have shown that health is likely to be valued less by those responsible for allocating resources in health than those in the environment and transport portfolios. If decisions made across departments such as transport or environment are generally based on an unspecified rationing rule, risks of an inefficient allocation of resources in the sense of overpaying for some new programmes will always be present.