Introduction

The phenomenon of population ageing has gained substantially in importance over recent decades. For the most part, this is because of a growing recognition that future populations will include an increasing share of older people, raising potentially important resource implications for national and local governments alike (Lutz et al., 2008b). The 2015 World Population Ageing report noted that “Population ageing—the increasing share of older persons in the population—is poised to become one of the most significant social transformations of the twenty-first century, with implications for nearly all sectors of society, including labour and financial markets, the demand for goods and services, such as housing, transportation and social protection, as well as family structures and intergenerational ties” (UN DESA, 2015, p.1). The corresponding 2019 World Population Ageing report notes that “Population ageing is one of the four “mega-trends” that characterise the global population of today—population growth, population ageing, urbanisation and international migration. Each of these mega-trends will continue to have substantial and lasting impacts on sustainable development in the decades to come” (UN DESA, 2019, p.1). It is clear that a better understanding of population ageing is critical for policy makers and other decision-makers.

The importance and implications of population ageing for policy are widely recognised. Most developed countries, and many developing countries, have adopted some form of ‘healthy ageing’ policies in order to reduce the burden on public infrastructure, including health, housing, and social security (World Health Organization, 2003). In order to plan for future changes in population ageing though, policy makers and planners must understand the trajectories of future population change. In order to target resources appropriately, they must know how and where population ageing is likely to be felt most acutely. Optimal planning and resource allocation require careful measurement of population ageing at both the national and subnational levels.

When considering the implications and measurement of population ageing, it is useful to draw a distinction between numerical ageing and structural ageing (Jackson, 2007). Numerical ageing refers to an absolute increase in the number of older people. The primary cause of numerical ageing is increasing longevity – longer lives mean that people survive longer in old age, increasing the numbers of older people over time. Over longer time scales, birth rates and cohort size at birth also contribute to numerical ageing. As larger cohorts age, the numbers of older people increase, as can most readily be seen by the ‘Baby Boomer’ cohorts born in Western developed countries after World War II. In contrast, structural ageing refers to changes in the age distribution of the population, whereby older people constitute a larger proportion of the total population. Structural ageing arises due to a combination of numerical ageing alongside declining fertility rates, which reduces the size of subsequent age cohorts in the younger age groups.

Although the phenomenon of population ageing is often described in terms of the population getting older in aggregate, there is nothing in theory that makes this process inevitable. The increasing population ageing that we observe in all countries today is a consequence of underlying sociodemographic changes that have occurred over decadal time scales, including decreasing death rates at older ages and decreasing fertility. If these processes were to reverse, the trend in population ageing could do so as well. Indeed, a limited example of this is readily apparent in terms of ‘age structural transitions’ (Pool et al. 2006), whereby waves of birth cohorts of various sizes ripple through the age structure. For example, the Baby Boom of the 1950s created an ‘echo’ in terms of larger birth cohort sizes in the 1970s (Morgan, 1998). As Easterlin (1980) noted, differences in cohort sizes can be highly consequential.

Despite the critical nature of population ageing, we believe that insufficient attention has been paid to the measurement of structural ageing. In part, this situation arises because the extant measures of structural ageing are intuitively easy to understand and have been widely used for generations. However, these widely-used measures have substantial deficiencies that could in theory lead to incorrect inferences about relative rankings of locations in terms of structural ageing, or in changes in structural ageing over time.

In this paper, we outline several straightforward axioms that we believe should underlie a theoretically valid measure of structural ageing. Using these axioms, we first show that the extant measures of structural ageing, including the mean age, median age, proportion of the population aged 65 years and over (or 85 years and over), the child-elder ratio, and the old-age dependency ratio, fail to satisfy one or more of the axioms we propose. The measure that comes closest to satisfying the axioms is the mean age, which may be a suitable measure depending on whether one believes that the measure of structural ageing should be differentially sensitive to different parts of the age distribution. We then propose a class of structural ageing indices that satisfy our proposed axioms, of which the mean age is a special case. We focus attention on one example within the broader class of indices, namely the root-mean-squared-age (RMSA). In addition to satisfying each of the axioms, the indices in the class we propose are readily interpretable, being of a scale similar to the median age or mean age (noting again that the mean age is a special case of our class of indices).

We then go on to demonstrate, using RMSA as our preferred measure from within the class of axiomatically-consistent measures, the high degree of correlation between the various measures of structural ageing, including the RMSA measure. Specifically, we calculate correlations and rank correlations using: (1) national-level population estimates and projections from the United Nations World Population Prospects, covering the period from 1950 to 2100; (2) U.S. state-level population estimates, covering the period from 1980 to 2018; and (3) district-level population estimates and projections for territorial authorities in New Zealand, covering the period from 1996 to 2048. In all cases, the correlations and rank correlations between the measures are high. While this might provide some comfort that using the extant measures would not bias decision-making, there remain examples where comparisons across countries, states, or districts may lead to incorrect inferences being drawn.

The remainder of this paper proceeds as follows. The next section presents and discusses our proposed axioms for the theoretically valid measurement of structural ageing and proposes an axiomatically-consistent class of indices of structural ageing. We then briefly outline the data and methods we use for comparing measures of structural ageing, before presenting and discussing our results. The final section discusses the implications of the results and considers future directions for research on the measurement of structural ageing.

An axiomatic approach to structural ageing

The purpose of a summary measure of structural ageing is to capture within a single number the key characteristics of the age distribution of the population. In order to achieve this purpose, the measure must encode information from multiple points in the age distribution simultaneously. Moreover, it must encode this information in a way that ensures that changes at any point in the age distribution, and particularly at points in the upper tail of the age distribution, will be appropriately reflected in changes in the summary measure.

Inspired by the example of axiomatic approaches to the measurement of poverty (see Foster et al., 1984), we set out to propose a set of axioms that we believe will ensure that a summary measure of structural ageing will best represent the underlying age distribution, encoding information across the entire age distribution, with a particular focus on the upper tail of the distribution. Our approach is similar to that applied by Chu (1997), although they take a more literal mathematical interpretation of the three Foster-Greer-Thorbecke axioms than we do, whereas we interpret the axioms more naturally in the context of population ageing (see also Kurek 2007; Nath & Islam, 2009). Specifically, we propose four axioms that are simple and intuitive: (1) population size invariance; (2) strong dominance; (3) weak dominance; and (4) age sensitivity. The first three axioms are arguably uncontroversial, whereas the preferred degree of age sensitivity of a summary measure may depend on the preferences of the decision-maker or the party undertaking the measurement.

Axiom 1

(Population size invariance): The measure of structural ageing should not depend on the size of the population.

As noted in the introduction section, structural ageing refers to changes in the age distribution of the population. Thus, the raw size of the population should not matter for a measure of structural ageing. For example, doubling the number of people of every age should leave a measure of structural ageing unchanged, because the proportion of the population at each and every age remains unchanged. However, such a change would lead to a substantial increase in measures of numerical ageing, as the absolute number of older people would increase.

Axiom 2

(Strong dominance): Adding a small amount δ to the age of every person in the population must increase the measure of structural ageing.

This axiom captures the straightforward idea that if every person in the population is somewhat older, then the population in aggregate is older, the proportion of the population at older ages must be higher, and the measure of structural ageing must therefore increase. The cumulative density function representing the population age structure would shift everywhere to the right by the amount δ. In comparing the distribution before and after the addition of δ, it is obvious that the distribution after the addition will strictly dominate the distribution before the addition. Hence, we refer to this axiom as strong dominance.

Axiom 3

(Weak dominance): Adding a small amount δ to the age of one person in the population, while holding the ages of every other person constant, must increase the measure of structural ageing.

This axiom is a weaker form of the strong dominance axiom. However, it is also intuitive, since in comparing the two populations (before and after the addition of δ), the ages of all people except one are the same, and one person is older. Thus, the population after the addition of δ is the older population. Again, considering the cumulative density function, the distribution after the addition would shift to the right by the amount δ, everywhere to the right of the original age of the person to whose age δ was added. In comparing the distribution before and after the addition of δ, the distribution after the addition first-order stochastically dominates the distribution before the addition. However, this is not strict dominance as is the case for Axiom 2, and hence we refer to this axiom as weak dominance.

Axiom 4

(Age sensitivity): Adding a small amount δ to the age of one person in the population, while holding the ages of every other person constant, must increase the measure of structural ageing by a greater amount, the older the age of the person whose age is added to.

This axiom is based on the idea that there are important nonlinearities associated with ageing. As people age, an additional year may have a greater effect across various dimensions of health and wellbeing than the last. For example, there is evidence for concave relationships between age and life satisfaction (Di Tella et al., 2003; Blanchflower & Oswald, 2008; Cheng et al., 2017), age and negative emotions (Teachman, 2006), and age and cognition (Verhaeghen & Salthouse, 1997). If the effect of an additional year of age increases with age, then a theoretically valid measure of structural ageing should be more sensitive to changes in the upper tail of the age distribution. That is because changes in the upper tail of the distribution have a larger effect on outcomes of interest to end users. This axiom is the most arguable of the four, as the precise degree of age sensitivity that should be incorporated into the measure of structural ageing is a priori unclear (this is a point that we will return to later). Also, not all outcomes of interest have concave relationships with age. For example, productivity may have a convex relationship (Oster & Hamermesh, 1998; Castellucci et al., 2011). However, in cases such as productivity, if the outcome is reframed as productivity decline, then the relationship is concave. Moreover, as Chu (1997) pointed out, Sen (1976) criticised poverty summary measures for their insensitivity to changes in the tail of the distribution, and a similar criticism applies here.

Along with the four axioms outlined above, we also note a fifth property that it is desirable for a measure of structural ageing to possess: easy interpretability. As noted in the introduction, the extant measures of structural ageing do not conform to all four axioms, and they persist not because they are mathematically optimal but because they are easy for end users to interpret. A measure such as the median age or the proportion of the population aged 65 years and over is readily understandable by policymakers, planners, and other decision-makers, and requires minimal levels of numeracy and statistical literacy to compare and interpret differences between populations and changes over time. A measure of structural ageing that is a bare number with no easy interpretation is likely to face significant challenges in take-up by end users.

Demographers have developed a number of measures of structural ageing that are in wide use by end users (Gavrilov & Heuveline, 2003). The variety of measures have been comprehensively described elsewhere (Spijker, 2015), so we do not review them in detail here (although we return to some of the key alternative measures in the Discussion section). The most commonly applied structural ageing measures (which we will refer to as the ‘conventional’ measures) include:

  1. Mean age – the average age of the population;

  2. Median age – the age that divides the whole population in half, whereby half of the population is older, and half of the population is younger;

  3. Proportion of the population aged 65 years and over;

  4. Proportion of the population aged 85 years and over;

  5. Child-elder ratio – the ratio of the population aged 0–14 years to the population aged 60 years and over; and

  6. Old-age dependency ratio – the ratio of the population aged 65 years and over to the population aged 15 to 64 years.

The child-elder ratio and the old-age dependency ratio can of course be based on different population groups (as to which age groups constitute elders, children, or ‘old-age’), but we focus here on their most common applications. The position of those measures in relation to satisfying the axioms, as described below, does not depend on the specific age groups that are included in their calculation (which is in fact the case for all of the measures we consider). In terms of calculation of the measures, we note that it is conventional to treat age as a discrete measure (in completed years since birth), although any of the measures of structural ageing can easily be calculated based on continuous age. Although there are other measures of structural ageing that are increasingly finding favour among researchers, such as prospective age measures (Sanderson & Scherbov, 2007), we focus on those listed above and leave consideration of these newer measures to the concluding section. Similarly, we also note that many health researchers now favour the concept of ‘functional age’ or ‘biological age’ rather than chronological age (Guralnik & Melzer, 2002; Skirbekk et al., 2019). We leave consideration of functional age and biological age to the concluding section as well.
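As a rough illustration of how the six conventional measures listed above are calculated, the following Python sketch computes them from a set of ages and population counts. The function name, the dictionary keys, and the example population are hypothetical and are not taken from the data used in this paper. The median here uses a simple convention (the first age at which the cumulative count reaches half of the total) rather than interpolation within age groups, and the ratios assume that the denominator age groups are non-empty.

```python
def conventional_measures(ages, counts):
    """Conventional structural ageing measures (illustrative sketch).

    ages   -- representative age (in years) of each age group
    counts -- number of people in each age group (integers)
    """
    total = sum(counts)
    pairs = sorted(zip(ages, counts))

    def pop(lo, hi=float("inf")):
        # Population with lo <= age < hi
        return sum(c for a, c in pairs if lo <= a < hi)

    # Median: first age at which the cumulative count reaches half the total
    running = 0
    median_age = None
    for a, c in pairs:
        running += c
        if 2 * running >= total:
            median_age = a
            break

    return {
        "mean_age": sum(a * c for a, c in pairs) / total,
        "median_age": median_age,
        "share_65_plus": pop(65) / total,
        "share_85_plus": pop(85) / total,
        # Aged 0-14 relative to aged 60 and over
        "child_elder_ratio": pop(0, 15) / pop(60),
        # Aged 65 and over relative to aged 15-64
        "old_age_dependency_ratio": pop(65) / pop(15, 65),
    }


# Hypothetical population, for illustration only
ages = [5, 20, 40, 62, 70, 90]
counts = [20, 25, 30, 10, 10, 5]
measures = conventional_measures(ages, counts)
```

Changing the thresholds passed to the helper `pop` gives the variants of the child-elder and old-age dependency ratios that use different age groups.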

As an alternative to these conventional measures of structural ageing, we propose a general class of indices of structural ageing, which can be calculated as:

$$A=\sqrt[\alpha ]{\sum _{i=1}^{n}{p}_{i}.{\left({a}_{i}\right)}^{\alpha }}$$
(1)

where A is the index of structural ageing, n is the number of population age groups, \({p}_{i}\) is the proportion of the population in age group i, \({a}_{i}\) is the age of age group i (in years), and α is a coefficient that captures the age-sensitivity of the index. An index of structural ageing calculated using Eq. (1) is simply the α-th root of the uncentred α-th moment of the age distribution. By construction, this index satisfies all of the axioms described earlier in this section provided that α > 1. When α = 1, the index is insensitive to age, and the formula returns the mean age. When α > 1, the index exhibits age sensitivity, and the larger the value of α, the greater the degree of age sensitivity (see Footnote 1). In the absence of further research to establish an optimal value, our preferred value for α is 2, in which case the index is the square root of the uncentred second moment of the age distribution, which can be referred to as the root-mean-squared age (RMSA):

$$RMSA=\sqrt{\sum _{i=1}^{n}{p}_{i}.{\left({a}_{i}\right)}^{2}}$$
(2)

The RMSA represents a balance between ensuring that the index exhibits age sensitivity while not overly ‘penalising’ populations with extreme age distributions. Alternative values of α may be appropriate in different contexts, and likely depend on the preferences of the particular decision-maker. We leave detailed consideration of the optimal value of α for future work.
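The calculation in Eqs. (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the analysis in this paper: the function name `ageing_index` and the example population are hypothetical, and the function takes population counts and converts them to proportions internally.

```python
def ageing_index(ages, counts, alpha=2.0):
    """Index of structural ageing from Eq. (1).

    ages   -- representative age (in years) of each age group
    counts -- population in each age group (converted to proportions below)
    alpha  -- age-sensitivity coefficient; 1 gives the mean age, 2 gives RMSA
    """
    if alpha < 1:
        raise ValueError("alpha is assumed to be at least 1")
    total = sum(counts)
    # Uncentred alpha-th moment of the age distribution
    moment = sum((c / total) * a ** alpha for a, c in zip(ages, counts))
    # Taking the alpha-th root returns the index to the scale of years
    return moment ** (1.0 / alpha)


# Hypothetical population, for illustration only
ages = [10, 30, 50, 70]
counts = [40, 30, 20, 10]

mean_age = ageing_index(ages, counts, alpha=1)  # special case: mean age
rmsa = ageing_index(ages, counts, alpha=2)      # Eq. (2)

# Doubling every count leaves the index unchanged (Axiom 1)
rmsa_doubled = ageing_index(ages, [2 * c for c in counts], alpha=2)
```

Because only proportions enter the calculation, population size invariance holds by construction, and for any population in which ages differ the RMSA exceeds the mean age.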

Table 1 summarises whether each of the measures of structural ageing considered satisfies each of the axioms outlined earlier in this section. All measures satisfy the population size invariance axiom, and all measures satisfy the strong dominance axiom. In terms of the weak dominance axiom, most measures would only satisfy this axiom in a very restricted case. For example, the median age would only increase when δ was added to the age of a single person if the result was that the person moved from below the previous median age to above it (and even then, the median might not change if ages are measured in discrete whole years). Similarly, all of the measures that are based on proportions of the population with fixed age thresholds would only increase if the person were moved from below the threshold to above it. The only two measures that satisfy the weak dominance axiom are mean age and RMSA (both of which are cases of our class of structural ageing indices). Finally, RMSA is the only measure that satisfies the age sensitivity axiom. As noted above, mean age is not sensitive to age: all numerical ages are treated equally. In contrast, RMSA weights the calculation of the index by the numerical age, explicitly making the resulting index more sensitive to increases in the proportion of the population at more advanced ages.
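The behaviour of the measures under the weak dominance and age sensitivity axioms can be checked on a small example. The sketch below uses a hypothetical population of five people and a hypothetical δ of one year. It is intended only to illustrate the reasoning above, not to reproduce Table 1.

```python
import math
import statistics


def rmsa(ages):
    # Root-mean-squared age for a list of individual ages
    return math.sqrt(sum(a * a for a in ages) / len(ages))


def share_65_plus(ages):
    # Proportion of individuals aged 65 years and over
    return sum(1 for a in ages if a >= 65) / len(ages)


# Hypothetical population of five individuals
base = [10, 20, 30, 40, 80]
delta = 1

older_oldest = [10, 20, 30, 40, 80 + delta]    # add delta to the oldest person
older_youngest = [10 + delta, 20, 30, 40, 80]  # add delta to the youngest person

# Weak dominance: the median and the threshold-based share do not respond
median_change = statistics.median(older_oldest) - statistics.median(base)
share_change = share_65_plus(older_oldest) - share_65_plus(base)

# Mean age responds, but by the same amount whoever is aged
mean_gain_oldest = statistics.fmean(older_oldest) - statistics.fmean(base)
mean_gain_youngest = statistics.fmean(older_youngest) - statistics.fmean(base)

# RMSA responds, and by more when the older person is aged
rmsa_gain_oldest = rmsa(older_oldest) - rmsa(base)
rmsa_gain_youngest = rmsa(older_youngest) - rmsa(base)
```

In this example the median age and the proportion aged 65 years and over are unchanged, the mean age rises by the same amount in both cases, and the RMSA rises by more when the additional year is added to the oldest person.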

Table 1 Measures of structural ageing and satisfying of axioms

Methods and data

We now turn our attention to the important question of whether the choice of structural ageing measure matters for a decision-maker. In other words, what is the likelihood that a decision-maker draws an incorrect inference due to their use of a structural ageing measure that does not satisfy one or more of the axioms outlined in the previous section? In addressing this question, RMSA is our preferred measure, being the only measure of those we consider that satisfies all four axioms. We first calculate the RMSA measure for different populations and rank the populations from the ‘oldest’ (highest RMSA) to ‘youngest’ (lowest RMSA). We then calculate the other measures of structural ageing for each population and calculate the Pearson correlation between each of the measures, focusing in particular on their correlation with RMSA. There is no expectation that the observed relationship between RMSA and other measures of structural ageing will be linear, as assumed by the Pearson correlation coefficient. Accordingly, we also rank each population based on each measure of structural ageing and calculate the Spearman rank correlation between each of the measures, again focusing on the rank correlation with RMSA. A high correlation between a particular measure and RMSA, and especially a high rank correlation, would suggest that there should be little concern about incorrect inferences being drawn from that measure.
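The comparison procedure described above can be sketched as follows. The implementation is a plain-Python illustration and not the software used for the results in this paper. The function names and the five example values are hypothetical. The Spearman rank correlation is computed as the Pearson correlation of the ranks, with tied values given their average rank.

```python
import math


def pearson(x, y):
    # Pearson correlation coefficient between two equal-length sequences
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    # Ranks starting at 1, with ties sharing the average of their positions
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    # Spearman rank correlation: Pearson correlation of the ranks
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five populations, for illustration only
rmsa_values = [27.0, 33.0, 38.0, 45.0, 53.0]
share_65_plus = [0.02, 0.04, 0.08, 0.16, 0.28]

r_pearson = pearson(rmsa_values, share_65_plus)
r_spearman = spearman(rmsa_values, share_65_plus)
```

In this constructed example the two measures rank the populations identically, so the rank correlation is one, while the Pearson correlation is below one because the relationship is monotonic but not linear.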

We apply this procedure to three sets of population data, covering different geographies, different levels of spatial aggregation, and different time periods. This allows us to assess any differences in the correlations across a variety of different datasets and contexts. First, we use data from the United Nations World Population Prospects (https://population.un.org/wpp/Download/). These data include population estimates by five-year age group and sex for 201 countries, covering every year in the period from 1950 to 2020. We use data on both sexes combined and calculate the structural ageing measures and correlations for the first year in each decade (i.e., 1950, 1960, 1970, 1980, 1990, 2000, 2010, and 2020). These data also include population projections by five-year age group and sex for the same 201 countries, covering every year from 2020 to 2100. We use the same approach as for the population estimates, calculating the structural ageing measures and correlations based on data for both sexes combined from the medium variant projections, for the first year in each decade (i.e., 2030, 2040, 2050, 2060, 2070, 2080, 2090, and 2100). This results in 16 correlations (and rank correlations), each based on 201 country-level data points. The estimates and projections are available for five-year age groups, with the top age group aged 100 years and over. In the calculations of each structural ageing measure, we use the mid-point of each five-year age interval and use 102.5 as the mid-point of the open-ended age group. The results we present are relatively insensitive to small deviations from these assumptions.
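The treatment of five-year age groups described above can be sketched as follows. The helper names and the population counts are hypothetical, and the alternative value of 105 for the open-ended group is chosen only to illustrate how a sensitivity check of this kind might be set up. It is not a reproduction of the checks carried out for this paper.

```python
def five_year_midpoints(top_lower=100, open_ended_age=102.5):
    """Representative ages for five-year groups plus an open-ended top group.

    top_lower      -- lower bound of the open-ended group (100 for '100 and over')
    open_ended_age -- age assigned to everyone in the open-ended group
    """
    closed = [lower + 2.5 for lower in range(0, top_lower, 5)]  # 2.5, 7.5, ...
    return closed + [open_ended_age]


def rmsa(ages, counts):
    # Root-mean-squared age from representative ages and population counts
    total = sum(counts)
    return sum((c / total) * a ** 2 for a, c in zip(ages, counts)) ** 0.5


# Hypothetical counts for the 21 age groups (0-4, 5-9, ..., 95-99, 100+)
counts = [100 - 4 * i for i in range(20)] + [1]

ages = five_year_midpoints()
rmsa_base = rmsa(ages, counts)

# Illustrative sensitivity check: a different age for the open-ended group
rmsa_alt = rmsa(five_year_midpoints(open_ended_age=105.0), counts)
```

Passing `top_lower=85` together with a suitable `open_ended_age` gives the corresponding representative ages for data in which the top age group is 85 years and over.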

Second, we use data from the US Census Bureau (https://www.census.gov/data/tables.html). These data include population estimates by five-year age group and sex for all fifty states and the District of Columbia, covering every year in the period from 1980 to 2018. As with the country-level data, we use data on both sexes combined, and calculate the structural ageing measures and correlations for the first year in each decade (i.e., 1980, 1990, 2000, and 2010), and for 2018 as the latest year of available data. State-level population projections are no longer available from the US Census Bureau, so we limit our analysis to the resulting five correlations (and rank correlations), based on 51 state-level data points. The estimates are available for five-year age groups, with the top age group aged 85 years and over. In the calculations of each structural ageing measure, we use the mid-point of each five-year age interval, and for the open-ended age group, we use the median age (for the population aged 85 years and over) from the corresponding country-level data. Again, the results we present are relatively insensitive to small deviations from these assumptions.

Third, we use data from Statistics New Zealand (http://nzdotstat.stats.govt.nz/wbos/Index.aspx). These data include population estimates by five-year age group and sex for 87 territorial authorities and local boards (being the most disaggregated administrative areas; see Footnote 2). As with the other datasets, we use data on both sexes combined, and calculate the structural ageing measures and correlations at approximately decadal intervals, coinciding with a national population census (i.e., 1996, 2006, and 2018). These data also include population projections by five-year age group and sex for the same 87 territorial authorities and local boards, covering every five years from 2018 to 2048. We use the same approach as for the population estimates, calculating the structural ageing measures and correlations based on data for both sexes combined from the medium variant projections at decadal intervals (i.e., 2028, 2038, and 2048). This results in 6 correlations (and rank correlations), each based on 87 data points at the territorial authority and local board level. The estimates and projections are available for five-year age groups, with the top age group aged 85 years and over. In the calculations of each structural ageing measure, we use the mid-point of each five-year age interval, and for the open-ended age group, we use the median age (for the population aged 85 years and over) from the corresponding country-level data. Again, the results we present are relatively insensitive to small deviations from these assumptions.

Results

As an illustration of the various measures of structural ageing, Table 2 presents three examples from the country-level data for 2020. Uganda has among the youngest age distributions across all 201 countries. It has a median age of just 16.7 years, and less than 2% of the population is aged 65 years and over. It is ranked within the six youngest countries in all of the measures of structural ageing, and is the youngest based on the RMSA, with a value of 26.8. Morocco is close to the middle of all countries in terms of its age distribution. It has a median age of 29.5 years, and 7.6% of the population is aged 65 years and over. Its ranking varies from 94th (old-age dependency ratio) to 115th (proportion aged 85 years and over), and is 97th based on the RMSA, with a value of 38.2. Japan has the oldest age distribution among all countries. It has a median age of 47.5 years, with 28.4% of the population aged 65 years and over, and nearly 5% aged 85 years and over. It is ranked as the oldest country in the world for all of the measures of structural ageing and has an RMSA value of 53.3.

Table 2 Measures of structural ageing for selected countries, 2020

It is clear from Table 2 that RMSA has a similar interpretation to the mean age or median age and has a value that is of a similar size, albeit uniformly higher. This is because people at older ages are weighted more heavily in the calculation of RMSA than they are for the mean age. It is also clear from the rankings presented in Table 2 that there is a high degree of consistency among the various measures of structural ageing. Japan is ranked as the oldest country on every measure, while Uganda is ranked as the youngest country on two of the measures (RMSA and child-elder ratio), and among the six youngest on all of them. Niger is the youngest country based on mean age (20.4 years) and median age (15.2 years), United Arab Emirates has the lowest proportion of the population aged 65 years and over (1.26%) and the lowest old-age dependency ratio (0.02), and Nigeria has the lowest proportion aged 85 years and over (0.04%). The consistency in ranking provides some comfort in relation to the choice of measure by decision-makers.

Tables 3 and 4 present the correlations between RMSA and each other structural ageing measure based on Pearson correlations (Table 3) and Spearman rank correlations (Table 4), calculated using country-level data from the United Nations World Population Prospects. The full cross-correlation tables for all measures for selected years (1950, 2020, and 2100) are included in the Appendix, Tables A1 and A2 (see Footnote 3). The Spearman rank correlations are further illustrated by scatter plots of the country-level percentile ranks of RMSA and each of the other measures. These plots are presented in the Appendix for three selected years: Figures A1 to A6 (for 1950); Figures A7 to A12 (for 2020); and Figures A13 to A18 (for 2100); see Footnote 4. As is clear from Tables 3 and 4, the Pearson correlations and Spearman rank correlations between each measure and RMSA are high, ranging from 0.717 to 0.999. All of the correlations are highly statistically significant.

Table 3 Pearson correlations between RMSA and other measures of structural ageing based on country-level data 1950–2100
Table 4 Spearman rank correlations between RMSA and other measures of structural ageing based on country-level data 1950–2100

Considering the results in Tables 3 and 4, some key features are apparent. First, the correlations for population estimates (based on past age distributions) are much lower than the correlations for population projections (based on projected future age distributions). Moreover, the further into the future, the higher the correlations become. This is also illustrated in Figs. 1 and 2, which plot over time the Pearson correlations and Spearman rank correlations, respectively. It is likely that more ‘regular’ population age distributions, where there are fewer population waves of larger or smaller cohorts, lead to measures that are more closely correlated with each other. These waves appear naturally in real-world settings, driven by policy and sociocultural changes. Since the population estimates of past population age distributions are based on observed real-world data, they pick up these waves. In contrast, the parameters of population projection models are either fixed or change slowly over time. They are not typically designed to introduce waves of larger or smaller cohorts (although they may generate echoes of past waves). This leads to population age structures that are more regular than those observed in real-world data. So, it should be no surprise that the correlations between the measures of structural ageing are higher for projected data than for historical data. Moreover, the age distribution in projected years is derived from a combination of the baseline population age structure, which is itself based on real-world data, and the changes in the distribution driven by the underlying assumptions and parameters in the population projections model. Thus, as we consider projections that are further into the future, the current distribution contributes less, and the model assumptions and parameters contribute more, to the observed age distribution. 
Hence, the further into the future, the more the projections model will contribute to the measures of structural ageing. This leads to measures that appear to become more correlated over time into the future. Looking backwards, it is likely that data quality becomes more of an issue for more historical population estimates. That could lead to lower correlations between measures of structural ageing (due to measurement error in the population age distribution) or higher correlations if population estimates are based predominantly on model outputs (for the same reason that population projections lead to more regular age distributions). In both sets of correlations, the proportion of the population aged 85 years and over has the lowest correlation with RMSA. It is likely that this is the population that is subject to the greatest extent of measurement error. However, for some measures, the Spearman rank correlations (Table 4; Fig. 2) are higher in earlier years, where modelling is probably required to derive the estimated population age distributions for a greater proportion of countries. Thus, it is likely that both effects are occurring simultaneously within these data.

Fig. 1 Pearson correlations between RMSA and other measures of structural ageing based on country-level data 1950–2100

Publication note: Fig. 1 was created in Microsoft Excel

Fig. 2 Spearman rank correlations between RMSA and other measures of structural ageing based on country-level data 1950–2100

Publication note: Fig. 2 was created in Microsoft Excel

Second, comparing Tables 3 and 4, for most measures the Spearman rank correlation (Table 4) is higher than the corresponding Pearson correlation (Table 3). This reflects that the relationship between the measures is not linear, as discussed earlier. Third, it is clear that mean age has the highest correlation with RMSA of all of the other measures of structural ageing. It has a Pearson correlation with RMSA that ranges from 0.993 to 0.999, and a Spearman rank correlation that ranges from 0.975 to 0.998. Recall that mean age and RMSA are the only measures that consistently satisfy the weak dominance axiom, while all other measures only obey the axiom in some specific circumstances. Moreover, mean age is a special case of our generalised structural ageing index, where α = 1. The other measures are not as closely related to RMSA, and this is reflected in lower correlations. The relative ranking of the other measures in terms of their correlations with RMSA is not consistent between Pearson and Spearman rank correlations, or over time, with the exception of the proportion of the population aged 85 years and over, which is for the most part the least correlated with RMSA. This suggests that there is little to choose between these other measures, and they are highly correlated with each other (see Appendix, Tables A1 and A2).
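
The gap between the two correlation types can be illustrated with a minimal sketch. The values below are hypothetical, not taken from Tables 3 or 4, and the exponential link between mean age and the proportion aged 85 years and over is an assumption made purely to obtain a monotone but non-linear relationship. Under that assumption, the rank correlation is perfect while the Pearson correlation is not.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical areas: mean age, and a share aged 85+ that rises
# exponentially with mean age (an illustrative assumption only).
mean_age = [20.0, 25.0, 30.0, 35.0, 40.0, 45.0]
share_85_plus = [0.001 * math.exp(0.12 * (a - 20.0)) for a in mean_age]

r_pearson = pearson(mean_age, share_85_plus)
r_spearman = spearman(mean_age, share_85_plus)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

Because the Spearman rank correlation depends only on the ordering of areas, any strictly increasing relationship yields a value of one, whereas the Pearson correlation is reduced by curvature.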

Finally, despite the high correlations between the measures, there are some countries in some periods where the measures lead to disparate rankings. For example, in 2020 Kuwait is ranked 63rd based on median age (36.8 years) but is ranked 200th based on the proportion aged 85 years and over (0.05%). Similarly, the United Arab Emirates is ranked 81st based on median age (32.6 years), but as noted earlier is ranked 201st and last based on the proportion aged 65 years and over and the old-age dependency ratio. Those discrepancies, along with similar differences for Bahrain, Qatar, Oman, and Saudi Arabia, reflect the large and relatively young temporary migrant worker populations in those countries. However, it is not just the Gulf states where large discrepancies in rankings exist. The Maldives is ranked 101st for median age (29.9 years) but is ranked 189th for old-age dependency ratio (0.05). And relatively large discrepancies are not just a feature of low-income and middle-income countries. Sweden is ranked 15th based on old-age dependency ratio (0.33) but is ranked 41st based on median age (41.1 years). Similar discrepancies are observable across all years of the data, both past estimates and future projections.

It is clear that the various measures of structural ageing are closely correlated, but imperfectly so. Countries are over-ranked or under-ranked by some measures both relative to RMSA, and relative to other measures. However, the analysis above relies on cross-country data, and resource allocation decisions are rarely made between countries. That makes cross-country comparisons less relevant to most decision-makers. Accordingly, we now turn to comparisons based on subnational data.

Table 5 presents the Pearson and Spearman rank correlations between RMSA and each other structural ageing measure, calculated using state-level population estimates data from the US Census Bureau. The full cross-correlation tables for all measures for selected years (1980, 2000, and 2018) are included in the Appendix, Tables A3 and A4 (see Footnote 5). The results are not as straightforward as those based on the international data. As with the international data, mean age has the highest correlations with RMSA. However, the correlations are generally lower than those observed in the international data, and the pattern of gradual increases in the correlations over time is repeated only for mean age and median age, and not for other measures. In fact, the correlations between the proportion of the population aged 85 years and over and RMSA actually decrease over time. Also, the Spearman rank correlations are smaller than the Pearson correlations for some of the measures. This suggests that it cannot be taken for granted that measures of structural ageing will become more highly correlated over time.

Table 5 Pearson and Spearman rank correlations between RMSA and other measures of structural ageing based on US state-level data 1980–2018

However, as with the cross-country data, there are wide differences in the rankings of some states across different measures. For example, in 2018 New York is ranked eighth based on the proportion of the population aged 85 years and over (2.50%) but is ranked 47th based on median age (35.4 years). South Carolina is ranked ninth based on old-age dependency ratio (0.28) but is ranked 41st based on the proportion of the population aged 85 years and over (1.76%). In contrast, Utah is ranked the youngest or second-youngest state on every measure, while Florida is ranked among the five oldest states on every measure.

Like the cross-country data, the US state-level data highlights the close (although not as close as for the cross-country data) correlations between the various measures of structural ageing. However, the comparisons are somewhat hampered by being based only on historical population estimates data. Most decision-makers will instead look at data based on projected future age distributions. Accordingly, we now apply the same comparisons to subnational data where both past estimates and future projections are available from the same source.

Table 6 presents the Pearson and Spearman rank correlations between RMSA and each other structural ageing measure, calculated using population estimates and projections data for territorial authorities and local boards in New Zealand, from Statistics New Zealand. The full cross-correlation tables for all measures for selected years (1996, 2018, and 2048) are included in the Appendix, Tables A5 and A6 (see Footnote 6). The results fall somewhere in between those presented earlier based on country-level and US state-level data. The correlations are lower than those observed in the international data, but higher than those in the state-level data. However, unlike either of the other data sources, there is little systematic difference between the Pearson correlations and the Spearman rank correlations in the New Zealand data. Looking at changes over time, there is no apparent substantial increase in the correlations in the historical population estimates data, but the correlations do increase over time in the projected data. This supports the earlier assertions that population projections models lead to age distributions that are more regular, and that measurement error likely contributes to lower correlations in the more historical international data. As with both other data sources, mean age has the highest correlation with RMSA, while the proportion of the population aged 85 years and over is in most instances the least correlated with RMSA.

Table 6 Pearson and Spearman rank correlations between RMSA and other measures of structural ageing based on New Zealand territorial authority and local board data 1996–2048

However, as with the other data sources, there are wide differences in the rankings of some territorial authorities and local boards across different measures. For example, in 2018 the Aotea/Great Barrier local board area is ranked second based on mean age (47.1 years), median age (52.3 years), child-elder ratio (0.40), and RMSA (52.3 years), but 79th based on the proportion of the population aged 85 years and over (1.0%). This probably reflects the relative remoteness of that local board area from the tertiary hospital care that the oldest people may need to access. The Waiheke local board area is similar. However, unlike Aotea/Great Barrier and Waiheke, the Waitemata local board area is located on the mainland and in central Auckland, and yet is ranked as the youngest area based on the proportion of the population aged 65 years and over (7.9%), the proportion of the population aged 85 years and over (0.6%), and the old-age dependency ratio (0.10), but 32nd based on the child-elder ratio (0.78). That reflects that central city urban living in New Zealand is currently attractive for relatively young urban professionals and students but is less attractive for both older people and young families. In contrast, Thames-Coromandel District is ranked the oldest area on every measure except one (the proportion aged 85 years and over), while the Otara-Papatoetoe local board area is ranked among the three youngest areas on every measure, and the Mangere-Otahuhu local board area is ranked among the four youngest areas on every measure.

Discussion

Our results should provide both comfort and concern to end users of structural ageing measures. The high correlations between the measures, and especially between our preferred RMSA measure and other measures, suggest that for the most part, the consequences of the choice of structural ageing measure are unlikely to be seriously negative. This will particularly be the case where the numerical value of the structural ageing measure, and not the relative ranking of the area, is the primary concern. For example, in quantitative applications, the use of median age or the proportion of the population aged 65 years and over is unlikely to seriously bias estimated model coefficients. However, the substantial disparity in ranking for some countries or areas between different measures of structural ageing could lead to serious misallocations of resources, particularly when those resources are targeted or allocated on the basis of a ranking on current or projected future levels of structural ageing. Policymakers could be misled to believe that particular areas are projected to age more rapidly than others based on the conventional measures of structural ageing, where that might not necessarily be the case based on an axiomatically-consistent measure. For example, this might have consequences for the quantity and quality of community-based and home-based aged care services (Hunter et al., 2019), the provision or rollout of ‘age-friendly’ infrastructure or services (O’Brien, 2014), or the priority attached to the development of ‘ageing-in-place’ strategies (Heumann and Boldy, 1993). Misallocation or mis-targeting of resources in these areas could have significant consequences for the future quality of life of older people.

A key concern about many of the measures of structural ageing is that they are based on particular age thresholds, the thresholds are essentially arbitrary and based on historical ‘accidents’ (Costa, 1998), and the calculation of the measures is discontinuous at those thresholds (Lutz et al., 2008a; Spijker, 2015). This applies even to recently proposed measures that share inspiration from the analysis of income distributions, such as the optimal grouping approach based on Lorenz Curves proposed by D’Albis and Collard (2013). This leads to structural ageing measures that ignore parts of the age distribution in their calculation (which is why those measures do not satisfy one or more of our proposed axioms). Our preferred axiomatically-consistent class of structural ageing indices ensures that differences at any point in the age distribution are adequately accounted for, and especially differences in the upper tail of the age distribution, which are most consequential in terms of their resource implications for policymakers and other decision-makers. Nevertheless, structural ageing measures with discontinuous thresholds will remain relevant where the thresholds correspond to important policy-relevant ages, such as the age of eligibility for old-age pensions or health care subsidies.
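
A minimal sketch of the threshold problem follows. It assumes that RMSA is the root mean square of age across the population, and it uses two hypothetical populations that differ only in how old their older members are. The function names and the age counts are illustrative assumptions, not data from this paper.

```python
import math

def share_at_or_above(counts, threshold):
    """Proportion of the population at or above an age threshold."""
    total = sum(counts.values())
    return sum(n for age, n in counts.items() if age >= threshold) / total

def rmsa(counts):
    """Root mean square age, assuming RMSA = sqrt(sum(n * age^2) / sum(n))."""
    total = sum(counts.values())
    return math.sqrt(sum(n * age ** 2 for age, n in counts.items()) / total)

# Hypothetical populations as {age: count}. They are identical below age 65;
# the older group is aged 67 in population A and 87 in population B.
pop_a = {10: 300, 40: 500, 67: 200}
pop_b = {10: 300, 40: 500, 87: 200}

for name, pop in (("A", pop_a), ("B", pop_b)):
    print(name,
          f"share 65+ = {share_at_or_above(pop, 65):.2f}",
          f"RMSA = {rmsa(pop):.1f}")
```

The proportion aged 65 years and over is the same for both populations, because movement within the group above the threshold is invisible to it, while the RMSA of population B is higher.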

The optimal degree of age sensitivity built into the measure is one aspect that we have not explored in this paper. For simplicity, we employed α = 2, leading to the RMSA measure. Alternative values of α may be desirable in different applications, and different decision-makers, with different preferences or use cases, may exhibit different degrees of age sensitivity. We leave further exploration of the optimal degree of age sensitivity in different applications for future research. However, we do note that the measure that consistently has the highest correlation with RMSA is mean age, which is in itself a special case within the axiomatically-consistent class of structural ageing indices, with α = 1. Given that mean age is easy for decision-makers to understand and interpret, this measure may be preferred in spite of it not satisfying the age sensitivity axiom.
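
The role of α can be sketched numerically. The code below assumes that the generalised index takes a power-mean form, the α-th root of the mean of age raised to the power α, which reproduces mean age at α = 1 and RMSA at α = 2. The function name `ageing_index` and the age counts are hypothetical.

```python
def ageing_index(counts, alpha):
    """Power mean of age with exponent alpha (assumed form of the index).

    alpha = 1 gives mean age; alpha = 2 gives the root mean square age.
    """
    total = sum(counts.values())
    mean_power = sum(n * age ** alpha for age, n in counts.items()) / total
    return mean_power ** (1 / alpha)

# Hypothetical population as {age: count}.
population = {5: 250, 25: 300, 45: 250, 65: 150, 85: 50}

for alpha in (1, 2, 4):
    print(f"alpha = {alpha}: index = {ageing_index(population, alpha):.2f}")
```

Under this assumed form, larger values of α place more weight on older ages, so the index does not fall as α rises for a given population.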

There have been two under-appreciated developments in the measurement of ageing that have increasingly gained prominence in recent years. The first is a proposal to reverse the measurement of ageing from the number of years of life completed to the number of years of life remaining, which dates to Ryder (1975). These ‘prospective ageing measures’ (Sanderson & Scherbov, 2007) offer the advantage of better capturing population ageing in a context of an increasing number of years of healthy life, whereby chronological age fails to capture changes in the physical and mental capacities of older people that relate to their health over the entire life course, as well as important implications in relation to social security and health expenditures (Fuchs, 1984; Shoven, 2007). However, they have been subject to criticism due to uncertainty in life expectancy and sensitivity to proportional rescaling (D’Albis and Collard, 2013). The second and related development is a change in focus in the field of health from measuring ‘chronological ageing’ in terms of completed years since birth, to ‘functional ageing’, which incorporates physical or cognitive functioning, and better captures the ability of older people to engage in activities of daily living (Guralnik & Melzer, 2002; Skirbekk et al., 2019). Aligned to this are the concepts of active life expectancy or healthy life expectancy (Katz et al., 1983; Robine & Ritchie, 1991), disability-free life expectancy (Sanderson & Scherbov, 2010; Manton & Gu, 2001), and, more recently, ageing measures based on α-ages, which can be computed from characteristics of the population, such as remaining healthy life expectancy (e.g. see Sanderson & Scherbov, 2013; Sanderson et al., 2016), and measures based on cognitive ageing (Skirbekk et al., 2011) or biological ageing (Skirbekk et al., 2019).

To date, prospective ageing measures have been rarely applied. Sanderson and Scherbov (2005) applied the concept of the standardised median age – the median number of expected years of remaining life. Sanderson & Scherbov (2007) applied the standardised median age (which they now termed prospective median age), along with the prospective old age dependency ratio – the ratio of the number of people at a ‘prospective age’ of 65 years or older to the number of people aged between 20 years and the ‘prospective age’ of 65 years. Similarly, Sanderson and Scherbov (2008) computed a measure of the proportion of the population with 15 years or less of remaining life expectancy as a prospective ageing measure (see also Spijker & MacInnes, 2013; Spijker et al., 2014; Scherbov et al., 2016; Sanderson et al., 2017; Gietel-Basten & Scherbov, 2019). When a threshold or median is employed in these measures, even where that threshold changes over time to account for increasing life expectancy, they will generally fail to satisfy the weak dominance axiom, just as their chronological age equivalents do. However, in principle there is no reason why the mean number of years of life remaining could not be employed, which would meet all of the axioms other than age sensitivity. Indeed, the broader class of structural ageing indices that we have introduced in this paper could easily be extended to the measurement of prospective ageing. The combination of these two approaches would be a welcome advance, leveraging the theoretical advantages of both. We leave this as a fruitful avenue for future work in this area.

We based our class of indices of structural ageing on four underlying axioms – population size invariance, strong dominance, weak dominance, and age sensitivity – along with easy interpretability. By construction, our class of indices meets the four axioms, and because the indices are similar in size to the median age and mean age (which is itself a special case of our broader class of measures), we expect that they share the property of easy interpretability (unlike the index developed by Chu (1997) based on a similar set of axioms). However, our proposed axioms are not the only properties that end users may desire in a theoretically valid measure of structural ageing. Group-wise additivity or decomposability may also be desirable, whereby the measure of structural ageing for the population as a whole is a weighted sum of the measures for important population subgroups (such as male/female, or regions within a country). Our class of structural ageing indices is not decomposable in this way. There exist many other concave functions of age that would satisfy the four axioms (including age sensitivity) and be decomposable. However, those alternatives are not as readily interpretable as our indices, and thus it may be difficult to encourage their use among policymakers and other decision-makers. Further exploration of these issues is desirable.
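
The decomposability point can be illustrated with a small numerical check. The sketch assumes that RMSA is the root mean square of age, and the two regional populations are hypothetical. It compares each whole-population measure with the population-weighted sum of the corresponding subgroup measures.

```python
import math

def mean_age(counts):
    """Population-weighted mean age for {age: count} data."""
    total = sum(counts.values())
    return sum(n * age for age, n in counts.items()) / total

def rmsa(counts):
    """Root mean square age, assuming RMSA = sqrt(sum(n * age^2) / sum(n))."""
    total = sum(counts.values())
    return math.sqrt(sum(n * age ** 2 for age, n in counts.items()) / total)

def combine(*groups):
    """Merge several {age: count} subgroups into one population."""
    merged = {}
    for group in groups:
        for age, n in group.items():
            merged[age] = merged.get(age, 0) + n
    return merged

# Hypothetical regions within one country, as {age: count}.
north = {10: 400, 40: 400, 70: 200}
south = {10: 100, 40: 300, 70: 600}
whole = combine(north, south)

# Population shares of each region, used as the weights.
size = {name: sum(g.values()) for name, g in (("north", north), ("south", south))}
w_north = size["north"] / (size["north"] + size["south"])
w_south = 1.0 - w_north

weighted_mean = w_north * mean_age(north) + w_south * mean_age(south)
weighted_rmsa = w_north * rmsa(north) + w_south * rmsa(south)
weighted_sq = w_north * rmsa(north) ** 2 + w_south * rmsa(south) ** 2

print(f"mean age: whole = {mean_age(whole):.2f}, weighted sum = {weighted_mean:.2f}")
print(f"RMSA:     whole = {rmsa(whole):.2f}, weighted sum = {weighted_rmsa:.2f}")
print(f"RMSA^2:   whole = {rmsa(whole) ** 2:.1f}, weighted sum = {weighted_sq:.1f}")
```

In this example the weighted sum of regional RMSA values falls short of the national RMSA, whereas the mean of squared age (before the root is taken) does aggregate as a weighted sum. The latter is one example of a decomposable function of age that is harder to interpret, since it is expressed in squared years.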

The measurement of structural ageing is important, and as the global population ages, accurate and theoretically valid measurement will continue to increase in importance. We hope that our proposed class of axiomatically-consistent measures of structural ageing will contribute to advancing the state of measurement in this important area of research and policy.