Introduction

Generic health-related quality of life instruments—measures of self-perceived health status designed to be applicable across diverse groups of respondents, regardless of disease or demographics—have a multitude of applications in research and clinical care. These include as indicators of disease burden and health outcome, as case-mix indicators for risk adjustment, and at the individual patient-level for informing care [1,2,3,4,5]. There is no gold standard of measurement for health-related quality of life, whether generic or disease-specific. For this reason, it is important that instrument-specific resources exist to assist with the interpretation of data. An example of such a resource is the establishment of reference data for population norms, which summarize health outcomes in the general population and across demographic and clinical subgroups. The availability of population norms enables researchers to compare estimates from study samples with a population reference. Such data are useful in multiple contexts, including tracking changes in health outcomes over time [6,7,8,9], comparing populations with different health conditions [10], comparing between countries [11, 12], and modelling comparison groups in the evaluation of health technologies in the absence of primary data [13].

The Veterans RAND 12-item Health Survey (VR-12) and associated measures are among the most widely used instruments for assessing respondents’ (patients and non-patients) perceptions of health domains relevant to their quality of life [5, 14]. The VR-12 was developed from the Veterans RAND 36-item Health Survey, which was itself adapted from the original RAND version of the 36-item Health Survey version 1.0 (also known as the MOS SF-36) [15, 16]. Responses to the VR-12 can be used to generate (i) health utility values, (ii) summary component scores for mental health and physical health, and (iii) scores for each of the instrument’s eight domains. (Further details about the VR-12 are provided in the Methods section.) Population norms for the VR-12 have been described using data from over 170,000 respondents to the Medicare Health Outcomes Survey in the United States (US) [16] and a sample of 500 Chinese adults in Hong Kong [17]. There are no Canadian population norms for the VR-12. Previous studies have published Canadian population norms for health utility scores generated using other instruments, including the Health Utilities Index Mark 3 (HUI3) [13], EQ-5D-3L (Alberta only) [18], and EQ-5D-5L [19, 59].

In our cross-sectional survey, we found that people in older age groups provide VR-12 responses that correspond with higher mean health utility values. Previous studies have found different patterns with respect to age, including an age-related monotonic decline in health utility values for the SF-6D [50] and AQoL-4D [60] instruments, and a u-shaped utility curve for the AQoL-6D and AQoL-8D instruments [22]. The differences in these trajectories may reflect the items captured by each instrument. Evidence suggests that instruments with a higher proportion of items related to physical health tend to exhibit a monotonic decline, while those with a higher proportion of items related to mental health tend to exhibit a u-shaped curve, given that older age groups tend to report higher mental health scores [22, 37, 61]. In our study, higher scores in the mental health and social functioning domains among older Canadians seem to be driving the higher health utility scores at older ages. There was some evidence of lower physical health scores in older age groups for the bodily pain, physical functioning, and role physical domains; however, these differences were modest in comparison with the age-related increase in scores on the mental health domains.

We observed that females reported lower scores than males. This finding was consistent across health utility values, summary component scores, and domain scores. A study of Medicare beneficiaries in the US that reported norms for the VR-12 also found that females reported lower mean PCS and MCS scores than males [16]. This pattern has also been reported for other instruments [22, 23, 37]. It is possible that men and women with the same health status interpret and respond differently to the items (known as differential item functioning). However, a 2017 study of the VR-12, which included over 270,000 respondents, concluded there was no evidence of differential item functioning at the domain level across genders [62]. There were statistically significant differences for some items, although the magnitudes of these differences were considered negligible. This suggests that the observed gender differences may reflect differences in health status. A 2016 systematic review of self-reported health data from 59 countries found that women systematically reported lower health and functional status than men [63]. This effect was present across all age groups and in both low- and high-income countries. The authors suggested that both societal (e.g., gender inequality in employment and education) and biological (e.g., higher rates of chronic conditions in females) factors may have contributed to the findings [63].

To facilitate the use of the VR-12 norms, we have created an interactive website (https://vr12.cheos.ubc.ca/home). This website includes downloadable comma-separated values files of the health utility score, summary component score, and domain score norms by gender, age group, and province. The website also allows users with respondent-level VR-12 data to calculate VR-12 health utility scores, summary component scores, and domain scores; estimate EQ-5D-5L health utility scores based on a mapping (or ‘crosswalk’) algorithm; and obtain normative VR-12 data for their sample that are adjusted for age, gender, and geographic location.

This study has several strengths and limitations. First, our data come from a large sample of nearly 7000 respondents who were selected to be representative of the Canadian population based on age, gender, and geographic location. To further enhance the representativeness of our sample, we used a raking procedure to weight responses to the 2021 Canadian census and 2017/18 Canadian Community Health Survey based on gender, province, education, age group, and health status. A 2010 study in the US compared online and telephone-based administration for estimating population norms for the PROMIS instrument and found that online administration with post-sampling adjustment resulted in a sample comparable to probability-based general population samples [64]. In our study, adjustment resulted in the sample more closely reflecting the characteristics of the Canadian population, particularly with respect to education and health status, although some imbalances remained. For example, our weighted sample overrepresented those who identify as ‘White’ and underrepresented selected groups, including those who identify as ‘South Asian’ and ‘Black,’ and those in the lowest household gross income bracket. Our sampling procedure, which used an online market research panel, may have resulted in other populations being underrepresented in our sample, such as those without access to a computer or new immigrants. In addition, survey respondents were able to choose to complete the survey in either English or French. Consequently, caution is necessary when generalizing these findings to populations that are underrepresented in our sample or to samples where the VR-12 was administered in only one language.
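For readers unfamiliar with raking (iterative proportional fitting), the general idea can be sketched as follows. This is a minimal illustration only and is not the implementation used in this study: the function name `rake`, the toy sample, and the target proportions are all hypothetical and do not correspond to census or Canadian Community Health Survey figures. The sketch also assumes that every category named in the targets appears at least once in the sample.

```python
from collections import defaultdict


def rake(respondents, targets, max_iter=200, tol=1e-10):
    """Return one weight per respondent so that weighted margins match targets.

    respondents: list of dicts mapping variable name -> category
    targets: dict mapping variable name -> {category: population proportion}
    Weights are scaled to sum to the number of respondents.
    """
    weights = [1.0] * len(respondents)
    for _ in range(max_iter):
        max_change = 0.0
        # Adjust the weights to match one margin at a time.
        for var, shares in targets.items():
            totals = defaultdict(float)
            for person, w in zip(respondents, weights):
                totals[person[var]] += w
            total_weight = sum(weights)
            for i, person in enumerate(respondents):
                category = person[var]
                # Ratio of the target weighted count to the current weighted count.
                factor = shares[category] * total_weight / totals[category]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        # Stop once a full pass over all margins leaves the weights unchanged.
        if max_change < tol:
            break
    return weights


# Hypothetical toy sample and hypothetical population margins.
sample = [
    {"gender": "F", "age": "18-44"},
    {"gender": "F", "age": "45+"},
    {"gender": "M", "age": "18-44"},
    {"gender": "M", "age": "18-44"},
    {"gender": "M", "age": "45+"},
]
targets = {
    "gender": {"F": 0.5, "M": 0.5},
    "age": {"18-44": 0.6, "45+": 0.4},
}
weights = rake(sample, targets)
```

Each pass rescales the weights so that one weighted margin matches its target, then moves to the next margin; repeating the passes brings all margins into agreement simultaneously. Raking aligns only the variables included in the targets, which is consistent with imbalances remaining on characteristics that were not raked on.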

Conclusion

We have estimated Canadian VR-12 population norms for health utility scores, summary component scores, and domain scores in a large, nationally representative population sample. The health utility norms, which are reported by age group, gender, and region, can serve as a valuable input for Canadian economic models, particularly those involving subgroup analyses. The norms for summary component scores and domain scores provide a reference standard that allows routinely collected data to be interpreted in the context of the Canadian population.