Abstract
Purpose
Using the lens of classical test theory, we examine how well a linkage generalizes to multivariable analyses, including multiple regression and structural equation modeling, rather than to comparisons of established subpopulations, which are the most common focus in the literature.
Methods
To aid in this evaluation, we present a statistical method based on structural equation modeling for examining the suitability of a given linkage for use cases involving continuous and categorical variables external to the linkage itself.
Results
Using the PROMIS® Parent Proxy and Early Childhood Global Health measures, we show that, although a high correlation between the scores (here, r = .829) may imply general suitability for linking, a more detailed investigation of content, measurement structure, and results from the proposed methodology reveals important differences between the measures that can compromise interchangeability in certain use cases.
Conclusion
In addition to evaluating the statistical quality of a linkage, users of linking methodology should assess whether the linkage is appropriate for their particular use cases.
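The point made in the Results, that a high correlation between two scores does not by itself guarantee interchangeability with respect to an external variable, can be illustrated with a small simulation. This is only a hedged sketch and not the structural equation modeling method proposed in the article: the data-generating model, the loadings (0.3 and 0.5), and the names `score_a`, `score_b`, and `external` are all assumptions chosen for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n=5000, seed=0):
    """Simulate two scores that share a common factor (assumed model).

    score_a = common factor + noise
    score_b = common factor + 0.5 * external variable + noise
    Only score_b carries variance related to the external variable.
    """
    rng = random.Random(seed)
    score_a, score_b, external = [], [], []
    for _ in range(n):
        common = rng.gauss(0, 1)  # what both measures share
        z = rng.gauss(0, 1)       # variable external to the linkage
        score_a.append(common + 0.3 * rng.gauss(0, 1))
        score_b.append(common + 0.5 * z + 0.3 * rng.gauss(0, 1))
        external.append(z)
    return score_a, score_b, external


score_a, score_b, external = simulate()
r_ab = pearson(score_a, score_b)   # correlation between the two scores
r_az = pearson(score_a, external)  # score_a with the external variable
r_bz = pearson(score_b, external)  # score_b with the external variable
print(f"r(a, b) = {r_ab:.3f}, r(a, z) = {r_az:.3f}, r(b, z) = {r_bz:.3f}")
```

Under these assumed loadings, the population correlation between the two scores is about .83, yet only `score_b` is related to the external variable (population correlation of about .43, versus 0 for `score_a`). Substituting one score for the other in a regression on that variable would therefore change the conclusions.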
Data availability
Crosswalk tables are publicly available at the APA OSF repository (https://osf.io/nf8bx/). All measures used can be obtained from the HealthMeasures website (healthmeasures.net) or were adapted from the versions available there, with modifications described in the manuscript. Study data and analysis code are not publicly available.
Notes
The actual item parameters used for PROMIS scoring are proprietary and could not be included. To obtain these parameters, contact HealthMeasures.net. To mask the actual values, we perturbed them by adding noise generated, separately for each item, from a uniform distribution with limits of −.25 and .25.
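The masking step described in this note can be sketched as follows. This is a minimal illustration, not the authors' code. The function name `perturb_item_parameters`, the dictionary layout, and the example values are hypothetical, and the sketch assumes a fresh noise draw for every parameter of every item, whereas the note states only that noise was generated separately for each item.

```python
import random


def perturb_item_parameters(items, limit=0.25, seed=None):
    """Return a masked copy of `items` with uniform noise added.

    `items` maps an item name to a list of numeric parameters.
    Assumption: each parameter receives its own independent draw
    from a uniform distribution on [-limit, limit].
    """
    rng = random.Random(seed)
    return {
        name: [value + rng.uniform(-limit, limit) for value in params]
        for name, params in items.items()
    }


# Hypothetical placeholder parameters, NOT actual PROMIS values.
example_items = {
    "item_1": [1.50, -1.00, 0.00, 1.00],
    "item_2": [2.10, -0.50, 0.40, 1.30],
}

masked = perturb_item_parameters(example_items, seed=1)
for name, params in masked.items():
    print(name, [round(p, 3) for p in params])
```

Because the noise is bounded, every masked value stays within .25 of the value it replaces, so the masked parameters remain roughly representative without disclosing the proprietary ones.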
Acknowledgements
Research reported in this publication was supported by the Environmental influences on Child Health Outcomes (ECHO) program, Office of The Director, National Institutes of Health, under Award Number U24OD023319 with co-funding from the Office of Behavioral and Social Sciences Research (OBSSR; Person Reported Outcomes Core). We have no conflicts of interest to disclose.
Author information
Contributions
All authors contributed to the study conception and design. All analyses and summarizations of results were performed by MM, who also wrote the first draft of the manuscript. All authors contributed to proofreading and improving the manuscript to its final form, and all authors reviewed and approved the final manuscript.
Ethics declarations
Competing interests
We have no conflict of interest to disclose.
Ethical approval
This study was performed in line with the principles of the Declaration of Helsinki. We received approval from Northwestern University’s Institutional Review Board. Study materials and data are not publicly available.
Consent to participate
All participants provided informed consent to participate in this research.
Consent to publication
No individual person’s data are presented, only aggregated results across participants.
About this article
Cite this article
Mansolf, M., Blackwell, C.K., Cella, D. et al. Assessing the interchangeability of linked scores in multivariable statistical analyses. Qual Life Res 33, 1121–1131 (2024). https://doi.org/10.1007/s11136-023-03592-x
DOI: https://doi.org/10.1007/s11136-023-03592-x