Abstract
Interval sampling and two-phase sampling have both been advocated for studying rare failure outcomes. Apart from a few exceptions that focus on specific designs such as the case-cohort design, the two are usually studied separately in the statistical literature and require different estimation procedures. We consider efficient estimation for interval-censored data collected under a two-phase sampling design using a localized nonparametric likelihood. An expectation-maximization algorithm is proposed that exploits multiple layers of data augmentation to handle the transformation function, interval censoring, and the two-phase sampling structure simultaneously. We study the asymptotic properties of the estimators and conduct inference using the profile likelihood. We illustrate the performance of the proposed estimator through simulations and an application to an HIV vaccine trial.
Acknowledgements
The authors thank Drs. Peter Gilbert and Yunda Huang, the HIV Vaccine Trials Network, and Merck for providing the Step 502/Merck 023+HVTN 504 data, which was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under the U.S. Public Health Service Grant AI068635 (Statistical and Data Management Center for the HIV Vaccine Trials Network). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This work was supported by the U.S. National Institutes of Health Grants R01HL122212 and R37AI029168.
About this article
Cite this article
Gao, F., Chan, K.C.G. Efficient Estimation of Semiparametric Transformation Model with Interval-Censored Data in Two-Phase Cohort Studies. Stat Biosci 16, 203–220 (2024). https://doi.org/10.1007/s12561-023-09392-8
DOI: https://doi.org/10.1007/s12561-023-09392-8