On High-Dimensional Covariate Adjustment for Estimating Causal Effects in Randomized Trials with Survival Outcomes

Statistics in Biosciences

Abstract

This work aims to improve efficiency in estimating the average causal effect (ACE) on the survival scale when outcomes are right-censored and high-dimensional covariate information is available. We propose new estimators that use regularized survival regression and survival random forests (RF) to adjust for the high-dimensional covariates. We study the behavior of the adjusted estimators under mild assumptions and show that, when RF is used for the adjustment, the proposed estimators are asymptotically more efficient than the unadjusted ones. In addition, the adjusted estimators are \(\sqrt{n}\)-consistent and asymptotically normally distributed. We study the finite-sample behavior of our methods by simulation, and the simulation results agree with the theoretical results. We also illustrate our methods by analyzing real data from transplant research to assess the relative effectiveness of identical sibling donors compared with unrelated donors, adjusting for cytogenetic abnormalities.
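As a point of reference for the "unadjusted" estimator mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the ACE on the survival scale is the difference in survival probabilities between arms at a fixed time t, and it uses a plain Kaplan-Meier estimate within each arm. The function names `km_survival` and `unadjusted_ace` are hypothetical, and the covariate-adjusted estimators proposed in the paper are not reproduced here.

```python
import numpy as np


def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) = P(T > t) from right-censored data.

    time  : observed follow-up times (event or censoring)
    event : 1/True if the event was observed, 0/False if censored
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Product of (1 - d_j / n_j) over distinct observed event times u_j <= t.
    for u in np.unique(time[event & (time <= t)]):
        at_risk = np.sum(time >= u)            # n_j: still under observation at u_j
        deaths = np.sum((time == u) & event)   # d_j: events exactly at u_j
        surv *= 1.0 - deaths / at_risk
    return surv


def unadjusted_ace(time, event, arm, t):
    """Assumed estimand: S_1(t) - S_0(t), estimated by Kaplan-Meier in each arm.

    arm : 1 for treated, 0 for control (randomized assignment).
    """
    time, event, arm = map(np.asarray, (time, event, arm))
    treated = arm == 1
    s1 = km_survival(time[treated], event[treated], t)
    s0 = km_survival(time[~treated], event[~treated], t)
    return s1 - s0
```

Because treatment is randomized, this contrast needs no covariates for validity. Per the abstract, the role of adjustment is to reduce the variance of the estimate.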

Data Availability

The data that support the findings of this study are available from the Center for International Blood & Marrow Transplant Research (CIBMTR), but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of CIBMTR.

Code Availability

All code for the simulation and data analysis associated with the current submission is available upon request via email to the corresponding author.

Funding

R.D. and C.Z. were partially supported by the National Institutes of Health via grant 2U54GM115458-06. M.Z. was partly supported by NCATS 12-303G-UN.

Author information

Corresponding author

Correspondence to Cheng Zheng.

Ethics declarations

Conflict of interest

All authors declare that there are no relevant financial or non-financial competing interests to report.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 5284 kb)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dai, R., Zheng, C. & Zhang, MJ. On High-Dimensional Covariate Adjustment for Estimating Causal Effects in Randomized Trials with Survival Outcomes. Stat Biosci 15, 242–260 (2023). https://doi.org/10.1007/s12561-022-09358-2

  • DOI: https://doi.org/10.1007/s12561-022-09358-2
