Abstract
This work aims to improve the efficiency of estimating the average causal effect (ACE) on the survival scale when outcomes are right-censored and high-dimensional covariate information is available. We propose new estimators that use regularized survival regression and survival random forests (RF) to adjust for the high-dimensional covariates. We study the behavior of the adjusted estimators under mild assumptions and show that, when RF is used for the adjustment, the proposed estimators are asymptotically more efficient than the unadjusted ones. In addition, the adjusted estimators are \(\sqrt{n}\)-consistent and asymptotically normally distributed. We study the finite-sample behavior of our methods by simulation, and the simulation results agree with the theoretical results. We also illustrate our methods by analyzing real data from transplant research to identify the relative effectiveness of identical sibling donors compared with unrelated donors, adjusting for cytogenetic abnormalities.
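For orientation, the unadjusted comparison that covariate adjustment aims to improve on can be sketched as a difference of Kaplan–Meier survival probabilities between the two arms at a fixed time point. The sketch below is illustrative only. It assumes this Kaplan–Meier form for the unadjusted estimator, and the function names `km_survival` and `unadjusted_ace`, the 0/1 coding of `arm`, and the toy data are hypothetical rather than taken from the authors' code.

```python
import numpy as np


def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) = P(T > t) from right-censored data.

    time  : observed follow-up times (event or censoring, whichever came first)
    event : 1 if the event was observed, 0 if the observation was censored
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Multiply the conditional survival factors at each distinct event time <= t.
    for u in np.unique(time[event & (time <= t)]):
        at_risk = np.sum(time >= u)            # subjects still under observation at u
        deaths = np.sum((time == u) & event)   # events occurring exactly at u
        surv *= 1.0 - deaths / at_risk
    return surv


def unadjusted_ace(time, event, arm, t):
    """Difference in Kaplan-Meier survival at time t: arm 1 minus arm 0.

    Assumption: arm is coded 1 for treated and 0 for control (hypothetical coding).
    """
    time, event, arm = map(np.asarray, (time, event, arm))
    treated = arm == 1
    s1 = km_survival(time[treated], event[treated], t)
    s0 = km_survival(time[~treated], event[~treated], t)
    return s1 - s0


# Toy data (made up for illustration): four subjects per arm.
toy_time = [2, 4, 6, 8, 1, 3, 5, 7]
toy_event = [1, 0, 1, 1, 1, 1, 0, 1]
toy_arm = [1, 1, 1, 1, 0, 0, 0, 0]
toy_ace = unadjusted_ace(toy_time, toy_event, toy_arm, t=5)
```

The adjusted estimators described in the abstract would replace this plain two-arm comparison with one that also uses the covariates. That step is not reproduced here because the abstract does not specify its form.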
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs12561-022-09358-2/MediaObjects/12561_2022_9358_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs12561-022-09358-2/MediaObjects/12561_2022_9358_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs12561-022-09358-2/MediaObjects/12561_2022_9358_Fig3_HTML.png)
Data Availability
The data that support the findings of this study are available from the Center for International Blood & Marrow Transplant Research (CIBMTR), but restrictions apply to their availability: they were used under license for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of CIBMTR.
Code Availability
All code for the simulation and data analysis associated with the current submission is available upon request via email to the corresponding author.
Funding
R.D. and C.Z. were partially supported by the National Institutes of Health via grant 2U54GM115458-06. M.Z. was partly supported by NCATS 12-303G-UN.
Ethics declarations
Conflict of interest
All authors declare that there are no relevant financial or non-financial competing interests to report.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Dai, R., Zheng, C. & Zhang, MJ. On High-Dimensional Covariate Adjustment for Estimating Causal Effects in Randomized Trials with Survival Outcomes. Stat Biosci 15, 242–260 (2023). https://doi.org/10.1007/s12561-022-09358-2
DOI: https://doi.org/10.1007/s12561-022-09358-2