Abstract
Incomplete data are common in many fields of research, and interest often lies in estimating a marginal mean based on available information. This paper compares different strategies for estimating the marginal mean of a response when data are missing at random. We evaluate these methods based on asymptotic bias, empirical bias, and efficiency. We show that complete case analysis gives biased results when data are missing at random, but inverse probability weighted estimating equations (IPWEE) and a method based on the expected conditional mean (ECM) yield consistent estimators. While these methods give estimators which behave similarly in the contexts studied, they are based on quite different assumptions. The IPWEE approach requires analysts to specify a model for the missing data mechanism, whereas the ECM approach requires a model for the distribution of auxiliary variables driving the missing data mechanism. The latter can be a challenge in practice, particularly when the covariates are of high dimension or are a mixture of continuous and categorical variables. The IPWEE approach therefore has considerable appeal in many practical settings.
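The contrast between the three strategies can be illustrated with a small simulation. The sketch below is not taken from the chapter: the data-generating model (a single binary auxiliary variable X, a normal response with mean 1 + 2X, and observation probabilities 0.9 and 0.3), the function names, and the use of saturated, empirically estimated models are all assumptions chosen for illustration. Under this setup the true marginal mean is 2, while the complete case mean converges to 1.5.

```python
import random


def simulate(n, seed=0):
    """Hypothetical MAR setup: missingness depends only on the auxiliary X."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0          # binary auxiliary variable
        y = rng.gauss(1.0 + 2.0 * x, 1.0)           # response, E[Y] = 2
        p_obs = 0.3 if x == 1 else 0.9              # P(R = 1 | X), assumed values
        r = 1 if rng.random() < p_obs else 0        # r = 1 means Y is observed
        data.append((x, y, r))
    return data


def complete_case_mean(data):
    """Average the observed responses, ignoring the missing data mechanism."""
    observed = [y for _, y, r in data if r == 1]
    return sum(observed) / len(observed)


def ipw_mean(data):
    """Weight each observed response by 1 / P_hat(R = 1 | X)."""
    n = len(data)
    pi_hat = {}
    for level in (0, 1):
        n_x = sum(1 for x, _, _ in data if x == level)
        m_x = sum(1 for x, _, r in data if x == level and r == 1)
        pi_hat[level] = m_x / n_x                   # saturated missingness model
    return sum(y / pi_hat[x] for x, y, r in data if r == 1) / n


def ecm_mean(data):
    """Average the conditional means E_hat[Y | X, R = 1] over P_hat(X)."""
    n = len(data)
    total = 0.0
    for level in (0, 1):
        n_x = sum(1 for x, _, _ in data if x == level)
        observed = [y for x, y, r in data if x == level and r == 1]
        total += (sum(observed) / len(observed)) * (n_x / n)
    return total


if __name__ == "__main__":
    sample = simulate(20000, seed=1)
    print("complete case:", complete_case_mean(sample))
    print("IPW:          ", ipw_mean(sample))
    print("ECM:          ", ecm_mean(sample))
```

With a single binary auxiliary variable and saturated models, the IPW and ECM estimates coincide up to rounding. The two approaches generally differ once the missingness model or the auxiliary variable distribution must be specified parametrically, which is the setting where their differing assumptions matter.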
Mathematics Subject Classification (2010): Primary 62H12, Secondary 62F10
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Chen, B., Cook, R.J. (2013). Strategies for Bias Reduction in Estimation of Marginal Means with Data Missing at Random. In: Pardalos, P., Coleman, T., Xanthopoulos, P. (eds) Optimization and Data Analysis in Biomedical Informatics. Fields Institute Communications, vol 63. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4133-5_5
DOI: https://doi.org/10.1007/978-1-4614-4133-5_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-4132-8
Online ISBN: 978-1-4614-4133-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)