Strategies for Bias Reduction in Estimation of Marginal Means with Data Missing at Random

Chen, Baojiang; Cook, Richard J.

doi:10.1007/978-1-4614-4133-5_5


Part of the book series: Fields Institute Communications (FIC, volume 63)


Abstract

Incomplete data are common in many fields of research, and interest often lies in estimating a marginal mean based on available information. This paper is concerned with the comparison of different strategies for estimating the marginal mean of a response when data are missing at random. We evaluate these methods based on asymptotic bias, empirical bias, and efficiency. We show that complete case analysis gives biased results when data are missing at random, but inverse probability weighted estimating equations (IPWEE) and a method based on the expected conditional mean (ECM) yield consistent estimators. While these methods give estimators that behave similarly in the contexts studied, they are based on quite different assumptions. The IPWEE approach requires analysts to specify a model for the missing data mechanism, whereas the ECM approach requires a model for the distribution of the auxiliary variables driving the missing data mechanism. The latter can be a challenge in practice, particularly when the covariates are of high dimension or are a mixture of continuous and categorical variables. The IPWEE approach therefore has considerable appeal in many practical settings.
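The contrast drawn in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes a hypothetical setting with a single binary auxiliary variable x, a response y with true marginal mean 2, and an observation indicator r whose probability depends only on x (so the data are missing at random). All function names and parameter values are illustrative assumptions. With one binary covariate, both the missingness model and the covariate distribution can be estimated by simple proportions, and in that saturated case the inverse probability weighted estimate and the expected conditional mean estimate coincide algebraically.

```python
import random


def simulate(n, seed=0):
    """Draw n records (x, y, r) under an assumed MAR mechanism."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0          # binary auxiliary variable
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)     # true marginal mean is 2.0
        p_obs = 0.9 if x == 1 else 0.3              # missingness depends on x only
        r = 1 if rng.random() < p_obs else 0        # r = 1 means y is observed
        data.append((x, y, r))
    return data


def complete_case_mean(data):
    """Average the observed responses, ignoring the missingness mechanism."""
    obs = [y for _, y, r in data if r == 1]
    return sum(obs) / len(obs)


def ipw_mean(data):
    """Solve sum_i r_i / pi_hat(x_i) * (y_i - mu) = 0 for mu."""
    # Saturated missingness model: observed proportion within each level of x.
    pi_hat = {}
    for k in (0, 1):
        rs = [r for x, _, r in data if x == k]
        pi_hat[k] = sum(rs) / len(rs)
    num = sum(y / pi_hat[x] for x, y, r in data if r == 1)
    den = sum(1.0 / pi_hat[x] for x, _, r in data if r == 1)
    return num / den


def ecm_mean(data):
    """Average the conditional means of y given x over the distribution of x."""
    n = len(data)
    total = 0.0
    for k in (0, 1):
        obs = [y for x, y, r in data if x == k and r == 1]
        p_k = sum(1 for x, _, _ in data if x == k) / n   # estimated P(x = k)
        total += p_k * sum(obs) / len(obs)
    return total


data = simulate(100_000)
cc = complete_case_mean(data)
ipw = ipw_mean(data)
ecm = ecm_mean(data)
print(f"complete case: {cc:.3f}  IPW: {ipw:.3f}  ECM: {ecm:.3f}")
```

Under these assumed parameters the complete case average converges to 2.5 rather than 2, because records with x = 1 are over-represented among the observed responses, while the other two estimates stay close to 2. With continuous or high-dimensional covariates the two approaches no longer coincide: the weighted estimator needs a fitted model for the observation probability, and the conditional mean approach needs a model for the covariate distribution.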

Mathematics Subject Classification (2010): Primary 62H12, Secondary 62F10



Author information

Authors and Affiliations

Department of Biostatistics, University of Nebraska Medical Center, Omaha, NE, 68198, USA
Baojiang Chen
Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada, N2L 3G1
Richard J. Cook


Corresponding author

Correspondence to Baojiang Chen.

Editor information

Editors and Affiliations

Department of Industrial & Systems Engin, University of Florida, Weil Hall 401, Gainesville, 32611, Florida, USA
Panos M. Pardalos
Department of Mathematics, University of Waterloo, University Avenue West 200, Waterloo, N2L 3G1, Ontario, Canada
Thomas F. Coleman
Department of Industrial Engineering, University of Central Florida, Central Florida Blvd 4000, Orlando, 32816, Florida, USA
Petros Xanthopoulos


About this chapter

Cite this chapter

Chen, B., Cook, R.J. (2013). Strategies for Bias Reduction in Estimation of Marginal Means with Data Missing at Random. In: Pardalos, P., Coleman, T., Xanthopoulos, P. (eds) Optimization and Data Analysis in Biomedical Informatics. Fields Institute Communications, vol 63. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4133-5_5


DOI: https://doi.org/10.1007/978-1-4614-4133-5_5
Published: 20 July 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-4132-8
Online ISBN: 978-1-4614-4133-5
eBook Packages: Mathematics and Statistics (R0)
