Strategies for Bias Reduction in Estimation of Marginal Means with Data Missing at Random

Optimization and Data Analysis in Biomedical Informatics

Part of the book series: Fields Institute Communications (FIC, volume 63)

Abstract

Incomplete data are common in many fields of research, and interest often lies in estimating a marginal mean based on available information. This paper compares different strategies for estimating the marginal mean of a response when data are missing at random. We evaluate these methods on the basis of asymptotic bias, empirical bias, and efficiency. We show that complete case analysis gives biased results when data are missing at random, but inverse probability weighted estimating equations (IPWEE) and a method based on the expected conditional mean (ECM) yield consistent estimators. While these methods give estimators which behave similarly in the contexts studied, they are based on quite different assumptions. The IPWEE approach requires analysts to specify a model for the missing data mechanism, whereas the ECM approach requires a model for the distribution of the auxiliary variables driving the missing data mechanism. The latter can be a challenge in practice, particularly when the covariates are of high dimension or are a mixture of continuous and categorical variables. The IPWEE approach therefore has considerable appeal in many practical settings.
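
The contrast between the three strategies can be illustrated with a small simulation. The sketch below is not taken from the chapter: the data-generating model, the single binary auxiliary variable `x`, the observation probabilities, and the function names are all assumptions chosen for illustration. It treats the IPWEE estimator as the solution of a weighted estimating equation for the mean, and reads the ECM idea as averaging the conditional mean of the response over the estimated distribution of the auxiliary variable.

```python
import numpy as np


def simulate(n, rng):
    """Hypothetical MAR setup: missingness depends only on the observed x."""
    x = rng.binomial(1, 0.5, size=n)            # binary auxiliary variable
    y = 1.0 + 2.0 * x + rng.normal(size=n)      # true marginal mean E[Y] = 2.0
    p_obs = np.where(x == 1, 0.3, 0.9)          # assumed P(R = 1 | X)
    r = rng.binomial(1, p_obs)                  # r = 1 when y is observed
    return x, y, r


def complete_case_mean(y, r):
    """Average of the observed responses only."""
    return y[r == 1].mean()


def ipw_mean(x, y, r):
    """Solve sum_i (R_i / pi_i) (Y_i - mu) = 0 with estimated pi(x)."""
    # Model for the missing data mechanism: observed proportion within each x level.
    pi_hat = np.array([r[x == k].mean() for k in (0, 1)])[x]
    w = r / pi_hat
    return np.sum(w * y) / np.sum(w)


def ecm_mean(x, y, r):
    """Average E[Y | X = k, R = 1] over the estimated distribution of X."""
    cond_mean = np.array([y[(x == k) & (r == 1)].mean() for k in (0, 1)])
    p_x = np.array([(x == k).mean() for k in (0, 1)])  # model for the auxiliary variable
    return float(cond_mean @ p_x)


rng = np.random.default_rng(0)
x, y, r = simulate(20000, rng)
cc = complete_case_mean(y, r)
ipw = ipw_mean(x, y, r)
ecm = ecm_mean(x, y, r)
print(f"complete case: {cc:.3f}  IPW: {ipw:.3f}  ECM: {ecm:.3f}  (true mean 2.0)")
```

Under these assumed settings the complete case average converges to 1.5 rather than 2.0, because subjects with `x = 1` are under-represented among the observed responses. With a single binary auxiliary variable and saturated models, the weighted and conditional-mean estimates coincide algebraically; with high-dimensional or mixed covariates the two approaches call for different modeling effort, which is the practical distinction the abstract draws.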

Mathematics Subject Classification (2010): Primary 62H12, Secondary 62F10



Author information

Corresponding author

Correspondence to Baojiang Chen.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Chen, B., Cook, R.J. (2013). Strategies for Bias Reduction in Estimation of Marginal Means with Data Missing at Random. In: Pardalos, P., Coleman, T., Xanthopoulos, P. (eds) Optimization and Data Analysis in Biomedical Informatics. Fields Institute Communications, vol 63. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4133-5_5
