Abstract
Model-based methods are often used to impute missing values. When the model assumptions hold, such methods often outperform model-free methods. This chapter discusses linear models; the following chapters focus on nonlinear methods.
First, we introduce linear regression based on classical ordinary least squares (OLS) at a very basic level. OLS has some nice mathematical properties, but it is strongly influenced by outliers. Robust methods give roughly the same results as OLS when the data follow a multivariate normal distribution, but remain reliable when the data contain artifacts and/or outliers.
Therefore, after an introduction to common concepts and implementations based on the R package mice, we focus on the robust imputation methods available in the R package VIM. These give roughly the same results as classical methods for elliptically normally distributed data, but perform better in practice when obvious or masked outliers are present.
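The contrast between OLS-based and robust regression imputation can be illustrated with a small sketch. It is written in Python with the standard library only and is not the implementation used in mice or VIM: the Huber-type M-estimator fitted by iteratively reweighted least squares stands in for the robust estimators discussed in the chapter, and the toy data, the function names (`wls_fit`, `huber_fit`, `impute`), and the tuning constant `c = 1.345` are assumptions made for illustration.

```python
from statistics import median

def wls_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b

def ols_fit(x, y):
    """Classical OLS is the special case with all weights equal to one."""
    return wls_fit(x, y, [1.0] * len(x))

def huber_fit(x, y, c=1.345, n_iter=50):
    """Huber-type M-estimate via iteratively reweighted least squares.
    Illustrative stand-in for a robust fit, not the estimator used by VIM."""
    w = [1.0] * len(x)
    for _ in range(n_iter):
        a, b = wls_fit(x, y, w)
        res = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust residual scale: median absolute residual, rescaled
        # to be comparable with a standard deviation under normality.
        scale = max(median(abs(r) for r in res) / 0.6745, 1e-12)
        # Huber weights: full weight for small residuals, downweight large ones.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in res]
    return wls_fit(x, y, w)

def impute(x, y, fit):
    """Fit on the observed cases, then fill missing y with predictions."""
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    a, b = fit([p[0] for p in obs], [p[1] for p in obs])
    return [yi if yi is not None else a + b * xi for xi, yi in zip(x, y)]

# Toy data (assumed): y is roughly 2*x + 1, with one vertical outlier
# at x = 8 and one missing value at x = 12 (noise-free value: 25).
x = [float(i) for i in range(1, 11)] + [12.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
y = [2.0 * xi + 1.0 + e for xi, e in zip(x, noise)] + [None]
y[7] = 60.0  # the outlier

y_ols = impute(x, y, ols_fit)
y_rob = impute(x, y, huber_fit)
print("OLS-based imputation:   ", round(y_ols[-1], 2))
print("Robust-based imputation:", round(y_rob[-1], 2))
```

In this toy setting the single outlier pulls the OLS line upward, so the OLS-based imputed value lies far from 25, while the robust fit downweights the outlier and imputes a value close to it. A Huber-type M-estimator protects against vertical outliers only, so it is not a substitute for the robust methods the chapter covers.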
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Templ, M. (2023). Model-Based Methods. In: Visualization and Imputation of Missing Values. Statistics and Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-30073-8_8
DOI: https://doi.org/10.1007/978-3-031-30073-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30072-1
Online ISBN: 978-3-031-30073-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)