Abstract
In small area estimation, outlying observations in the response and/or the auxiliary variables can severely distort the estimates of the model parameters and, in turn, the small area estimates based on them. In this paper we propose an M-quantile estimator of the small area mean that is robust to outliers in both the response variable and the continuous auxiliary variables. To estimate the variability of this estimator we propose a non-parametric bootstrap estimator. The performance of the proposed estimator is evaluated through model-based and design-based simulations and an application to real data. The comparisons also include an extension of the Robust EBLUP that down-weights outliers in the auxiliary variables. The results show that, when the auxiliary variables contain outliers, the proposed estimator outperforms the traditional M-quantile estimator, which accounts for outliers only in the response variable.
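For readers unfamiliar with M-quantile regression, the following is a minimal sketch of a standard linear M-quantile fit computed by iteratively reweighted least squares with a Huber influence function. It illustrates only the traditional version, which bounds the influence of outliers in the response; it is not the estimator proposed in the paper, whose treatment of outliers in the auxiliary variables is not described in the abstract. The function name `mq_fit`, the tuning constant `c = 1.345`, and the MAD scale estimate are assumptions made for illustration.

```python
import numpy as np

def mq_fit(X, y, q=0.5, c=1.345, max_iter=100, tol=1e-8):
    """Illustrative M-quantile regression at level q (hypothetical helper).

    Uses IRLS with Huber weights tilted asymmetrically by q.
    Robust to outliers in y only, not in the columns of X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares start
    for _ in range(max_iter):
        r = y - X @ beta
        # Robust scale: median absolute deviation, rescaled for normal errors
        s = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = r / s
        # Huber weight psi(u)/u: 1 inside [-c, c], c/|u| outside
        w = np.minimum(1.0, c / np.maximum(np.abs(u), 1e-12))
        # Asymmetric tilt: weight q for positive residuals, 1 - q otherwise
        w = 2.0 * np.where(r > 0, q, 1.0 - q) * w
        sw = np.sqrt(w)
        # Weighted least squares step
        new_beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta
```

Calling `mq_fit(X, y, q=0.5)` with a design matrix `X` that includes an intercept column gives a Huber-type robust regression fit; other values of `q` in (0, 1) give fits that sit above or below it, in analogy with quantile regression.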
Acknowledgements
The authors are grateful to the Associate Editor, the referee and the Editor for their helpful comments, which have led to substantial improvements in the paper. The work of Salvati was carried out with the support of the project PRIN-SURWEY http://www.sp.unipg.it/surwey/ (Grant 2012F42NS8, Italy).
Appendix
See Table 11.
About this article
Cite this article
Marchetti, S., Giusti, C., Salvati, N. et al. Small area estimation based on M-quantile models in presence of outliers in auxiliary variables. Stat Methods Appl 26, 531–555 (2017). https://doi.org/10.1007/s10260-017-0380-4
DOI: https://doi.org/10.1007/s10260-017-0380-4