Abstract
Linear mixed-effects models are a class of models widely used for analyzing different types of data: longitudinal, clustered and panel data. Linear mixed models are employed in many fields that require statistical methodology, such as biology, chemistry, medicine and finance. Model selection is one of the most important steps in a statistical analysis. Since a large number of linear mixed model selection procedures are available in the literature, a pressing issue is how to identify the best approach to adopt in a specific case. We outline most of these approaches, focusing on the part of the model subject to selection (fixed and/or random effects), the dimensionality of the models, the structure of the variance and covariance matrices and, wherever possible, the existence of an implemented application of the methodologies set out.
About this article
Cite this article
Buscemi, S., Plaia, A. Model selection in linear mixed-effect models. AStA Adv Stat Anal 104, 529–575 (2020). https://doi.org/10.1007/s10182-019-00359-z
DOI: https://doi.org/10.1007/s10182-019-00359-z