Abstract
In this work, we addressed parameter estimation and prediction in the high-dimensional sparse logistic regression model through Monte Carlo simulations and an application to real data. We applied two well-known penalized maximum likelihood (ML) methods, LASSO and adaptive LASSO (aLASSO), for variable screening. LASSO may overfit and aLASSO may underfit, so ML estimators based on either selected model can be inefficient. Hence, after variable selection, we proposed improved post-selection estimators based on linear shrinkage, pretest, and James-Stein shrinkage strategies, which efficiently combine the overfitted and underfitted ML estimators. Regardless of whether the variable selection stage was correct, the proposed estimators were shown to be more efficient than the classical ML estimators, which were severely affected by inappropriate variable selection.
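As a rough illustration of how an overfitted and an underfitted ML estimator can be combined, the sketch below uses the forms of the linear shrinkage, pretest, and James-Stein estimators that are common in the shrinkage literature. The abstract does not state the chapter's exact definitions, so the formulas here are an assumption. The function names, the test statistic `stat`, the critical value `crit`, and the count `k` of coefficients that appear only in the overfitted model are all hypothetical, and both coefficient vectors are assumed to be aligned to the same length, with zeros filling the underfitted model.

```python
from typing import List, Sequence


def linear_shrinkage(over: Sequence[float], under: Sequence[float],
                     pi: float) -> List[float]:
    """Move the overfitted estimate a fixed fraction pi toward the underfitted one."""
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    return [o - pi * (o - u) for o, u in zip(over, under)]


def pretest(over: Sequence[float], under: Sequence[float],
            stat: float, crit: float) -> List[float]:
    """Keep the underfitted estimate unless the test statistic exceeds crit."""
    return list(under) if stat <= crit else list(over)


def james_stein(over: Sequence[float], under: Sequence[float],
                stat: float, k: int, positive_part: bool = True) -> List[float]:
    """Shrink the overfitted estimate toward the underfitted one by a
    data-driven weight 1 - (k - 2) / stat (assumed textbook form, k >= 3)."""
    if k < 3:
        raise ValueError("k must be at least 3")
    if stat <= 0.0:
        raise ValueError("stat must be positive")
    weight = 1.0 - (k - 2) / stat
    if positive_part:
        # The positive-part variant avoids over-shrinking past the underfitted estimate.
        weight = max(0.0, weight)
    return [u + weight * (o - u) for o, u in zip(over, under)]


if __name__ == "__main__":
    # Hypothetical coefficient vectors, for illustration only.
    beta_over = [1.0, 0.4, 0.3, -0.2, 0.1]   # e.g. ML fit on the larger selected model
    beta_under = [1.2, 0.5, 0.0, 0.0, 0.0]   # e.g. ML fit on the smaller selected model
    print(linear_shrinkage(beta_over, beta_under, pi=0.5))
    print(pretest(beta_over, beta_under, stat=2.0, crit=7.81))
    print(james_stein(beta_over, beta_under, stat=6.0, k=3))
```

In this form, the pretest estimator makes an all-or-nothing choice between the two fits, whereas the James-Stein weight changes smoothly with the test statistic, which is the usual motivation for preferring it when the evidence against the smaller model is borderline.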
Acknowledgments
The research of Professor S. Ejaz Ahmed was partially supported by the Natural Sciences and Engineering Research Council of Canada.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Reangsephet, O., Lisawadi, S., Ahmed, S.E. (2020). Weak Signals in High-Dimensional Logistic Regression Models. In: Xu, J., Ahmed, S., Cooke, F., Duca, G. (eds) Proceedings of the Thirteenth International Conference on Management Science and Engineering Management. ICMSEM 2019. Advances in Intelligent Systems and Computing, vol 1001. Springer, Cham. https://doi.org/10.1007/978-3-030-21248-3_9
DOI: https://doi.org/10.1007/978-3-030-21248-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21247-6
Online ISBN: 978-3-030-21248-3
eBook Packages: Intelligent Technologies and Robotics (R0)