Abstract
Factor models can reduce redundant information in real data analysis. The sparse principal component (SPC) method was developed to obtain sparse solutions from the model, and the online principal component (OPC) method is used to handle online dimension reduction problems. It is therefore worth considering how to obtain a sparse solution with online learning. In this paper we propose a novel sparse online principal component (SOPC) method for sparse parameter estimation in the factor model, which combines the advantages of the SPC and OPC methods in estimating the loading matrix and the idiosyncratic variance matrix. By integrating sparse modelling with online updating, the SOPC finds the sparse solution through iterative online updates, leading to a consistent and easily interpretable solution. The stability and sensitivity of the SOPC are assessed through a simulation study. The method is then applied to two real data sets concerning drug efficacy and human activity recognition.
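The general idea of pairing an online update with a sparsity step can be illustrated with a minimal sketch. It is not the SOPC algorithm of the paper: it assumes a plain Welford-style streaming covariance for the online part and element-wise soft-thresholding of principal component loadings for the sparse part. The function names and the parameters `m` (number of factors) and `gamma` (threshold) are hypothetical choices made for this illustration.

```python
import numpy as np


def soft_threshold(a, gamma):
    """Shrink entries toward zero by gamma; entries within gamma of zero become 0."""
    return np.sign(a) * np.maximum(np.abs(a) - gamma, 0.0)


def streaming_cov(X):
    """Update the mean and covariance one observation at a time (Welford update)."""
    n, p = X.shape
    mean = np.zeros(p)
    m2 = np.zeros((p, p))  # running sum of outer products of deviations
    for t, x in enumerate(X, start=1):
        delta = x - mean               # deviation from the previous mean
        mean = mean + delta / t        # updated mean
        m2 += np.outer(delta, x - mean)
    return mean, m2 / (n - 1)


def sparse_pc_loadings(cov, m, gamma):
    """Assumed sparsity step: soft-threshold the top-m principal component loadings."""
    eigval, eigvec = np.linalg.eigh(cov)
    top = np.argsort(eigval)[::-1][:m]           # indices of the m largest eigenvalues
    loadings = eigvec[:, top] * np.sqrt(np.maximum(eigval[top], 0.0))
    A = soft_threshold(loadings, gamma)          # sparse loading matrix (p x m)
    D = np.diag(np.diag(cov - A @ A.T))          # diagonal idiosyncratic variance estimate
    return A, D


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
mean, cov = streaming_cov(X)
A, D = sparse_pc_loadings(cov, m=2, gamma=0.1)
print(A.shape, D.shape)
```

In this sketch the eigendecomposition is run once after all observations have been processed; in a streaming setting it could be refreshed whenever an up-to-date estimate is needed, since the running mean and covariance are all that must be stored.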
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-022-01270-z/MediaObjects/180_2022_1270_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-022-01270-z/MediaObjects/180_2022_1270_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-022-01270-z/MediaObjects/180_2022_1270_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-022-01270-z/MediaObjects/180_2022_1270_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-022-01270-z/MediaObjects/180_2022_1270_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-022-01270-z/MediaObjects/180_2022_1270_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-022-01270-z/MediaObjects/180_2022_1270_Fig7_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-022-01270-z/MediaObjects/180_2022_1270_Fig8_HTML.png)
References
Ait-Sahalia Y, Xiu D (2017) Using principal component analysis to estimate a high dimensional factor model with high-frequency data. J Econom 201(2):384–399
Ait-Sahalia Y, Xiu D (2019) Principal component analysis of high-frequency data. J Am Stat Assoc 114(525):287–303
Bai J, Li K (2012) Statistical analysis of factor models of high dimension. Annals Stat 40:436–465
Belmar N, Quappe I, Luengo L, Campos V (2018) Exploratory factor analysis of the Chilean deafness attitude scale. Int J Med Surg Sci 5(2):80–88
Bai Z, Chan R, Luk F (2005) Principal component analysis for distributed data sets with updating. Lecture Notes Computer Sci 3756:471–483
Camacho M, Domenech R (2012) MICA-BBVA: a factor model of economic and financial indicators for short-term GDP forecasting. Series 3(4):475–497
Cardot H, Degras D (2017) Online principal component analysis in high dimension: which algorithm to choose? Int Stat Rev 86(1):29–50
Fan J, Xue L, Yao J (2017) Sufficient forecasting using factor models. J Econom 201(2):292–306
Fan J, Liao Y, Wang W (2016) Projected principal component analysis in factor models. Annals Stat 44(1):219–254
Fan J, Wang D, Wang K, Zhu Z (2019) Distributed estimation of principal eigenspaces. Annals Stat 47(6):3009–3031
Guo G, Wei C, Qian G (2022) SOPC: The sparse online principal component estimation algorithm. URL: https://CRAN.R-project.org/package=SOPC
Han L, Wu Z, Zeng K, Yang X (2018) Online multilinear principal component analysis. Neurocomputing 275:888–896
Hirose K, Yamamoto M (2015) Sparse estimation via nonconcave penalized likelihood in factor analysis model. Stat Comput 25(5):863–875
Kendler K, Myers J (2010) The genetic and environmental relationship between major depression and the five-factor model of personality. Psychol Med 40(05):801–806
Liu D, Wang J, Wang H (2015) Short-term wind speed forecasting based on spectral clustering and optimised echo state networks. Renew Energy 78:599–608
Lam C (2016) Nonparametric eigenvalue-regularized precision or covariance matrix estimator. Annals Stat 44(3):928–953
Lin T, McLachlan G, Lee S (2016) Extending mixtures of factor models using the restricted multivariate skew-normal distribution. J Multiv Anal 143:398–413
Lam C, Yao Q (2012) Factor modeling for high-dimensional time series: inference for the number of factors. LSE Res Online Doc Econ 40(40):694–726
Li Q, Cheng G, Fan J, Wang Y (2018) Embracing the blessing of dimensionality in factor models. J Am Stat Assoc 113(521):380–389
Ozawa S, Pang S, Kasabov N (2004) A modified incremental principal component analysis for on-line learning of feature space and classifier. International Conference on PRICAI: Trends in Artificial Intelligence. Springer, Berlin, Heidelberg, pp 231–240
Pelger M, Xiong R (2018) State-varying factor models of large dimensions. http://arxiv.org/abs/1807.02248
Pena D, Yohai V (2016) Generalized dynamic principal components. J Am Stat Assoc 111(515):1121–1131
Skocaj D, Leonardis A (2003) Weighted and robust incremental method for subspace learning. Proc Computer Vision 2:1494–1501
Trendafilov N, Fontanella S, Adachi K (2017) Sparse exploratory factor analysis. Psychometrika 82(3):778–794
Yao M, Qu X, Gu Q, Ruan T, Lou Z (2010) Online PCA with adaptive subspace method for real-time hand gesture learning and recognition. WSEAS Trans Computers Arch 9(6):583–592
Zou H, Hastie T, Tibshirani R (2006) Sparse principal component analysis. J Comput Gr Stat 15(2):265–286
Zhu J, Ge Z, Song Z (2017) Distributed parallel PCA for modeling and monitoring of large-scale plant-wide processes with big data. IEEE Trans Industr Inf 13(4):1877–1885
Acknowledgements
We thank the associate editor and the reviewers for their useful feedback, which improved this paper. This work was supported by a grant from the Natural Science Foundation of Shandong under project ID ZR2020MA022.
Supplementary Information
The R code for the SOPC is provided in the supplementary materials and is available as the package *SOPC* (Guo et al. 2022). (PDF 220KB)
About this article
Cite this article
Guo, G., Wei, C. & Qian, G. Sparse online principal component analysis for parameter estimation in factor model. Comput Stat 38, 1095–1116 (2023). https://doi.org/10.1007/s00180-022-01270-z
DOI: https://doi.org/10.1007/s00180-022-01270-z