
Sparse online principal component analysis for parameter estimation in factor model

  • Original paper
Computational Statistics

Abstract

Factor models can reduce redundant information in real data analysis. The sparse principal component (SPC) method was developed to obtain sparse solutions from the model, while the online principal component (OPC) method handles online dimension reduction problems. It is therefore worth considering how to obtain a sparse solution with online learning. In this paper we propose a novel sparse online principal component (SOPC) method for sparse parameter estimation in the factor model, combining the advantages of the SPC and OPC methods in estimating the loading matrix and the idiosyncratic variance matrix. By integrating sparse modelling with online updating, the SOPC finds the sparse solution through iterative online updates, leading to a consistent and easily interpretable solution. The stability and sensitivity of the SOPC are assessed through a simulation study. The method is then applied to two real data sets concerning drug efficacy and human activity recognition.
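
The idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm and not the API of the SOPC R package; it is a generic illustration under stated assumptions. It assumes the covariance is updated one observation at a time (a Welford-style update), the loadings are taken as the leading principal components scaled by the square roots of their eigenvalues, sparsity is imposed by soft-thresholding, and the idiosyncratic variances are the diagonal of what remains. The names `sparse_online_pc`, `soft_threshold`, `n0` (size of the initial batch) and `gamma` (threshold) are hypothetical.

```python
import numpy as np


def soft_threshold(a, gamma):
    """Shrink entries toward zero; entries with |a| <= gamma become 0."""
    return np.sign(a) * np.maximum(np.abs(a) - gamma, 0.0)


def sparse_online_pc(X, m, n0, gamma):
    """Hypothetical sketch: online covariance update, then thresholded loadings.

    X: (n, p) data, m: number of factors, n0: initial batch size,
    gamma: soft-threshold level (assumed tuning parameter).
    """
    n = X.shape[0]
    # Initialise the mean and scatter matrix from the first n0 rows.
    mean = X[:n0].mean(axis=0)
    centered = X[:n0] - mean
    scatter = centered.T @ centered
    # Online step: fold in one observation at a time (Welford-style update).
    for t in range(n0, n):
        delta = X[t] - mean
        mean = mean + delta / (t + 1)
        scatter = scatter + np.outer(delta, X[t] - mean)
    cov = scatter / (n - 1)
    # Principal component loadings: leading eigenvectors scaled by sqrt(eigenvalue).
    eigval, eigvec = np.linalg.eigh(cov)
    top = np.argsort(eigval)[::-1][:m]
    loadings = eigvec[:, top] * np.sqrt(np.maximum(eigval[top], 0.0))
    # Sparse step: soft-threshold the loadings (an assumed sparsity device).
    sparse_loadings = soft_threshold(loadings, gamma)
    # Idiosyncratic variances: diagonal of the part not explained by the loadings.
    idio_var = np.diag(cov - sparse_loadings @ sparse_loadings.T)
    return sparse_loadings, idio_var, cov


# Simulated data from a two-factor model with a sparse loading matrix.
rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.0], [0.9, 0.0], [0.8, 0.0],
                   [0.0, 1.0], [0.0, 0.9], [0.0, 0.8]])
F = rng.standard_normal((500, 2))
X = F @ A_true.T + 0.3 * rng.standard_normal((500, 6))

A_hat, D_hat, S = sparse_online_pc(X, m=2, n0=50, gamma=0.1)
print(np.round(A_hat, 2))  # estimated loadings (columns are sign-ambiguous)
print(np.round(D_hat, 2))  # estimated idiosyncratic variances
```

In this sketch the eigendecomposition runs once after all observations have been folded in; in a streaming setting it could be repeated after any update, since only the running mean and scatter matrix need to be stored.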

References

  • Ait-Sahalia Y, Xiu D (2017) Using principal component analysis to estimate a high dimensional factor model with high-frequency data. J Econom 201(2):384–399

  • Ait-Sahalia Y, Xiu D (2019) Principal component analysis of high-frequency data. J Am Stat Assoc 114(525):287–303

  • Bai J, Li K (2012) Statistical analysis of factor models of high dimension. Annals Stat 40:436–465

  • Belmar N, Quappe I, Luengo L, Campos V (2018) Exploratory factor analysis of the Chilean deafness attitude scale. Int J Med Surg Sci 5(2):80–88

  • Bai Z, Chan R, Luk F (2005) Principal component analysis for distributed data sets with updating. Lecture Notes Computer Sci 3756:471–483

  • Camacho M, Domenech R (2012) MICA-BBVA: a factor model of economic and financial indicators for short-term GDP forecasting. Series 3(4):475–497

  • Cardot H, Degras D (2017) Online principal component analysis in high dimension: which algorithm to choose? Int Stat Rev 86(1):29–50

  • Fan J, Xue L, Yao J (2017) Sufficient forecasting using factor models. J Econom 201(2):292–306

  • Fan J, Liao Y, Wang W (2016) Projected principal component analysis in factor models. Annals Stat 44(1):219–254

  • Fan J, Wang D, Wang K, Zhu Z (2019) Distributed estimation of principal eigenspaces. Annals Stat 47(6):3009–3031

  • Guo G, Wei C, Qian G (2022) SOPC: The sparse online principal component estimation algorithm. URL: https://CRAN.R-project.org/package=SOPC

  • Han L, Wu Z, Zeng K, Yang X (2018) Online multilinear principal component analysis. Neurocomputing 275:888–896

  • Hirose K, Yamamoto M (2015) Sparse estimation via nonconcave penalized likelihood in factor analysis model. Stat Comput 25(5):863–875

  • Kendler K, Myers J (2010) The genetic and environmental relationship between major depression and the five-factor model of personality. Psychol Med 40(05):801–806

  • Liu D, Wang J, Wang H (2015) Short-term wind speed forecasting based on spectral clustering and optimised echo state networks. Renew Energy 78:599–608

  • Lam C (2016) Nonparametric eigenvalue-regularized precision or covariance matrix estimator. Annals Stat 44(3):928–953

  • Lin T, McLachlan G, Lee S (2016) Extending mixtures of factor models using the restricted multivariate skew-normal distribution. J Multiv Anal 143:398–413

  • Lam C, Yao Q (2012) Factor modeling for high-dimensional time series: inference for the number of factors. LSE Res Online Doc Econ 40(40):694–726

  • Li Q, Cheng G, Fan J, Wang Y (2018) Embracing the blessing of dimensionality in factor models. J Am Stat Assoc 113(521):380–389

  • Ozawa S, Pang S, Kasabov N (2004) A modified incremental principal component analysis for on-line learning of feature space and classifier. International Conference on PRICAI: Trends in Artificial Intelligence. Springer, Berlin, Heidelberg, pp 231–240

  • Pelger M, Xiong R (2018) State-varying factor models of large dimensions. http://arxiv.org/abs/1807.02248

  • Pena D, Yohai V (2016) Generalized dynamic principal components. J Am Stat Assoc 111(515):1121–1131

  • Skocaj D, Leonardis A (2003) Weighted and robust incremental method for subspace learning. Proc Computer Vision 2:1494–1501

  • Trendafilov N, Fontanella S, Adachi K (2017) Sparse exploratory factor analysis. Psychometrika 82(3):778–794

  • Yao M, Qu X, Gu Q, Ruan T, Lou Z (2010) Online PCA with adaptive subspace method for real-time hand gesture learning and recognition. WSEAS Trans Computers Arch 9(6):583–592

  • Zou H, Hastie T, Tibshirani R (2006) Sparse principal component analysis. J Comput Gr Stat 15(2):265–286

  • Zhu J, Ge Z, Song Z (2017) Distributed parallel PCA for modeling and monitoring of large-scale plant-wide processes with big data. IEEE Trans Industr Inf 13(4):1877–1885

Acknowledgements

We thank the associate editor and the reviewers for their useful feedback, which improved this paper. This work was supported by a grant from the Natural Science Foundation of Shandong under project ID ZR2020MA022.

Author information

Corresponding author

Correspondence to Guangbao Guo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary materials

The R code for the SOPC is presented in the supplementary materials and is available as the package SOPC (Guo et al. 2022). (PDF 220KB)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guo, G., Wei, C. & Qian, G. Sparse online principal component analysis for parameter estimation in factor model. Comput Stat 38, 1095–1116 (2023). https://doi.org/10.1007/s00180-022-01270-z

  • DOI: https://doi.org/10.1007/s00180-022-01270-z
