Abstract
Tucker-3 decomposition is a dimension reduction method for tensor data, similar to principal component analysis. One of the characteristics of Tucker-3 is the core array, which represents the interactions between low-dimensional spaces. However, the result is difficult to interpret when the number of elements in the core array is large. One solution to this problem is to apply sparse estimation, such as L1 regularization, to the core array. However, such regularization methods often sacrifice too much model fit. To address this issue, we propose a novel estimation method for Tucker-3 decomposition with a penalty function based on the Gini index, which is a measure of both sparsity and variance. Maximizing the Gini index is expected to yield an estimated core array that is easy to interpret. Moreover, the fit of the model to the data is not shrunk much, because the Gini index is also a measure of variance, one of the model-fit measures of Tucker-3. The proposed penalty function based on the Gini index leads to a nonconvex optimization problem. To address this problem, we develop a majorization–minimization algorithm. A numerical example shows that the performance of our method, in terms of precision and accuracy in predicting zero cells, is superior to that of estimation with existing penalties such as the L1 penalty, the smoothly clipped absolute deviation (SCAD), and the minimax concave penalty (MCP).
Acknowledgements
We thank the associate editor and two anonymous referees for their constructive comments, which led to significant improvement of this article. This work was supported by JSPS KAKENHI Grant No. JP19K20226.
Appendix A: Deriving the update formula for the core array
Let \(\varvec{a} \in {\mathbb {R}}^{p}\) be the parameter vector and \(\varvec{x}\in {\mathbb {R}}^{p}\) be a constant vector. We consider minimizing the objective function f defined as follows:
\[
f(\varvec{a}) = \sum _{i=1}^{p} \left\{ \frac{( x_i -a_i)^2}{2} + \lambda \left( \frac{1}{\alpha } w_i{|a_{i}|}-\beta _i \log ({|a_i|}+\epsilon )\right) \right\},
\]
where \(\alpha> 0\), \(w_i \ge 0\), \(\beta _i \in [0,1]\), and \(\epsilon >0\) are constants. Because f is separable, minimizing f reduces to minimizing each term \(g(a_i\mid x_i) = ( x_i -a_i)^2/2 + \lambda (\frac{1}{\alpha } w_i{|a_{i}|}-\beta _i \log ({|a_i|}+\epsilon ))\). The second term of \(g(a_i\mid x_i) \) is independent of the sign of \(a_i\). Thus, when \(x_i \ne 0\), the optimal solution of \(g(a_i\mid x_i) \) has the same sign as \(x_i\), because \(-2x_ia_i \le -2x_i c\) whenever \(a_i\) has the same sign as \(x_i\) and \(|a_i| = |c| \).
For the case \(x_i >0\), the optimal solution of g lies in \(a_i>0\), and \(g(a_i\mid x_i)\) is convex on \(a_i>0\). Setting the derivative of \(g(a_i\mid x_i)\) to zero, we obtain the following equation:
\[
a_i - x_i + \frac{\lambda w_i}{\alpha } - \frac{\lambda \beta _i}{a_i + \epsilon } = 0.
\]
Multiplying by \(a_i + \epsilon \) yields a quadratic equation, so in the case \(a_i>0\), the candidate extrema of g are obtained as follows:
\[
a_i = \frac{ x_i - \epsilon - \lambda w_i/\alpha \pm \sqrt{(x_i+\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2}.
\]
Because \(x_i- (\epsilon + \lambda {w_i }/{\alpha }) <\sqrt{(x_i+\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i}\) holds, the smaller root is nonpositive; thus, in the case \(a_i>0\), the candidate extremum of g is
\[
a_i = \frac{ x_i - \epsilon - \lambda w_i/\alpha + \sqrt{(x_i+\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2}.
\]
For the case \(x_i <0\), in the same way as for \(x_i>0\), the candidate extrema of g are obtained as follows:
\[
a_i = \frac{ x_i + \epsilon + \lambda w_i/\alpha \pm \sqrt{(x_i -\epsilon +\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2}.
\]
Here, \(x_i + (\epsilon + \lambda w_i /\alpha ) +\sqrt{(x_i -\epsilon +\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i}\) is positive because \(x_i + (\epsilon + \lambda w_i /\alpha ) + |x_i-\epsilon +\lambda {w_i }/{\alpha }|\) is positive, so the larger root cannot be negative. Thus, in the case \(a_i<0\), the candidate extremum of g is
\[
a_i = \frac{ x_i + \epsilon + \lambda w_i/\alpha - \sqrt{(x_i -\epsilon +\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2}.
\]
If \(x_i=0\), both signs give the same objective value by symmetry, and the optimal points of \(g(a_i\mid x_i)\) are
\[
a_i = \pm \max \left\{ 0,\; \frac{ -(\epsilon + \lambda w_i/\alpha ) + \sqrt{(\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2} \right\}.
\]
When the candidate root is nonpositive, g is nondecreasing on \(a_i \ge 0\) and the minimum is attained at \(a_i = 0\). Thus, combining the three cases, we obtain the optimal points of \(g(a_i\mid x_i)\) as
\[
a_i = \mathrm{sign}(x_i)\, \max \left\{ 0,\; \frac{ |x_i| - \epsilon - \lambda w_i/\alpha + \sqrt{(|x_i|+\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2} \right\}.
\]
Therefore, we obtain the update formula for \(\varvec{G}_i\) by setting \(\varvec{x} = \varvec{\eta }+\varvec{M}'_g(\mathrm {Vec}(\varvec{X}_1)- \varvec{M}_g\varvec{\eta })\), \(\epsilon = \epsilon /R\), and \(\lambda = \lambda /L\).
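The closed-form minimizer above acts as a generalized soft-thresholding operator applied element-wise to the core array. The following sketch (a minimal illustration with hypothetical names, not the authors' implementation) evaluates the element-wise objective g and its closed-form minimizer, which can be checked against a brute-force grid search:

```python
import math

def g(a, x, lam, alpha, w, beta, eps):
    # Element-wise objective from Appendix A:
    # (x - a)^2 / 2 + lam * (w*|a|/alpha - beta*log(|a| + eps))
    return (x - a) ** 2 / 2 + lam * (w * abs(a) / alpha
                                     - beta * math.log(abs(a) + eps))

def core_update(x, lam, alpha, w, beta, eps):
    # Closed-form minimizer of g: the positive root of the stationarity
    # quadratic, thresholded at zero and given the sign of x.
    u = lam * w / alpha
    root = (abs(x) - eps - u
            + math.sqrt((abs(x) + eps - u) ** 2 + 4 * lam * beta)) / 2
    return math.copysign(max(0.0, root), x)
```

Note how the thresholding behaves: a large observation is shrunk only slightly, while a small observation with a small \(\beta _i\) is set exactly to zero, which is the sparsity mechanism of the penalty.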
About this article
Cite this article
Tsuchida, J., Yadohisa, H. Tucker-3 decomposition with sparse core array using a penalty function based on Gini-index. Jpn J Stat Data Sci 5, 675–700 (2022). https://doi.org/10.1007/s42081-022-00179-7