Abstract
Tucker-3 decomposition is a dimension reduction method for tensor data, similar to principal component analysis. One of the characteristics of Tucker-3 is the core array, which represents the interactions between low-dimensional spaces. However, the result is difficult to interpret when the number of elements in the core array is large. One solution to this problem is to apply sparse estimation, such as L1 regularization, to the core array. However, such regularization methods often sacrifice too much model fit. To address this issue, we propose a novel estimation method for Tucker-3 decomposition with a penalty function based on the Gini index, which is a measure of both sparsity and variance. Maximizing the Gini index is expected to yield an estimated core array that is easy to interpret. Moreover, the fit of the model to the data is not shrunk much, because the Gini index is also a measure of variance, one of the model-fit measures of Tucker-3. The proposed penalty function based on the Gini index leads to a nonconvex optimization problem. To address this problem, we develop a majorization–minimization algorithm. A numerical example shows that the performance of our method, in terms of precision and accuracy in predicting zero cells, is superior to that of estimation with existing penalties such as the L1 penalty, the smoothly clipped absolute deviation (SCAD), and the minimax concave penalty (MCP).
Acknowledgements
We thank the associate editor and two anonymous referees for their constructive comments, which led to significant improvement of this article. This work was supported by JSPS KAKENHI Grant No. JP19K20226.
Appendix A: Deriving the update formula for the core array
Let \(\varvec{a} \in {\mathbb {R}}^{p}\) be the parameter vector and \(\varvec{x}\in {\mathbb {R}}^{p}\) be a constant vector. We consider minimizing the objective function f defined as follows:
\[
f(\varvec{a}) = \sum _{i=1}^{p} \left\{ \frac{( x_i -a_i)^2}{2} + \lambda \left( \frac{1}{\alpha } w_i{|a_{i}|}-\beta _i \log ({|a_i|}+\epsilon )\right) \right\},
\]
where \(\alpha> 0\), \(w_i \ge 0\), \(\beta _i \in [0,1]\), and \(\epsilon >0\) are constants. Because f is separable, minimizing f reduces to minimizing each term \(g(a_i\mid x_i) = ( x_i -a_i)^2/2 + \lambda (\frac{1}{\alpha } w_i{|a_{i}|}-\beta _i \log ({|a_i|}+\epsilon ))\). The second term of \(g(a_i\mid x_i) \) is independent of the sign of \(a_i\). Thus, when \(x_i \ne 0\), the optimal solution of \(g(a_i\mid x_i) \) has the same sign as \(x_i\), because \(-2x_ia_i \le -2x_i c\) whenever \(a_i\) has the same sign as \(x_i\) and \(|a_i| = |c| \).
For the case \(x_i >0\), the optimal solution of g lies in \(a_i>0\), and \(g(a_i\mid x_i)\) is convex on \(a_i>0\). Setting the derivative of \(g(a_i\mid x_i)\) to zero, we obtain the following equation:
\[
a_i - x_i + \frac{\lambda w_i}{\alpha } - \frac{\lambda \beta _i}{a_i + \epsilon } = 0.
\]
Multiplying by \(a_i + \epsilon \) yields a quadratic equation, so in the case \(a_i>0\), the candidate extrema of g are obtained as follows:
\[
a_i = \frac{ x_i - \epsilon - \lambda w_i/\alpha \pm \sqrt{(x_i+\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2}.
\]
Because \(x_i- (\epsilon + \lambda {w_i }/{\alpha }) <\sqrt{(x_i+\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i}\) holds, the smaller root is nonpositive; thus, in the case \(a_i>0\), the candidate extremum of g is
\[
a_i = \frac{ x_i - \epsilon - \lambda w_i/\alpha + \sqrt{(x_i+\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2}.
\]
For the case \(x_i <0\), in the same way as for \(x_i>0\), the candidate extrema of g are obtained as follows:
\[
a_i = \frac{ x_i + \epsilon + \lambda w_i/\alpha \pm \sqrt{(x_i -\epsilon +\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2}.
\]
Here, \(x_i + (\epsilon + \lambda w_i /\alpha ) +\sqrt{(x_i -\epsilon +\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i}\) is positive because \(x_i + (\epsilon + \lambda w_i /\alpha ) + |x_i-\epsilon +\lambda {w_i }/{\alpha }|\) is positive, so the larger root cannot be negative. Thus, in the case \(a_i<0\), the candidate extremum of g is
\[
a_i = \frac{ x_i + \epsilon + \lambda w_i/\alpha - \sqrt{(x_i -\epsilon +\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2}.
\]
If \(x_i=0\), both signs give the same objective value by symmetry, and the optimal points of \(g(a_i\mid x_i)\) are
\[
a_i = \pm \max \left\{ 0,\; \frac{ -(\epsilon + \lambda w_i/\alpha ) + \sqrt{(\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2} \right\}.
\]
When the candidate root is nonpositive, g is nondecreasing on \(a_i \ge 0\) and the minimum is attained at \(a_i = 0\). Thus, combining the three cases, we obtain the optimal points of \(g(a_i\mid x_i)\) as
\[
a_i = \mathrm{sign}(x_i)\, \max \left\{ 0,\; \frac{ |x_i| - \epsilon - \lambda w_i/\alpha + \sqrt{(|x_i|+\epsilon -\lambda {w_i }/{\alpha })^2 +4 \lambda \beta _i} }{2} \right\}.
\]
Therefore, we obtain the update formula for \(\varvec{G}_i\) by setting \(\varvec{x} = \varvec{\eta }+\varvec{M}'_g(\mathrm {Vec}(\varvec{X}_1)- \varvec{M}_g\varvec{\eta })\), \(\epsilon = \epsilon /R\), and \(\lambda = \lambda /L\).
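The closed-form minimizer above acts as a generalized soft-thresholding operator applied element-wise to the core array. The following sketch (a minimal illustration with hypothetical names, not the authors' implementation) evaluates the element-wise objective g and its closed-form minimizer, which can be checked against a brute-force grid search:

```python
import math

def g(a, x, lam, alpha, w, beta, eps):
    # Element-wise objective from Appendix A:
    # (x - a)^2 / 2 + lam * (w*|a|/alpha - beta*log(|a| + eps))
    return (x - a) ** 2 / 2 + lam * (w * abs(a) / alpha
                                     - beta * math.log(abs(a) + eps))

def core_update(x, lam, alpha, w, beta, eps):
    # Closed-form minimizer of g: the positive root of the stationarity
    # quadratic, thresholded at zero and given the sign of x.
    u = lam * w / alpha
    root = (abs(x) - eps - u
            + math.sqrt((abs(x) + eps - u) ** 2 + 4 * lam * beta)) / 2
    return math.copysign(max(0.0, root), x)
```

Note how the thresholding behaves: a large observation is shrunk only slightly, while a small observation with a small \(\beta _i\) is set exactly to zero, which is the sparsity mechanism of the penalty.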
About this article
Cite this article
Tsuchida, J., Yadohisa, H. Tucker-3 decomposition with sparse core array using a penalty function based on Gini-index. Jpn J Stat Data Sci 5, 675–700 (2022). https://doi.org/10.1007/s42081-022-00179-7