Abstract
Financial crises are typically characterized by highly positively correlated asset returns, due to simultaneous distress on almost all securities, high volatilities and the presence of extreme returns. In the aftermath of the 2008 crisis, investors were prompted even further to look for portfolios that minimize risk and can better deal with estimation error in the inputs of asset allocation models. The minimum variance portfolio à la Markowitz is considered the reference model for risk minimization in equity markets, due to its simple optimization and its need for just one input: the inverse of the covariance estimate, the so-called precision matrix. In this paper, we propose a data-driven portfolio framework based on two regularization methods, glasso and tlasso, that provide sparse estimates of the precision matrix by penalizing its \(L_1\)-norm. Glasso and tlasso rely on Gaussian and t-Student assumptions for asset returns, respectively. Simulation and real-world data results support the proposed methods compared to state-of-the-art approaches, such as random matrix and Ledoit–Wolf shrinkage.
Notes
The original specification proposed by Friedman et al. (2008) applied the penalty to the entire matrix \(\varvec{\varOmega }\). The version of the model with the penalty applied to \(\varvec{\varOmega }\) is the one studied by Rothman et al. (2008) and is currently implemented in the R package ‘glasso’ (Friedman et al. 2014).
See Theorem 2 and Technical Condition (B) in Lam and Fan (2009).
See Theorem 1 in Banerjee et al. (2008).
Notice that this representation implies a permutation of the rows and columns to have the ith asset as the last one.
\(v_i\) can be interpreted as the unhedgeable component of \(X_{i,t}\).
In the case of glasso we refer to the likelihood of a multivariate normal distribution, while with tlasso we refer to that of a multivariate t-Student distribution.
The result follows from Corollary 1 in Witten et al. (2011), according to which the ith node is fully unconnected to all other nodes if and only if \(|\varvec{\varSigma }_{ij}| \le \rho \quad \forall i \ne j\). When \(\varvec{\varSigma }\) is the correlation matrix, all its off-diagonal elements are smaller than or equal to one in absolute value, and therefore for \(\rho = 1\) all nodes are disconnected, that is, the precision matrix is diagonal.
The dimensions of \(\mathbf{G }_{\backslash i,\backslash i}\), \(g_{\backslash i,i}\) and \(g_{i,i}\) are respectively \(((n-1)\times (n-1))\), \(((n-1)\times 1)\) and \((1\times 1)\).
Interestingly, the Ledoit–Wolf shrinkage is closely related to portfolio optimization with \(L_2\) penalization of weight estimates. Indeed, the optimization problem \(\min _{\mathbf{w }\in C}(\mathbf{w }'\widehat{\varvec{\varSigma }}\mathbf{w }+ a\mathbf{w }'\mathbf{w })\), with \(C = \{{\mathbf{w }} | {{\mathbf{1 }}{'}}\mathbf{w }=1\}\) can be equivalently stated as \(\min _{\mathbf{w }\in C}(\mathbf{w }'(\widehat{\varvec{\varSigma }}+a \mathbf{I }) \mathbf{w })\), which then is equivalent to solving the problem using the Ledoit–Wolf shrinkage estimator with \(\widehat{\varvec{\varSigma }}_T=\mathbf{I }\) (Bruder et al. 2013).
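The equivalence stated in this note can be checked numerically. The sketch below uses an arbitrary toy covariance matrix and penalty strength (both assumed for illustration only): the \(L_2\)-penalized problem with penalty \(a\) and the Ledoit–Wolf-style combination \(b\mathbf{S }+(1-b)\mathbf{I }\) with \(a=(1-b)/b\) produce identical minimum variance weights, since rescaling the matrix leaves the weights unchanged.

```python
import numpy as np

S = np.array([[0.04, 0.01, 0.00],        # toy covariance estimate (assumed)
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
a = 0.05                                 # L2 penalty strength (assumed)
ones = np.ones(3)

def min_var_weights(M):
    """Closed-form minimum variance weights under the budget constraint 1'w = 1."""
    x = np.linalg.solve(M, ones)
    return x / (ones @ x)

w_pen = min_var_weights(S + a * np.eye(3))           # L2-penalized formulation
b = 1.0 / (1.0 + a)                                  # intensity such that a = (1 - b) / b
w_lw = min_var_weights(b * S + (1 - b) * np.eye(3))  # LW form with identity target
assert np.allclose(w_pen, w_lw)                      # identical portfolios
```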
References
Baba K, Shibata R, Sibuya M (2004) Partial correlation and conditional correlation as measures of conditional independence. Aust N Z J Stat 46(4):657–664
Banerjee O, Ghaoui LE, d’Aspremont A (2008) Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. J Mach Learn Res 9:485–516
Black F, Litterman R (1992) Global portfolio optimization. Finance Anal J 48(5):28–43
Bouchaud JP, Potters M (2009) Financial applications of random matrix theory: a short review. arXiv preprint arXiv:0910.1205
Brodie J, Daubechies I, De Mol C, Giannone D, Loris I (2009) Sparse and stable Markowitz portfolios. Proc Natl Acad Sci 106(30):12267–12272
Brownlees CT, Nualart E, Sun Y (2015) Realized networks. Working Paper, SSRN
Bruder B, Gaussel N, Richard JC, Roncalli T (2013) Regularization of portfolio allocation. Working Paper, SSRN
Cont R (2001) Empirical properties of asset returns: stylized facts and statistical issues. Quant Finance 1:223–236
DeMiguel V, Nogales FJ (2009) Portfolio selection with robust estimation. Oper Res 57:560–577
DeMiguel V, Garlappi L, Nogales F, Uppal R (2009a) A generalized approach to portfolio optimization: improving performance by constraining portfolio norm. Manag Sci 55:798–812
DeMiguel V, Garlappi L, Uppal R (2009b) Optimal versus naive diversification: how inefficient is the 1/N portfolio strategy? Rev Financ Stud 22(5):1915–1953
Dempster AP (1972) Covariance selection. Biometrics 28(1):157–175
Engle R (2002) Dynamic conditional correlation: a simple class of multivariate generalized autoregressive conditional heteroskedasticity models. J Bus Econ Stat 20(3):339–350
Fan J, Zhang J, Yu K (2012) Vast portfolio selection with gross-exposure constraints. J Am Stat Assoc 107(498):592–606
Finegold M, Drton M (2011) Robust graphical modeling of gene networks using classical and alternative t-distributions. Ann Appl Stat 5(2A):1057–1080
Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441
Friedman J, Hastie T, Tibshirani R (2014) Glasso: graphical lasso-estimation of gaussian graphical models. R package
Goto S, Xu Y (2015) Improving mean variance optimization through sparse hedging restrictions. J Financ Quant Anal 50(6):1415–1441
Højsgaard S, Edwards D, Lauritzen S (2012) Graphical models with R. Springer, Berlin
Kan R, Zhou G (2007) Optimal portfolio choice with parameter uncertainty. J Financ Quant Anal 42(3):621–656
Kolm PN, Tütüncü R, Fabozzi F (2014) 60 years following Harry Markowitz’s contribution to portfolio theory and operations research. Eur J Oper Res 234(2):343–582
Kotz S, Nadarajah S (2004) Multivariate t-distributions and their applications. Cambridge University Press, Cambridge
Kremer PJ, Talmaciu A, Paterlini S (2018) Risk minimization in multi-factor portfolios: What is the best strategy? Ann Oper Res 266(1–2):255–291
Laloux L, Cizeau P, Bouchaud JP, Potters M (1999) Noise dressing of financial correlation matrices. Phys Rev Lett 83(7):1467–1469
Lam C, Fan J (2009) Sparsistency and rates of convergence in large covariance matrix estimation. Ann Stat 37(6B):4254
Lange KL, Little RJ, Taylor JM (1989) Robust statistical modeling using the t distribution. J Am Stat Assoc 84(408):881–896
Lauritzen SL (1996) Graphical models, vol 17. Clarendon Press, Oxford
Ledoit O, Wolf M (2004a) Honey, I shrunk the sample covariance matrix. J Portf Manag 30(4):110–119
Ledoit O, Wolf M (2004b) A well-conditioned estimator for large-dimensional covariance matrices. J Multivar Anal 88(2):365–411
Ledoit O, Wolf M (2011) Robust performance hypothesis testing with the variance. Wilmott 55:86–89
Markowitz H (1952) Portfolio selection. J Finance 7(1):77–91
McLachlan G, Krishnan T (2007) The EM algorithm and extensions, vol 382. Wiley, Hoboken
Meucci A (2009) Risk and asset allocation. Springer, Berlin
Michaud RO (1989) The Markowitz optimization enigma: is optimized optimal? ICFA Contin Educ Ser 1989(4):43–54
Murphy KP (2012) Machine learning: a probabilistic perspective. The MIT Press, London
Rothman AJ, Bickel PJ, Levina E, Zhu J et al (2008) Sparse permutation invariant covariance estimation. Electron J Stat 2:494–515
Stevens GV (1998) On the inverse of the covariance matrix in portfolio analysis. J Finance 53(5):1821–1827
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodological) 58(1):267–288
Witten DM, Friedman JH, Simon N (2011) New insights and faster computations for the graphical lasso. J Comput Graph Stat 20(4):892–900
Won JH, Lim J, Kim SJ, Rajaratnam B (2013) Condition-number-regularized covariance estimation. J R Stat Soc Ser B (Statistical Methodology) 75(3):427–450
Yuan M, Lin Y (2007) Model selection and estimation in the Gaussian graphical model. Biometrika 94(1):19–35
Acknowledgements
Sandra Paterlini acknowledges ICT COST Action IC1408 from CRoNoS. Gabriele Torri acknowledges the support of the Czech Science Foundation (GACR) under projects 17-19981S and 19-11965S, and of SP2018/34, an SGS research project of VSB-TU Ostrava. Rosella Giacometti and Gabriele Torri acknowledge the support given by University of Bergamo research funds 2016–2017.
Appendices
A The glasso algorithm
Here we briefly describe the algorithm proposed by Friedman et al. (2008) to solve (6), the glasso model. For convenience, we define \(X_i\) as the ith element of X, and \(X_\backslash i\) as the vector of all the elements of X except the ith. We also define the matrix \(\mathbf{G }\) to be the estimate of \(\varvec{\varSigma }\), and \(\mathbf{S }\) the sample covariance matrix. Furthermore, we identify the following partitions:Footnote 10
Banerjee et al. (2008) show that the solution for \(w_{\backslash i,i}\) can be computed by solving the following box-constrained quadratic program:
or in an equivalent way, by solving the dual problem
where \(c = \mathbf{G }_{\backslash i,\backslash i}^{-1/2}\mathbf{s }_{\backslash i,i}\) and \({\hat{\beta }}^{(i)} = \mathbf{G }_{\backslash i,\backslash i}^{-1}\mathbf{g }_{\backslash i, i}\). As noted by Friedman et al. (2008), (22) resembles a lasso least squares problem (see Tibshirani 1996). The algorithm then regresses the ith variable on the others, using as input \(\mathbf{G }_{\backslash i,\backslash i}\), the current estimate of the upper left block. It then updates the corresponding row and column of \(\mathbf{G }\) using \(\mathbf{g }_{\backslash i, i} = \mathbf{G }_{\backslash i,\backslash i}{\hat{\beta }}^{(i)}\) and cycles across the variables until convergence.
Glasso algorithm
1. Start with \(\mathbf{G }= \mathbf{S }+ \rho \mathbf{I }\). The diagonal of \(\mathbf{G }\) is unchanged in the next steps.
2. For each \(i = 1,2,\ldots ,n,1,2,\ldots , n,\ldots \), solve the lasso problem (22), which takes as input \(\mathbf{G }_{\backslash i,\backslash i}\) and \(\mathbf{s }_{\backslash i,i}\). This gives an \((n-1)\)-vector solution \({\hat{\beta }}\). Fill in the corresponding row and column of \(\mathbf{G }\) using \(\mathbf{g }_{\backslash i,i} = \mathbf{G }_{\backslash i,\backslash i}{\hat{\beta }}\).
3. Repeat until a convergence criterion is satisfied.
The algorithm has a computational complexity of \(O(n^3)\) for dense problems, and considerably less than that for sparse problems (Friedman et al. 2008).
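The block coordinate descent above can be sketched in plain numpy as follows. This is a minimal illustration, not the optimized ‘glasso’ R package implementation: the lasso subproblem is solved by simple coordinate descent with a fixed number of sweeps, and the precision matrix is recovered at the end by inverting \(\mathbf{G }\) rather than tracked column by column.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding operator used in the lasso coordinate updates."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def glasso(S, rho, n_outer=100, n_inner=50, tol=1e-8):
    """Block coordinate descent for the graphical lasso (numpy sketch)."""
    n = S.shape[0]
    G = S + rho * np.eye(n)               # step 1: diagonal fixed hereafter
    for _ in range(n_outer):
        G_old = G.copy()
        for i in range(n):
            idx = np.arange(n) != i
            G11 = G[np.ix_(idx, idx)]     # current upper-left block estimate
            s12 = S[idx, i]
            beta = np.zeros(n - 1)
            for _ in range(n_inner):      # lasso subproblem, coordinate descent
                for j in range(n - 1):
                    r = s12[j] - G11[j] @ beta + G11[j, j] * beta[j]
                    beta[j] = soft(r, rho) / G11[j, j]
            G[idx, i] = G11 @ beta        # step 2: update row and column of G
            G[i, idx] = G[idx, i]
        if np.abs(G - G_old).max() < tol: # step 3: convergence check
            break
    return np.linalg.inv(G)               # precision matrix estimate
```

Consistent with the Witten et al. (2011) result quoted in the notes, running this sketch with \(\rho = 1\) on a correlation matrix returns a diagonal precision matrix, while \(\rho = 0\) recovers the inverse of the sample covariance.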
B Alternative covariance estimation methods
Here, we briefly describe the benchmark covariance estimators we use in the comparative analysis. Unlike glasso and tlasso, these approaches provide an estimate of the covariance matrix rather than the precision matrix. Hence, for these methods we obtain the precision matrix to plug into the minimum variance portfolio by inverting the covariance estimate.
In particular, we consider the sample covariance and the equally weighted methods (commonly regarded as naive approaches) and two state-of-the-art estimators: random matrix theory and Ledoit–Wolf shrinkage.
The equally weighted (EW) portfolio, a tough benchmark to beat (DeMiguel et al. 2009b), can be interpreted as an extreme shrinkage estimator of the global minimum variance portfolio, obtained using the identity matrix as the estimate of the covariance matrix. Indeed, using (3), we obtain \(\hat{\mathbf{w }}_{EW} = \dfrac{\mathbf{I }\mathbf{1 }}{\mathbf{1 }' \mathbf{I }\mathbf{1 }} = \frac{1}{n}\mathbf{1 }\). By assuming zero correlations and equal variances, this approach is very conservative in terms of estimation error and is suitable in cases of severe parameter unpredictability.
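As a quick numerical check of this identity, the sketch below plugs the identity matrix into the global minimum variance formula and recovers the \(1/n\) weights (the choice \(n = 4\) is arbitrary):

```python
import numpy as np

n = 4
Sigma_hat = np.eye(n)              # identity matrix as the covariance estimate
ones = np.ones(n)
x = np.linalg.solve(Sigma_hat, ones)
w_ew = x / (ones @ x)              # global minimum variance weights
print(w_ew)                        # each weight equals 1/n = 0.25
```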
The second naive approach is the sample covariance estimator, defined as:

\(\mathbf{S } = \frac{1}{t-1}\sum _{\tau =1}^{t}(X_\tau - {\bar{X}})(X_\tau - {\bar{X}})'\)

where t is the length of the estimation period, \(X_\tau \) is the vector of assets’ returns at time \(\tau \) and \({\bar{X}}\) is the vector of average returns for the n assets. Such an estimator, when computed on datasets with a number of assets close to the length of the window, is typically characterized by a larger eigenvalue dispersion than the true covariance matrix, causing the matrix to be ill-conditioned (Meucci 2009). Therefore, when computing the precision matrix by inverting the covariance matrix, estimates are typically unreliable and unstable across samples, as ill-conditioning amplifies the effect of estimation error in the covariance matrix.
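The ill-conditioning effect can be illustrated with simulated i.i.d. returns whose true covariance is a multiple of the identity, so the true condition number is 1; with n close to t, the sample estimate is far worse conditioned (the dimensions below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, t = 40, 50                            # number of assets close to window length
X = 0.01 * rng.standard_normal((t, n))   # i.i.d. returns, true covariance 1e-4 * I

S = np.cov(X, rowvar=False)              # sample covariance matrix
eig = np.linalg.eigvalsh(S)
cond = eig[-1] / eig[0]                  # condition number (true matrix: 1)
```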
The shrinkage methodology of Ledoit–Wolf (LW) is well known to better control for estimation error, especially for datasets with a large ratio n / t, where n is the number of assets and t the length of the estimation window. The Ledoit–Wolf shrinkage estimator is defined as a convex combination of the sample covariance matrix \(\mathbf{S }\) and \(\widehat{\varvec{\varSigma }}_{T}\), a highly structured target estimator, such that \(\widehat{\varvec{\varSigma }}_{LW} = a \mathbf{S }+ (1-a) \widehat{\varvec{\varSigma }}_{T}\) with \(a \in [0,1]\). Following Ledoit and Wolf (2004a), we consider as structured estimator \(\widehat{\varvec{\varSigma }}_{T}\) the constant correlation matrix, such that all the pairwise correlations are identical and equal to the average of all the sample pairwise correlations. As the target estimator is well conditioned, the resulting shrinkage estimator \(\widehat{\varvec{\varSigma }}_{LW}\) has a smaller eigenvalue dispersion than the sample covariance matrix. In fact, the sample covariance matrix is shrunk towards the structured estimator, with intensity depending on the value of the shrinkage constant a. The Ledoit–Wolf estimate of a is based on the minimization of the expected distance between \(\widehat{\varvec{\varSigma }}_{LW}\) and \(\varvec{\varSigma }\). For further details, the reader is referred to Ledoit and Wolf (2004a).Footnote 11
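A sketch of the shrinkage with the constant-correlation target is given below. For simplicity the intensity a is taken as a fixed input here, whereas Ledoit and Wolf (2004a) estimate it from the data by minimizing the expected distance to the true matrix:

```python
import numpy as np

def lw_constant_correlation(S, a):
    """Shrink sample covariance S toward the constant-correlation target
    (sketch with fixed intensity a; the data-driven estimate of a is omitted)."""
    sd = np.sqrt(np.diag(S))
    R = S / np.outer(sd, sd)               # sample correlation matrix
    n = S.shape[0]
    r_bar = (R.sum() - n) / (n * (n - 1))  # average pairwise correlation
    target = r_bar * np.outer(sd, sd)      # constant-correlation target
    np.fill_diagonal(target, sd**2)        # keep sample variances on the diagonal
    return a * S + (1 - a) * target
```

On simulated data with n close to t, the shrunk matrix keeps the sample variances but has a visibly smaller eigenvalue dispersion (hence a smaller condition number) than the sample covariance matrix.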
The last approach we focus on is the so-called random matrix theory (RMT) estimator \(\widehat{\varvec{\varSigma }}_{RMT}\), introduced by Laloux et al. (1999). The approach is based on the fact that, in the case of financial time series, the smallest eigenvalues of the correlation matrices are often dominated by noise. From the known distribution of the eigenvalues of a random matrix, it is then possible to filter out the part of the spectrum that is likely associated with estimation error and retain only the eigenvalues that carry useful information (Laloux et al. 1999). In particular, when assuming i.i.d. returns, the eigenvalues of the sample correlation matrix are distributed according to a Marcenko–Pastur (MP) distribution as a consequence of the estimation error. Therefore, we can compute the eigenvalues that correspond to noise based on the minimum and maximum eigenvalues of the theoretical distribution, such that:

\(\lambda _{\min } = \sigma ^2\left( 1 - \sqrt{n/t}\right) ^2, \qquad \lambda _{\max } = \sigma ^2\left( 1 + \sqrt{n/t}\right) ^2\)
where \(\lambda _{\min }\) and \(\lambda _{\max }\) are the theoretical smallest and largest eigenvalues of an \(n\times n\) random covariance matrix estimated from a sample of t observations and \(\sigma ^2\) is the variance of the i.i.d. asset returns. Only the eigenvalues outside the interval [\(\lambda _{\min }\), \(\lambda _{\max }\)] are then assumed to carry useful information, while the others correspond to noise. Here, we estimate the covariance matrix by eigenvalue clipping, a technique that consists in substituting the eigenvalues smaller than \(\lambda _{\max }\) with their average:

\(\widehat{\varvec{\varSigma }}_{RMT} = \mathbf{V }\varvec{\varLambda }_{RMT}\mathbf{V }'\)
where \(\mathbf{V }\) represents the eigenvectors of the sample covariance matrix and \(\varvec{\varLambda }_{RMT}\) is the diagonal matrix of the ordered eigenvalues, in which the eigenvalues \(\lambda \le \lambda _{\max }\) are substituted by their average (Bouchaud and Potters 2009). The RMT filtering thus has the effect of averaging the lowest eigenvalues, improving the conditioning of the matrix and therefore reducing the sensitivity of the precision matrix to estimation errors.
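A sketch of the eigenvalue clipping step is given below. It works on the correlation matrix, so \(\sigma ^2 = 1\) in the MP edge formula; eigenvalues at or below \(\lambda _{\max }\) are replaced by their average, which preserves the trace, and the result is rescaled back to covariance units:

```python
import numpy as np

def rmt_clip(S, t):
    """Eigenvalue clipping of the correlation spectrum (RMT sketch,
    following the description above; sigma^2 = 1 for a correlation matrix)."""
    n = S.shape[0]
    sd = np.sqrt(np.diag(S))
    C = S / np.outer(sd, sd)                 # sample correlation matrix
    lam_max = (1 + np.sqrt(n / t)) ** 2      # Marcenko-Pastur upper edge
    vals, vecs = np.linalg.eigh(C)
    noise = vals <= lam_max                  # eigenvalues attributed to noise
    if noise.any():
        vals[noise] = vals[noise].mean()     # clip the noisy bulk to its average
    C_rmt = vecs @ np.diag(vals) @ vecs.T    # filtered correlation matrix
    return C_rmt * np.outer(sd, sd)          # back to covariance scale
```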
For further details the reader is referred to Laloux et al. (1999), Bouchaud and Potters (2009) and Bruder et al. (2013).
Torri, G., Giacometti, R. & Paterlini, S. Sparse precision matrices for minimum variance portfolios. Comput Manag Sci 16, 375–400 (2019). https://doi.org/10.1007/s10287-019-00344-6