A new algorithm for fitting semi-parametric variance regression models

Robledo, Kristy P.; Marschner, Ian C.

doi:10.1007/s00180-021-01067-6

Original paper
Published: 03 February 2021

Computational Statistics, Volume 36, pages 2313–2335 (2021)

Abstract

Variance regression allows for heterogeneous variance, or heteroscedasticity, by incorporating a regression model into the variance. This paper uses a variant of the expectation–maximisation algorithm to develop a new method for fitting additive variance regression models that allow for regression in both the mean and the variance. The algorithm is easily extended to allow for B-spline bases, thus allowing for the incorporation of a semi-parametric model in both the mean and variance. Although there are existing methods to fit these types of models, this new algorithm provides a reliable alternative approach that is not susceptible to numerical instability that can arise in this constrained estimation context. We utilise the developed algorithm with a series of simulation studies and analyse illustrative data. Various simulation studies show that the algorithm can recover the true model for a variety of scenarios. We also study automatic selection of model complexity based on information-based criteria, and show that the Akaike information criterion is useful for choosing the optimal number of knots in a B-spline model. An R package is available for implementing these methods.

Author information

Authors and Affiliations

NHMRC Clinical Trials Centre, University of Sydney, Locked Bag 77, Camperdown, NSW, 1450, Australia
Kristy P. Robledo & Ian C. Marschner

Corresponding author

Correspondence to Kristy P. Robledo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Appendices

1.1 Fisher information matrix

If ${\varvec{\theta }}= (\beta _0,\beta _1,\ldots , \beta _P, \alpha _0,\alpha _1, \ldots , \alpha _Q)$, with a total of $W$ parameters (where $W=P+Q+2$), then the information matrix is the negative of the matrix of second derivatives of the log-likelihood function, which is a $W\times W$ matrix.

The log-likelihood for our general model discussed in the previous section is

$$\begin{aligned} \ell ({{\varvec{\theta }}})&=-\frac{n}{2} \log (2\pi )-\frac{1}{2} \sum \nolimits _{i=1}^n \log \left( \alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq} \right) \\&\quad -\frac{1}{2} \sum \nolimits _{i=1}^n \dfrac{\left( X_i-\beta _0- \sum \nolimits _{p=1}^P \beta _p z_{ip}\right) ^2}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} . \end{aligned} \tag{10}$$
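As a rough illustration, the log-likelihood (10) can be evaluated numerically as in the following sketch. The function name, the argument layout (intercept stored first in each coefficient vector), and the use of `y` for the response written $X_i$ in the text are assumptions made here for readability; none of them comes from the paper or from the VarReg package.

```python
import numpy as np

def log_likelihood(beta, alpha, y, Z, X):
    """Evaluate the Gaussian log-likelihood (10).

    y     : (n,) responses (written X_i in the text; renamed here to
            avoid a clash with the variance covariate matrix X)
    Z     : (n, P) mean covariates z_ip
    X     : (n, Q) variance covariates x_iq
    beta  : (P+1,) mean coefficients, intercept first
    alpha : (Q+1,) variance coefficients, intercept first
    """
    mu = beta[0] + Z @ beta[1:]          # additive mean model
    sigma2 = alpha[0] + X @ alpha[1:]    # additive variance model
    if np.any(sigma2 <= 0):
        # Outside the constrained parameter space: variances must be positive.
        return -np.inf
    resid = y - mu
    n = y.shape[0]
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * np.sum(np.log(sigma2))
            - 0.5 * np.sum(resid ** 2 / sigma2))
```

Returning `-np.inf` when any fitted variance is non-positive is one simple way to make the positivity constraint on the additive variance model explicit; it is a choice of this sketch only.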

If we partially differentiate (10) with respect to $\beta _0$, we get the following likelihood equation

$$\begin{aligned} \dfrac{\partial }{\partial \beta _0}\ell ({{\varvec{\theta }}})&= \sum \nolimits _{i=1}^n \dfrac{X_i-\beta _0-\sum \nolimits _{p=1}^P \beta _p z_{ip}}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}}. \end{aligned}$$

The same calculation for each of the remaining $\beta _p$ parameters, $p=1,\ldots ,P$, gives

$$\begin{aligned} \dfrac{\partial }{\partial \beta _p}\ell ({{\varvec{\theta }}})&= \sum \nolimits _{i=1}^n \dfrac{z_{ip}\left( X_i-\beta _0-\sum \nolimits _{k=1}^P \beta _k z_{ik}\right) }{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} . \end{aligned}$$

For the likelihood equation for $\alpha _0$, we get

$$\begin{aligned} \dfrac{\partial }{\partial \alpha _0}\ell ({{\varvec{\theta }}})&=-\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{1}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} +\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{\left( X_i-\beta _0-\sum \nolimits _{p=1}^P \beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq} \right) ^2}, \end{aligned}$$

and then for each of the $\alpha _q$ parameters, $q=1,\ldots ,Q$, we have

$$\begin{aligned} \dfrac{\partial }{\partial \alpha _q}\ell ({{\varvec{\theta }}})&=-\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{iq}}{\alpha _0+\sum \nolimits _{k=1}^Q \alpha _k x_{ik}} +\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{iq} \left( X_i-\beta _0-\sum \nolimits _{p=1}^P \beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{k=1}^Q \alpha _k x_{ik}\right) ^2}. \end{aligned}$$
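The four score expressions above can be collected into a single vector and checked against a finite-difference gradient of the log-likelihood. The sketch below does this on simulated data; all function names, the stacking of the parameters into one vector `theta`, the simulated design, and the use of `y` for the response $X_i$ are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def design(M):
    """Prepend a column of ones so the intercept is treated like any other term."""
    return np.column_stack([np.ones(M.shape[0]), M])

def loglik(theta, y, Z, X):
    """Log-likelihood (10); theta = (beta_0..beta_P, alpha_0..alpha_Q)."""
    k = Z.shape[1] + 1
    mu = design(Z) @ theta[:k]
    sigma2 = design(X) @ theta[k:]
    return -0.5 * np.sum(np.log(2 * np.pi * sigma2) + (y - mu) ** 2 / sigma2)

def score(theta, y, Z, X):
    """Analytic first derivatives of (10) with respect to every parameter."""
    k = Z.shape[1] + 1
    Z1, X1 = design(Z), design(X)
    resid = y - Z1 @ theta[:k]
    sigma2 = X1 @ theta[k:]
    d_beta = Z1.T @ (resid / sigma2)
    d_alpha = 0.5 * X1.T @ (resid ** 2 / sigma2 ** 2 - 1.0 / sigma2)
    return np.concatenate([d_beta, d_alpha])

def numerical_gradient(f, theta, h=1e-6):
    """Central finite differences, used only as an independent check."""
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = h
        grad[j] = (f(theta + step) - f(theta - step)) / (2 * h)
    return grad

# Simulated example (hypothetical values): P = 2 mean and Q = 1 variance covariates.
rng = np.random.default_rng(0)
n = 200
Z = rng.uniform(0, 1, size=(n, 2))
X = rng.uniform(0, 1, size=(n, 1))
theta = np.array([1.0, 2.0, -1.0, 0.5, 1.5])
y = design(Z) @ theta[:3] + rng.normal(size=n) * np.sqrt(design(X) @ theta[3:])

# Largest absolute disagreement between analytic and numerical gradients.
max_gap = np.max(np.abs(score(theta, y, Z, X)
                        - numerical_gradient(lambda t: loglik(t, y, Z, X), theta)))
```

A small `max_gap` indicates that the analytic derivatives agree with the log-likelihood they were derived from.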

Now, taking the second derivatives we obtain the following $(P+1) \times (P+1)$ matrix for the ${\varvec{\beta }}$ parameters. We refer to this matrix as ${\varvec{B}}=\left[ B_{ij}\right] $:

$$\begin{aligned} B_{00}= & {} -\dfrac{\partial ^2}{\partial \beta _0^2}\ell ({{\varvec{\theta }}}) = \sum \nolimits _{i=1}^n \dfrac{1}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} \\ B_{01}=B_{10}= & {} -\dfrac{\partial ^2}{\partial \beta _0 \partial \beta _1}\ell ({{\varvec{\theta }}}) =\sum \nolimits _{i=1}^n \dfrac{z_{i1}}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} \\ B_{11}= & {} -\dfrac{\partial ^2}{\partial \beta _1^2}\ell ({\varvec{\theta }}) = \sum \nolimits _{i=1}^n \dfrac{z_{i1}^2}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} \\&\vdots \\ B_{PP}= & {} -\frac{\partial ^2}{\partial \beta _P^2}\ell ({\varvec{\theta }}) = \sum \nolimits _{i=1}^n \frac{z_{iP}^2}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}}. \end{aligned}$$

Now, the partial derivatives for the $\alpha _q$ parameters form a $(Q+1) \times (Q+1)$ matrix ${\varvec{A}}$, where ${\varvec{A}}=[A_{ij}]$:

$$\begin{aligned} A_{00}&=-\frac{\partial ^2}{\partial \alpha _0^2}\ell ({\varvec{\theta }})\\&=-\frac{1}{2}\sum \nolimits _{i=1}^n \frac{1}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}+\sum \nolimits _{i=1}^n \frac{\left( X_i-\beta _0-\sum \nolimits _{p=1}^P\beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq} \right) ^3} \\ A_{01}=A_{10}&=-\frac{\partial ^2}{\partial \alpha _0 \partial \alpha _1}\ell ({{\varvec{\theta }}}) =- \frac{1}{2}\sum \nolimits _{i=1}^n \frac{x_{i1}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}+\sum \nolimits _{i=1}^n \frac{ x_{i1}\left( X_i-\beta _0-\sum \nolimits _{p=1}^P\beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^3} \\ A_{11}&=-\frac{\partial ^2}{\partial \alpha _1^2}\ell ({{\varvec{\theta }}})\\&=-\frac{1}{2}\sum \nolimits _{i=1}^n \frac{x_{i1}^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}+\sum \nolimits _{i=1}^n \frac{x_{i1}^2 \left( X_i-\beta _0-\sum \nolimits _{p=1}^P\beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^3} \\&\vdots \\ A_{QQ}&=-\frac{\partial ^2}{\partial \alpha _Q^2}\ell ({{\varvec{\theta }}})\\&=- \frac{1}{2}\sum \nolimits _{i=1}^n \frac{x_{iQ}^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}+\sum \nolimits _{i=1}^n \frac{x_{iQ}^2\left( X_i-\beta _0-\sum \nolimits _{p=1}^P\beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^3} . \end{aligned}$$
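These entries still depend on the data through the squared residuals. As a worked step, write $\mu _i=\beta _0+\sum \nolimits _{p=1}^P\beta _p z_{ip}$ and $\sigma _i^2=\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}$, and set $x_{i0}=1$ (shorthand introduced here for brevity, not notation from the paper). Under the model, $E\left[ (X_i-\mu _i)^2\right] =\sigma _i^2$, so a general entry has expectation

$$\begin{aligned} E\left[ A_{jk}\right]&=-\frac{1}{2}\sum \nolimits _{i=1}^n \frac{x_{ij}x_{ik}}{\sigma _i^4}+\sum \nolimits _{i=1}^n \frac{x_{ij}x_{ik}\, E\left[ (X_i-\mu _i)^2\right] }{\sigma _i^6} \\&=-\frac{1}{2}\sum \nolimits _{i=1}^n \frac{x_{ij}x_{ik}}{\sigma _i^4}+\sum \nolimits _{i=1}^n \frac{x_{ij}x_{ik}}{\sigma _i^4} =\frac{1}{2}\sum \nolimits _{i=1}^n \frac{x_{ij}x_{ik}}{\sigma _i^4}, \qquad j,k=0,\ldots ,Q. \end{aligned}$$

The entries of ${\varvec{B}}$ do not involve $X_i$, so they are unchanged by taking expectations.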

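The mixed second partial derivatives with respect to one $\beta _p$ and one $\alpha _q$ parameter have zero expectation, because each is linear in the residual $X_i-\beta _0-\sum \nolimits _{p=1}^P\beta _p z_{ip}$, which has mean zero. The expected information matrix is therefore block diagonal: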

$$\begin{aligned} \left[ \begin{array}{cc} {\varvec{B}} &{} {\varvec{0}}^T\\ {\varvec{0}} &{} {\varvec{A}}\\ \end{array} \right] , \end{aligned}$$

where ${\varvec{0}}$ is a $(Q+1) \times (P+1)$ matrix of zeroes. Written out in full, the mean component of the expected information matrix, ${\varvec{B}}$, is

$$\begin{aligned} \left[ \begin{array}{cccc} \sum \nolimits _{i=1}^n \dfrac{1}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} &{}\sum \nolimits _{i=1}^n \dfrac{z_{i1}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} &{} \cdots &{} \sum \nolimits _{i=1}^n \dfrac{z_{iP}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} \\ \\ \sum \nolimits _{i=1}^n \dfrac{z_{i1}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}}&{} \sum \nolimits _{i=1}^n \dfrac{\left( z_{i1}\right) ^2}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} &{}\cdots &{}\sum \nolimits _{i=1}^n \dfrac{z_{i1}z_{iP}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}}\\ \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \\ \sum \nolimits _{i=1}^n \dfrac{z_{iP}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}}&{} \sum \nolimits _{i=1}^n \dfrac{z_{i1}z_{iP}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} &{}\cdots &{}\sum \nolimits _{i=1}^n \dfrac{\left( z_{iP}\right) ^2}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}}\\ \end{array} \right] , \end{aligned}$$

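and the variance component of the expected information matrix, ${\varvec{A}}$, obtained by replacing each squared residual with its expectation $\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}$, is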

$$\begin{aligned} \left[ \begin{array}{cccc} \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{1}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{}\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{i1}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{} \cdots &{} \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{iQ}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}\\ \\ \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{i1}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{}\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{\left( x_{i1}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{} \cdots &{} \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{i1}x_{iQ}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}\\ \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \\ \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{iQ}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{}\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{i1}x_{iQ}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{} \cdots &{} \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{\left( x_{iQ}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}\\ \end{array} \right] . \end{aligned}$$
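For completeness, the zero off-diagonal blocks can be verified directly. Using the shorthand $\mu _i=\beta _0+\sum \nolimits _{p=1}^P\beta _p z_{ip}$, $\sigma _i^2=\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}$ and $z_{i0}=x_{i0}=1$ (introduced here for brevity, not notation from the paper), differentiating the $\beta _p$ score with respect to $\alpha _q$ gives

$$\begin{aligned} -\frac{\partial ^2}{\partial \beta _p \partial \alpha _q}\ell ({{\varvec{\theta }}})&=\sum \nolimits _{i=1}^n \frac{z_{ip}x_{iq}\left( X_i-\mu _i\right) }{\sigma _i^4}, \\ E\left[ -\frac{\partial ^2}{\partial \beta _p \partial \alpha _q}\ell ({{\varvec{\theta }}})\right]&=\sum \nolimits _{i=1}^n \frac{z_{ip}x_{iq}\, E\left[ X_i-\mu _i\right] }{\sigma _i^4}=0, \end{aligned}$$

for $p=0,\ldots ,P$ and $q=0,\ldots ,Q$, since $E\left[ X_i\right] =\mu _i$ under the model.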

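Lastly, these matrices are inverted to obtain the standard errors: the standard error of each parameter estimate is the square root of the corresponding diagonal element of ${\varvec{B}}^{-1}$ or ${\varvec{A}}^{-1}$, evaluated at the parameter estimates.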
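Because the expected information is block diagonal, the two blocks can be built and inverted separately. The following is a minimal sketch of that calculation, assuming the weighted cross-product forms $Z_1^T \mathrm{diag}(1/\sigma _i^2) Z_1$ and $\tfrac{1}{2} X_1^T \mathrm{diag}(1/\sigma _i^4) X_1$ implied by the matrices above, where $Z_1$ and $X_1$ are the covariate matrices with a leading column of ones. The function names and argument layout are hypothetical and do not reflect the VarReg package interface.

```python
import numpy as np

def expected_information(alpha, Z, X):
    """Return the blocks (B, A) of the expected information matrix.

    alpha : (Q+1,) variance coefficients, intercept first
    Z     : (n, P) mean covariates
    X     : (n, Q) variance covariates
    """
    n = Z.shape[0]
    Z1 = np.column_stack([np.ones(n), Z])   # mean design with intercept
    X1 = np.column_stack([np.ones(n), X])   # variance design with intercept
    sigma2 = X1 @ alpha                     # fitted variances, must be > 0
    B = Z1.T @ (Z1 / sigma2[:, None])               # sum of z z^T / sigma^2
    A = 0.5 * X1.T @ (X1 / sigma2[:, None] ** 2)    # half sum of x x^T / sigma^4
    return B, A

def standard_errors(B, A):
    """Square roots of the diagonals of the inverted blocks."""
    se_beta = np.sqrt(np.diag(np.linalg.inv(B)))
    se_alpha = np.sqrt(np.diag(np.linalg.inv(A)))
    return se_beta, se_alpha
```

In practice `alpha` would be replaced by its estimate. If a variance coefficient sits on the boundary of the parameter space, standard errors obtained this way may not be reliable, which this sketch does not address.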

About this article

Cite this article

Robledo, K.P., Marschner, I.C. A new algorithm for fitting semi-parametric variance regression models. Comput Stat 36, 2313–2335 (2021). https://doi.org/10.1007/s00180-021-01067-6

Received: 15 August 2019
Accepted: 06 January 2021
Published: 03 February 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s00180-021-01067-6
