Abstract
In this paper, we introduce the inflated beta autoregressive moving average (I\(\beta \)ARMA) models for modeling and forecasting time series data that assume values in the intervals (0,1], [0,1) or [0,1]. The proposed model combines a set of regressors, an autoregressive moving average structure and a link function to model the conditional mean of an inflated beta conditionally distributed variable observed over time. We develop partial likelihood estimation and derive closed-form expressions for the score vector and the cumulative partial information matrix. Hypothesis testing, confidence intervals, diagnostic tools and forecasting are also proposed. We evaluate the finite-sample performance of the partial maximum likelihood estimators and confidence intervals using Monte Carlo simulations. Two empirical applications to forecasting hydro-environmental data are presented and discussed.
(Figures 1–7 omitted.)
Data availability
The RH time series is publicly available on the Brazilian National Institute of Meteorology (INMET) website, and the UV data are available on the Operador Nacional do Sistema Elétrico (ONS) website.
References
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723
Bayer FM, Bayer DM, Pumi G (2017) Kumaraswamy autoregressive moving average models for double bounded environmental data. J Hydrol 555:385–396
Bayer FM, Cintra RJ, Cribari-Neto F (2018) Beta seasonal autoregressive moving average models. J Stat Comput Simul 88(15):2961–2981
Bayer FM, Bayer DM, Marinoni A, Gamba P (2020) A novel Rayleigh dynamical model for remote sensing data interpretation. IEEE Trans Geosci Remote Sens 58(7):4989–4999
Bayes CL, Valdivieso L (2016) A beta inflated mean regression model for fractional response variables. J Appl Stat 43(10):1814–1830
Benjamin MA, Rigby RA, Stasinopoulos DM (2003) Generalized autoregressive moving average models. J Am Stat Assoc 98(461):214–223
Bloomfield P (2013) Fourier analysis of time series: an introduction, 2nd edn. Wiley-Interscience, New Jersey, p 288
Box G, Jenkins GM, Reinsel G, Ljung GM (2015) Time series analysis: forecasting and control, 5th edn. Wiley, Hoboken
Brazilian National Institute of Meteorology (INMET) (2018) Meteorological database for research and teaching. http://www.inmet.gov.br/projetos/rede/pesquisa. Accessed Oct 2018
Chuang M-D, Yu G-H (2007) Order series method for forecasting non-Gaussian time series. J Forecast 26(4):239–250
Cox DR (1975) Partial likelihood. Biometrika 62(2):269–276
Cox DR (1981) Statistical analysis of time series: some recent developments. Scand J Stat 8:93–115
da-Silva CQ, Migon HS, Correia LT (2011) Dynamic Bayesian beta models. Comput Stat Data Anal 55(6):2074–2089
Fahrmeir L (1987) Asymptotic testing theory for generalized linear models. Statistics 18(1):65–76
Fahrmeir L, Kaufmann H (1985) Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. Ann Stat 13(1):342–368
Ferrari SLP, Cribari-Neto F (2004) Beta regression for modelling rates and proportions. J Appl Stat 31(7):799–815
Fokianos K, Kedem B (1998) Prediction and classification of non-stationary categorical time series. J Multivar Anal 67(2):277–296
Fokianos K, Kedem B (2004) Partial likelihood inference for time series following generalized linear models. J Time Ser Anal 25(2):173–197
Grassly NC, Fraser C (2006) Seasonal infectious disease epidemiology. Proc R Soc Lond B: Biol Sci 273(1600):2541–2550
Guolo A, Varin C (2014) Beta regression for time series analysis of bounded data, with application to Canada Google Flu Trends. Ann Appl Stat 8(1):74–88
Hannan EJ, Quinn BG (1979) The determination of the order of an autoregression. J R Stat Soc Ser B 41(2):190–195
Kedem B, Fokianos K (2002) Regression models for time series analysis. Wiley, New Jersey
Lambert D (1992) Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34(1):1–14
Li WK (1991) Testing model adequacy for some Markov regression models for time series. Biometrika 78(1):83–89
Li WK (1994) Time series models based on generalized linear models: some further results. Biometrics 50(2):506–511
Ljung GM, Box GEP (1978) On a measure of lack of fit in time series models. Biometrika 65(2):297–303
McCullagh P, Nelder JA (1989) Generalized linear models, 2nd edn. Chapman and Hall, Boca Raton
Melchior C, Zanini RR, Guerra RR, Rockenbach DA (2021) Forecasting Brazilian mortality rates due to occupational accidents using autoregressive moving average approaches. Int J Forecast 37(2):825–837
Monti AC (1994) A proposal for a residual autocorrelation test in linear models. Biometrika 81(4):776–780
Neyman J, Pearson ES (1928) On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika 20A(1/2):175–240
Nocedal J, Wright SJ (1999) Numerical optimization. Springer, New York
Operador Nacional do Sistema Elétrico (2022) Dados Hidrológicos. http://www.ons.org.br/Paginas/resultados-da-operacao/historico-da-operacao/dados_hidrologicos_volumes.aspx. Accessed Dec 2022
Ospina R, Ferrari SLP (2012) A general class of zero-or-one inflated beta regression models. Comput Stat Data Anal 56(6):1609–1623
Palm BG, Bayer FM, Cintra RJ (2021) Signal detection and inference based on the beta binomial autoregressive moving average model. Digital Signal Process 109:102911
Pawitan Y (2001) In all likelihood: statistical modelling and inference using likelihood. Oxford Science publications, New York
Press W, Teukolsky S, Vetterling W, Flannery B (1992) Numerical recipes in C: the art of scientific computing, 2nd edn. Cambridge University Press, New York
Pumi G, Valk M, Bisognin C, Bayer FM, Prass TS (2019) Beta autoregressive fractionally integrated moving average models. J Stat Plann Inference 200:196–202
Pumi G, Prass TS, Souza RR (2021) A dynamic model for double-bounded time series with chaotic-driven conditional averages. Scand J Stat 48(1):68–86
R Core Team (2020) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria
Rao CR (1948) Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Math Proc Cambridge Philos Soc 44(1):50–57
Rocha AV, Cribari-Neto F (2009) Beta autoregressive moving average models. TEST 18(3):529–545
Rocha AV, Cribari-Neto F (2017) Erratum to: beta autoregressive moving average models. TEST 26(2):451–459
Sagrillo M, Guerra RR, Bayer FM (2021) Modified Kumaraswamy distributions for double bounded hydro-environmental data. J Hydrol 603:127021
Scher VT, Cribari-Neto F, Pumi G, Bayer FM (2020) Goodness-of-fit tests for \(\beta \)ARMA hydrological time series modeling. Environmetrics 31(3):2607
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Tamerius JD, Shaman J, Alonso WJ, Bloom-Feshbach K, Uejio CK, Comrie A, Viboud C (2013) Environmental predictors of seasonal influenza epidemics across temperate and tropical climates. PLoS Pathog 9(3):e1003194
Terrell GR (2002) The gradient statistic. Comput Sci Stat 34:206–215
Tiku ML, Wong W-K, Vaughan DC, Bian G (2000) Time series models in non-normal situations: symmetric innovations. J Time Ser Anal 21(5):571–596
Wald A (1943) Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans Am Math Soc 54:426–482
Zeger SL, Qaqish B (1988) Markov regression models for time series: a quasi-likelihood approach. Biometrics 44(4):1019–1031
Zheng T, Xiao H, Chen R (2015) Generalized ARMA models with martingale difference errors. J Econom 189(2):492–506
Acknowledgements
We gratefully acknowledge partial financial support from CNPq and FAPERGS, Brazil, as well as the comments and suggestions of the anonymous referee and the Associate Editor.
Funding
Funding was provided by Conselho Nacional de Desenvolvimento Científico e Tecnológico (310617/2020-0) and Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul (21/2551-0002048-2).
Additional information
Communicated by Clémentine Prieur.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A Score vector and cumulative partial information matrix
In this appendix, we derive the partial score vector and the cumulative partial information matrix from (5). These are useful for asymptotic theory and inference, as well as for numerical considerations.
1.1 A.1 Score vector and optimization algorithm
To obtain the partial score vector, we need the derivative of the log-likelihood \(\ell (\varvec{\gamma })\) given in (5) with respect to each coordinate \(\gamma _j\), \(j \in \{1,\dots ,\kappa \}\), of the parameter \(\varvec{\gamma }\). For the derivative of \(\ell (\varvec{\gamma })\) with respect to \(\alpha _i\), \(i=0,1\), observe that, in view of (5),
Now, for \(i = 0,1\), it is straightforward to show that
where \(y_t^{*} := \log \Big (\frac{y_t}{1-y_t}\Big )\), \(\mu _t^{*} :=\psi (\nu _t\phi )-\psi \big ((1-\nu _t)\phi \big )\) and \(\psi :(0,\infty )\rightarrow \mathbb {R}\) is the digamma function, defined as \(\psi (z)=\frac{d}{dz}\log \big (\Gamma (z)\big )\). The derivative with respect to \(\phi \) is easy to obtain:
For the remaining parameters, that is, for \(j \in \{4,\dots ,\kappa \}\), the chain rule together with \(\eta _t=g(\mu _t)\) gives \(\displaystyle {\frac{d \mu _t}{d \eta _t} = \frac{1}{g'(\mu _t)}}\), so that
Observe that \(\frac{\partial \nu _t}{\partial \mu _t}=(\alpha _0-1)(\alpha _1-1)c_t^{-2}\) and
Substituting (A2) into expression (A1), we obtain a simple formula that allows the computation of \(\partial \ell (\varvec{\gamma })/\partial \gamma _j\) for each remaining coordinate \(\gamma _j\), by determining the derivatives \(\partial \eta _t/\partial \gamma _j\), a much simpler task. We have
where \(x_{tl}\) denotes the lth element of \(\varvec{x}_t\), for \(l=1,\dots ,r\). We also have, for \(l=1,\dots ,p\), and \(j=1,\dots ,q\),
Let \(T=\textrm{diag}\big \lbrace 1/g'\left( {\mu }_{m+1}\right) ,\dots , 1/g'\left( {\mu }_{n}\right) \big \rbrace \), \(\varvec{a}=\left( \frac{\partial \eta _{m+1}}{\partial \alpha },\dots ,\frac{\partial \eta _n}{\partial \alpha }\right) ^\top \) and \(\varvec{v}=\left( \frac{\partial \ell _{m+1}(\varvec{\gamma })}{\partial \mu _{m+1}},\dots ,\frac{\partial \ell _{n}(\varvec{\gamma })}{\partial \mu _{n}}\right) ^\top \). Finally, let R, P and Q be the matrices of dimensions \((n-m)\times r\), \((n-m)\times p\) and \((n-m)\times q\), respectively, whose (i, j)th elements are given by
and set \(U_{\alpha }(\varvec{\gamma }):= \varvec{a}^\top T \varvec{v}\), \(U_{\varvec{\beta }}(\varvec{\gamma }):=R^\top T \varvec{v}\), \(U_{\varvec{\varphi }}(\varvec{\gamma }):=P^\top T \varvec{v}\) and \(U_{\varvec{\theta }}(\varvec{\gamma }):=Q^\top T \varvec{v}\).
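These block expressions reduce to a handful of matrix-vector products. As an illustration (not the authors' code), the following Python sketch computes the four components, assuming \(T\), \(\varvec{a}\), \(\varvec{v}\), \(R\), \(P\) and \(Q\) have already been built as defined above, with \(T\) stored as the vector of its diagonal entries \(1/g'(\mu _t)\):

```python
import numpy as np

def score_blocks(T_diag, a, v, R, P, Q):
    """Score components U_alpha = a^T T v, U_beta = R^T T v,
    U_varphi = P^T T v and U_theta = Q^T T v, with the diagonal
    matrix T stored as the vector of its diagonal entries."""
    Tv = T_diag * v                  # T v without forming the full matrix
    return a @ Tv, R.T @ Tv, P.T @ Tv, Q.T @ Tv
```

Storing only the diagonal of \(T\) keeps the cost at \(O\big ((n-m)(r+p+q)\big )\) instead of materializing an \((n-m)\times (n-m)\) matrix.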
Writing \(U_{\alpha _j}(\varvec{\gamma }):= \frac{\partial \ell (\varvec{\gamma })}{\partial \alpha _j}\) and \(U_{\phi }(\varvec{\gamma }):= \frac{\partial \ell (\varvec{\gamma })}{\partial \phi }\), the partial score vector is given by
The PMLE \(\widehat{\varvec{\gamma }}\) of \(\varvec{\gamma }\) is obtained as a solution of the non-linear system \(U(\varvec{\gamma })=\varvec{0}\), where \(\varvec{0}\) is the null vector in \(\mathbb {R}^{\kappa }\). There is no closed-form solution for this system, so the PMLE must be obtained numerically (Nocedal and Wright 1999). In this work, we use the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method (Press et al. 1992). In practice, to calculate \(\widehat{\varvec{\gamma }}\) from a sample, we initialize \(r_{t}=0\) and \(\mu _t=0\) for \(t\le \max \{p,q\}\) and compute \(\mu _t\) and \(r_t\) for \(t>\max \{p,q\}\) recursively from the data using (4). The BFGS algorithm also requires starting values for the parameters. The starting values of \(\alpha \), \(\varvec{\beta }\) and \(\varvec{\varphi }\) are set as the OLS estimate of
restricted to the observations where \(y\in (0,1)\). The parameter vector \(\varvec{\theta }\) is initialized as a null vector, as in Bayer et al. (2017), while the inflation parameters \(\alpha _0\) and \(\alpha _1\) are initialized as the sample proportions of zeros and ones, respectively.
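A minimal Python sketch of this initialization scheme follows; it is an illustration under stated assumptions (the logit link for \(g\) and hypothetical argument names), not the authors' implementation:

```python
import numpy as np

def initial_values(y, X, p):
    """Starting values as described in the text: OLS of g(y) on an
    intercept, the covariates in X and p lags of g(y), using only
    observations with y in (0,1); alpha_0 and alpha_1 start at the
    sample proportions of zeros and ones."""
    g = lambda u: np.log(u / (1.0 - u))          # logit link (assumed)
    alpha0_hat = np.mean(y == 0.0)
    alpha1_hat = np.mean(y == 1.0)
    interior = (y > 0.0) & (y < 1.0)
    gy = np.where(interior, g(np.clip(y, 1e-10, 1.0 - 1e-10)), 0.0)
    design, resp = [], []
    for t in range(p, len(y)):
        # keep rows where the response and all p lags are in (0,1)
        if interior[t] and interior[t - p:t].all():
            lags = gy[t - p:t][::-1]             # g(y_{t-1}), ..., g(y_{t-p})
            design.append(np.concatenate(([1.0], X[t], lags)))
            resp.append(gy[t])
    coef, *_ = np.linalg.lstsq(np.asarray(design), np.asarray(resp), rcond=None)
    r = X.shape[1]
    return alpha0_hat, alpha1_hat, coef[0], coef[1:1 + r], coef[1 + r:]
```

The moving average vector \(\varvec{\theta }\) would then simply be initialized as `np.zeros(q)`.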
1.2 A.2 Cumulative partial information matrix
In this section, we derive the cumulative partial information matrix, given by
Since the unconditional distribution of the proposed model is not available in closed form, \(K_n\) is the first step toward obtaining the asymptotic variance-covariance matrix of the PMLE. Under suitable assumptions (Fokianos and Kedem 2004), there exists a non-random information matrix, denoted by \(K(\varvec{\gamma })\), such that the weak convergence
holds, where \(K(\varvec{\gamma })\) is positive definite and hence invertible. The matrix \(K(\varvec{\gamma })^{-1}\) is the asymptotic variance-covariance matrix of the PMLE, presented in (6).
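In practice, this matrix yields standard errors and Wald-type confidence intervals for the PMLE. A brief sketch in Python, assuming `K_hat` holds the cumulative information matrix evaluated at \(\widehat{\varvec{\gamma }}\) (names are illustrative):

```python
import numpy as np
from scipy.stats import norm

def wald_ci(gamma_hat, K_hat, level=0.95):
    """Asymptotic Wald intervals gamma_j +/- z * se_j, where the
    standard errors come from the diagonal of K_hat^{-1}."""
    se = np.sqrt(np.diag(np.linalg.inv(K_hat)))
    z = norm.ppf(0.5 + level / 2.0)              # e.g. 1.96 for level=0.95
    return gamma_hat - z * se, gamma_hat + z * se
```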
For \(i,j\in \{4,\dots ,\kappa \}\) (that is, \(\gamma _j\notin \{\alpha _0,\alpha _1,\phi \}\)), it can be shown that
Since by Lemma A.1, \(\mathbb {E}\big (\partial \ell _t(\varvec{\gamma })/\partial \mu _t \mid {{\mathscr {F}}}_{t-1}\big )=0\), we arrive at
The second-order derivative of \(\ell _t(\varvec{\gamma })\) with respect to \(\mu _t\) is given by
Observe that
We have, for \(y_t\in (0,1)\),
hence
Second mixed derivatives related to \(\alpha _0\) and \(\alpha _1\) are obtained through direct differentiation of the log-likelihood. We have, for \(i\in \{0,1\}\) and \(j\in \{4,\dots ,\kappa \}\)
which, by Lemma A.1, yields
Writing \(\ \displaystyle {\frac{\partial c_t}{\partial \gamma _j}=\frac{\partial c_t}{\partial \mu _t}\frac{\partial \mu _t}{\partial \eta _t}\frac{\partial \eta _t}{\partial \gamma _j}=\frac{\alpha _0-\alpha _1}{g'(\mu _t)}\frac{\partial \eta _t}{\partial \gamma _j}}\), we have
and thus \(\displaystyle {\mathbb {E}\bigg (\frac{\partial ^2\ell _t(\varvec{\gamma })}{\partial \gamma _j \partial \alpha _i} \Bigm \vert {{\mathscr {F}}}_{t-1} \bigg ) =\frac{s_t^{(i)}}{g'(\mu _t)}\frac{\partial \eta _t}{\partial \gamma _j},}\) where
For \(j\in \{4,\dots ,\kappa \}\), it is easy to show that
Observe that, except for the indicator function, all terms in (A4) are \({{\mathscr {F}}}_{t-1}\)-measurable, so that
where
For \(i\in \{0,1\}\),
The first term has conditional expectation 0 (Lemma A.1), so that
Since \(\displaystyle {\frac{\partial ^2\ell _t(\varvec{\gamma })}{\partial \phi ^2}=\Big [\psi '(\phi )-\nu _t^2\psi '(\nu _t\phi )- (1-\nu _t)^2\psi '\big ([1-\nu _t]\phi \big )\Big ]I_{(0,1)}(y_t),}\) we have
Finally, for \(i,j\in \{0,1\}\),
Upon observing that \(P(y_t=i)=\alpha _i\big (1-i+(-1)^{i+1}\mu _t\big )\), it follows that
For \(i,j\in \{0,1\}\), let
\(\varvec{s}_i:=(s_{m+1}^{(i)},\dots ,s_n^{(i)})^\top \) and \(\varvec{d}:=(d_{m+1},\dots ,d_n)^\top \), where \(s_t^{(i)}\) and \(d_t\) are given in (A3) and (A5), respectively. Thus, the joint cumulative partial information matrix for \(\varvec{\gamma }\) based on a sample of size n is
where, for \(i,j\in \{0,1\}\), \(K_{(\alpha _i,\alpha _j)} = \textrm{tr}(A_{\{i,j\}})\), \(K_{(\alpha _i,\phi )}=K_{(\phi ,\alpha _i)}^\top =\textrm{tr}(B_i)\), \(K_{(\alpha _i,\alpha )} =K_{(\alpha ,\alpha _i)} = \varvec{s}_i^\top T \varvec{a}\), \(K_{(\alpha _i,\varvec{\beta })} =K_{(\varvec{\beta },\alpha _i)}^\top = \varvec{s}_i^\top T R\), \(K_{(\alpha _i,\varvec{\varphi })} =K_{(\varvec{\varphi },\alpha _i)}^\top = \varvec{s}_i^\top T P\), \(K_{(\alpha _i,\varvec{\theta })} =K_{(\varvec{\theta },\alpha _i)}^\top = \varvec{s}_i^\top T Q\), \(K_{(\phi ,\phi )}=\textrm{tr}(C)\), \(K_{(\phi ,\alpha )}=K_{(\alpha ,\phi )} = \varvec{d}^\top T \varvec{a}\), \(K_{(\phi ,\varvec{\beta })} =K_{(\varvec{\beta },\phi )}^\top = \varvec{d}^\top T R\), \(K_{(\phi ,\varvec{\varphi })} =K_{(\varvec{\varphi },\phi )}^\top = \varvec{d}^\top T P\), \(K_{(\phi ,\varvec{\theta })} =K_{(\varvec{\theta },\phi )}^\top = \varvec{d}^\top T Q\), \(K_{(\alpha ,\alpha )} = \varvec{a}^\top T^2 V \varvec{a}\), \(K_{(\alpha ,\varvec{\beta })} = K_{(\varvec{\beta },\alpha )}^\top = \varvec{a}^\top T^2 V R\), \(K_{(\alpha ,\varvec{\varphi })} = K_{(\varvec{\varphi },\alpha )}^\top = \varvec{a}^\top T^2 V P\), \(K_{(\alpha ,\varvec{\theta })} = K_{(\varvec{\theta },\alpha )}^\top = \varvec{a}^\top T^2 V Q\), \(K_{(\varvec{\beta },\varvec{\beta })} = R^\top T^2 V R\), \(K_{(\varvec{\beta },\varvec{\varphi })}=K_{(\varvec{\varphi },\varvec{\beta })}^\top = R^\top T^2 V P\), \(K_{(\varvec{\beta },\varvec{\theta })}=K_{(\varvec{\theta },\varvec{\beta })}^\top = R^\top T^2 V Q\), \(K_{(\varvec{\varphi },\varvec{\varphi })} = P^\top T^2 V P\), \(K_{(\varvec{\varphi },\varvec{\theta })}=K_{(\varvec{\theta },\varvec{\varphi })}^\top = P^\top T^2 V Q\), \(K_{(\varvec{\theta },\varvec{\theta })} = Q^\top T^2 V Q\).
Lemma A.1
With the notation in A.1,
Proof
Observe that
by standard results on the beta distribution. Hence
as asserted. \(\square \)
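The identity underlying the lemma, \(\mathbb {E}\big (\log \frac{y}{1-y}\big )=\psi (\nu \phi )-\psi \big ((1-\nu )\phi \big )\) for \(y\sim \textrm{Beta}\big (\nu \phi ,(1-\nu )\phi \big )\), is easy to check numerically; a quick Monte Carlo sketch in Python (the values of \(\nu \) and \(\phi \) are illustrative):

```python
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(42)
nu, phi = 0.3, 5.0
# draw from Beta(nu*phi, (1 - nu)*phi), the continuous component of the model
y = rng.beta(nu * phi, (1.0 - nu) * phi, size=200_000)
mc = np.mean(np.log(y / (1.0 - y)))                  # Monte Carlo estimate of E[y*]
exact = digamma(nu * phi) - digamma((1.0 - nu) * phi)
# mc and exact agree up to Monte Carlo error
```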
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Cite this article
Bayer, F.M., Pumi, G., Pereira, T.L. et al. Inflated beta autoregressive moving average models. Comp. Appl. Math. 42, 183 (2023). https://doi.org/10.1007/s40314-023-02322-w