Network vector autoregression with individual effects


In recent years, there has been great interest in using network structure to improve classic statistical models in cases where individuals are dependent. The network vector autoregressive (NAR) model assumes that each node’s response can be affected by the average of its connected neighbors. This article focuses on the problem of individual effects in NAR models, as different nodes have different effects on others. We propose a penalty method to estimate the NAR model with different individual effects and investigate some theoretical properties. Two simulation experiments are performed to verify effectiveness and tolerance compared with the original NAR model. The proposed model is also applied to an international trade data set.


References


  • Amini AA, Chen A, Bickel PJ, Levina E (2013) Pseudo-likelihood methods for community detection in large sparse networks. Ann Stat 41(4):2097–2122

  • Belkin M, Matveeva I, Niyogi P (2004) Regularization and semi-supervised learning on large graphs. In: International conference on computational learning theory. Springer, Berlin, pp 624–638

  • Carlin BP, Gelfand AE, Banerjee S (2014) Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC, London

  • Chen C, Xi R, Lin N (2018) Community detection by \(l_{0}\)-penalized graph Laplacian. Electron J Stat 12(1):1842–1866. https://doi.org/10.1214/18-EJS1445

  • Cheng HM, Ning YZ, Yin Z, Yan C, Liu X, Zhang ZY (2018a) Community detection in complex networks using link prediction. Mod Phys Lett B 32(01):1850004

  • Cheng J, Chen M, Zhou M, Gao S, Liu C, Liu C (2018b) Overlapping community change point detection in an evolving network. IEEE Transactions on Big Data

  • Dimitrios K, Vasileios O (2015) A network analysis of the Greek stock market. Procedia Econ Finance 33:340–349

  • Du N, Song L, Yuan M, Smola AJ (2012) Learning networks of heterogeneous influence. In: Advances in neural information processing systems, pp 2780–2788

  • Geng J, Bhattacharya A, Pati D (2018) Probabilistic community detection with unknown number of communities. J Am Stat Assoc 1–13

  • Hall P, Heyde CC (1980) Martingale limit theory and its application. Academic Press, New York

  • Hoff PD (2003) Random effects models for network data

  • Hoff PD (2018) Additive and multiplicative effects network models. arXiv:1807.08038

  • Krampe J (2019) Time series modeling on dynamic networks. Electron J Stat 13(2):4945–4976

  • Lee J, Li G, Wilson JD (2017) Varying-coefficient models for dynamic networks. arXiv:1702.03632

  • Li T, Levina E, Zhu J (2019) Prediction models for network-linked data. Ann Appl Stat 13(1):132–164

  • Petersen KB, Pedersen MS (2008) The matrix cookbook. Tech Univ Den 7(15):510

  • Saldana DF, Yu Y, Feng Y (2017) How many communities are there? J Comput Graph Stat 26(1):171–181

  • Tinbergen JJ (1962) Shaping the world economy; suggestions for an international economic policy

  • Walters P (2000) An introduction to ergodic theory, vol 79. Springer, Berlin

  • Weng H, Feng Y (2016) Community detection with nodal information. arXiv:1610.09735

  • Westveld AH, Hoff PD (2011) A mixed effects model for longitudinal relational and network data, with applications to international trade and conflict. Ann Appl Stat 5(2A):843–872

  • Wu YJ, Levina E, Zhu J (2017) Generalized linear models with low rank effects for network data. arXiv:1705.06772

  • Zhang Y, Levina E, Zhu J (2016) Community detection in networks with node features. Electron J Stat 10(2):3153–3178

  • Zhang Y, Levina E, Zhu J (2017) Estimating network edge probabilities by neighbourhood smoothing. Biometrika 104(4):771–783

  • Zhao Y, Levina E, Zhu J (2012) Consistency of community detection in networks under degree-corrected stochastic block models. Ann Stat 40(4):2266–2292

  • Zhao Y, Wu YJ, Levina E, Zhu J (2017) Link prediction for partially observed networks. J Comput Graph Stat 26(3):725–733

  • Zhu T, Li P, Chen K, Chen Y, Yu L (2018) Hyper-network based change point detection in dynamic networks

  • Zhu X, Pan R, Li G, Liu Y, Wang H (2017) Network vector autoregression. Ann Stat 45(3):1096–1123

  • Zhu X, Wang W, Wang H, Härdle WK (2019) Network quantile autoregression. J Econ


We thank the Editor and two referees for their helpful comments on the earlier versions of the paper.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Tao Huang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Dr. Bai’s research was partially supported by National Natural Science Foundation of China (NSFC) (No. 11771268). Dr. Huang’s research was partially supported by National Natural Science Foundation of China (NSFC) (No. 11871323).

Appendix



1. Proof of Theorem 1. Denote by \(\lambda _i(M)\) the ith eigenvalue of an arbitrary matrix \(M\in \mathbb {R}^{N\times N}\). To ensure the existence of a strictly stationary solution, it suffices that \(\max _i |\lambda _i(G)|<1\). Note that \(\max _i|\lambda _i(W)|\le 1\) by Zhu et al. (2017) and Carlin et al. (2014). Using (284) in Petersen and Pedersen (2008) and the Gelfand corollaries, we obtain

$$\begin{aligned} \begin{aligned} \max _{1\le i\le N}|\lambda _i(G)|&=\max _{1\le i\le N}|\lambda _i(W\varvec{\eta }+\beta _1I)|\\&=\max _{1\le i\le N}|\beta _1+\lambda _i(W\varvec{\eta })|\\&\le |\beta _1|+\max _{1\le i\le N}|\lambda _i(W\varvec{\eta })|\\&\le |\beta _1|+\max _{1\le i\le N}|\lambda _i(W)|\cdot \max _{1\le i\le N}|\lambda _i(\varvec{\eta })|\\&\le |\beta _1|+\max _{1\le i\le N}|\eta _i|<1. \end{aligned} \end{aligned}$$

Given the existence of the strictly stationary solution, its form and its distribution given \(\mathbb {Z}\) follow from Theorem 1 and Proposition 1 in Zhu et al. (2017).
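The eigenvalue bound in the proof of Theorem 1 can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes that \(\varvec{\eta }\) in \(G=W\varvec{\eta }+\beta _1I\) is read as the diagonal matrix \(\text {diag}(\eta _1,\ldots ,\eta _N)\) and that \(W\) is a row-normalized adjacency matrix. The random graph, the network size, and the values of \(\beta _1\) and \(\eta _i\) are hypothetical.

```python
import numpy as np


def row_normalize(A):
    """Divide each row by its sum; rows with no neighbours stay zero."""
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0
    return A / deg


def spectral_radius(M):
    """Largest eigenvalue modulus of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(M)))


rng = np.random.default_rng(0)
N = 50                                    # hypothetical network size

# Hypothetical sparse directed graph without self-loops.
A = (rng.random((N, N)) < 0.1).astype(float)
np.fill_diagonal(A, 0.0)
W = row_normalize(A)

beta1 = 0.3                               # hypothetical momentum effect
eta = rng.uniform(-0.6, 0.6, size=N)      # hypothetical individual effects

# G as written in the proof, with eta read as diag(eta_1, ..., eta_N).
G = W @ np.diag(eta) + beta1 * np.eye(N)

bound = abs(beta1) + np.max(np.abs(eta))  # |beta_1| + max_i |eta_i|
rho = spectral_radius(G)
print(f"spectral radius = {rho:.4f}, bound = {bound:.4f}")
```

Whenever the printed bound is below 1, the spectral radius is also below 1, which is the stationarity condition used in the proof.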

2. Proof of Theorem 2. Define


$$\begin{aligned} {\hat{\varSigma }}=\frac{1}{T}\begin{pmatrix} \sum \limits _{t=1}^T\mathbb {V}_{t-1}^\top \mathbb {V}_{t-1} &{}\ \sum \limits _{t=1}^T\mathbb {V}_{t-1}^\top \mathbb {X}_{t-1}\\ \sum \limits _{t=1}^T\mathbb {X}_{t-1}^\top \mathbb {V}_{t-1} &{}\ \sum \limits _{t=1}^T\mathbb {X}_{t-1}^\top \mathbb {X}_{t-1}+\lambda L\\ \end{pmatrix}\quad \text {and}\quad {\hat{\varSigma }}_{xe}=\frac{1}{T}\begin{pmatrix} \sum \limits _{t=1}^T\mathbb {V}_{t-1}^\top {\mathcal {E}}_t\\ \sum \limits _{t=1}^T\mathbb {X}_{t-1}^\top {\mathcal {E}}_t\\ \end{pmatrix}. \end{aligned}$$

According to (6) and \(\mathbb {Y}_t=\mathbb {V}_{t-1}\theta +\mathbb {X}_{t-1}\eta +{\mathcal {E}}_t\), we have

$$\begin{aligned} \begin{aligned} \begin{pmatrix} {\hat{\theta }}\\ {\hat{\eta }}\\ \end{pmatrix}&=\begin{pmatrix} \theta \\ \eta \end{pmatrix}-{\hat{\varSigma }}^{-1} \begin{pmatrix} \mathbf {0} &{}\mathbf {0} \\ \mathbf {0} &{}\lambda L/T \end{pmatrix}\begin{pmatrix} \theta \\ \eta \\ \end{pmatrix} +{\hat{\varSigma }}^{-1} {\hat{\varSigma }}_{xe}. \end{aligned} \end{aligned}$$
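The decomposition above is an exact algebraic identity, so it can be verified on simulated data. The sketch below makes several assumptions that are not taken from the paper: the estimator in (6) is taken to be the ridge-type closed form implied by the displayed \({\hat{\varSigma }}\), the penalty matrix \(L\) is taken to be a graph Laplacian, and the ring network, dimensions, penalty level and coefficient values are all hypothetical. It uses \(\mathbb {V}_{t-1}=({\mathbf {1}},\mathbb {Y}_{t-1},\mathbb {Z})\) and \(\mathbb {X}_{t-1}=W\text {diag}(\mathbb {Y}_{t-1})\), which is consistent with the entries \(S_{12},\ldots ,S_{44}\) given in Step 1.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, p = 8, 200, 2          # hypothetical sizes
lam = 5.0                    # hypothetical penalty level lambda

# Hypothetical ring network; L is ASSUMED to be its graph Laplacian.
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = 1.0
    A[i, (i - 1) % N] = 1.0
W = A / A.sum(axis=1, keepdims=True)
L = np.diag(A.sum(axis=1)) - A

Z = rng.normal(size=(N, p))                                   # nodal covariates
theta = np.concatenate(([0.2, 0.3], rng.normal(scale=0.5, size=p)))
eta = rng.uniform(-0.4, 0.4, size=N)                          # individual effects
truth = np.concatenate([theta, eta])


def design(y_prev):
    """Stack V_{t-1} = (1, Y_{t-1}, Z) and X_{t-1} = W diag(Y_{t-1})."""
    V = np.column_stack([np.ones(N), y_prev, Z])
    X = W @ np.diag(y_prev)
    return np.hstack([V, X])


# Simulate Y_t = V_{t-1} theta + X_{t-1} eta + E_t with sigma^2 = 1.
y = np.zeros(N)
rows, resp, noise = [], [], []
for _ in range(T):
    D_t = design(y)
    e = rng.normal(size=N)
    y = D_t @ truth + e
    rows.append(D_t)
    resp.append(y)
    noise.append(e)

D = np.vstack(rows)
Y = np.concatenate(resp)
E = np.concatenate(noise)

q = 2 + p
P = np.zeros((q + N, q + N))
P[q:, q:] = lam * L                       # penalty acts on eta only

Sigma_hat = (D.T @ D + P) / T
Sigma_xe = D.T @ E / T

# Penalised least-squares estimate and the right-hand side of the identity.
est = np.linalg.solve(Sigma_hat, D.T @ Y / T)
decomp = (truth
          - np.linalg.solve(Sigma_hat, (P / T) @ truth)
          + np.linalg.solve(Sigma_hat, Sigma_xe))

print(np.max(np.abs(est - decomp)))       # agreement up to rounding error
```

The middle term is the shrinkage bias introduced by the penalty, and the last term is the noise contribution whose limit is studied in (A.2).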

Then Theorem 2 holds if

$$\begin{aligned} {\hat{\varSigma }}{\mathop {\longrightarrow }\limits ^{P}}\varSigma , \end{aligned}\tag{A.1}$$
$$\begin{aligned} \sqrt{T}{\hat{\varSigma }}_{xe}{\mathop {\longrightarrow }\limits ^{D}}N(0,\sigma ^2\varSigma ), \end{aligned}\tag{A.2}$$

as \(T\rightarrow \infty \). We will prove (A.1) and (A.2) respectively.

Step 1. Proof of (A.1). We show that

$$\begin{aligned} {\hat{\varSigma }}=\begin{pmatrix} N &{} S_{12} &{} S_{13} &{} S_{14}\\ &{} S_{22} &{} S_{23} &{} S_{24}\\ &{} &{} S_{33} &{} S_{34}\\ &{} &{} &{} S_{44}\\ \end{pmatrix}{\mathop {\longrightarrow }\limits ^{P}}\begin{pmatrix} N, &{} {\mathbf {1}}^\top \mu , &{} {\mathbf {1}}^\top \mathbb {Z}, &{} ({\mathbf {1}}^\top W)\circ \mu ^\top \\ &{} \mu ^\top \mu +\text {tr}\{\varGamma (0)\}, &{} \mu ^\top \mathbb {Z}, &{} \text {diag}[W^\top (\varGamma (0)+\mu \mu ^\top )]\\ &{} &{} \mathbb {Z}^\top \mathbb {Z}, &{} (\mathbb {Z}^\top W)\circ \mu ^\top \\ &{} &{} &{} \mu \mu ^\top +\varGamma (0)\\ \end{pmatrix} \end{aligned}$$

where


$$\begin{aligned} S_{12}= & {} \frac{1}{T}\sum _{t=1}^T {\mathbf {1}}^\top \mathbb {Y}_{t-1},\quad S_{13}=\frac{1}{T}\sum _{t=1}^T {\mathbf {1}}^\top \mathbb {Z}, \quad S_{14}=\frac{1}{T}\sum _{t=1}^T ({\mathbf {1}}^\top W)\circ \mathbb {Y}_{t-1}^\top , \\ S_{22}= & {} \frac{1}{T}\sum _{t=1}^T \mathbb {Y}_{t-1}^\top \mathbb {Y}_{t-1}, \quad S_{23}=\frac{1}{T}\sum _{t=1}^T \mathbb {Y}_{t-1}^\top \mathbb {Z}, \quad S_{24}=\frac{1}{T}\sum _{t=1}^T (\mathbb {Y}_{t-1}^\top W)\circ \mathbb {Y}_{t-1}^\top , \\ S_{33}= & {} \frac{1}{T}\sum _{t=1}^T \mathbb {Z}^\top \mathbb {Z}, \quad S_{34}=\frac{1}{T}\sum _{t=1}^T (\mathbb {Z}^\top W)\circ \mathbb {Y}_{t-1}^\top , \quad \\ S_{44}= & {} \frac{1}{T}\sum _{t=1}^T (W^\top W)\circ (\mathbb {Y}_{t-1}\mathbb {Y}_{t-1}^\top )+\frac{\lambda }{T}L. \end{aligned}$$

Because \(\mathbb {Z}\) does not depend on t and N is fixed, \(S_{13}={\mathbf {1}}^\top \mathbb {Z}\) and \(S_{33}=\mathbb {Z}^\top \mathbb {Z}\) hold exactly. For the remaining entries of \({\hat{\varSigma }}\), recall from (7) that \(\mathbb {Y}_t=(I-G)^{-1}{\mathcal {B}}_0+\sum _{j=0}^{\infty } G^j{\mathcal {E}}_{t-j}\) with \(\mu =(I-G)^{-1}{\mathcal {B}}_0\). By the ergodic theorem as described in Walters (2000), we have \(S_{12}{\mathop {\longrightarrow }\limits ^{P}}{\mathbf {1}}^\top \mu \), \(S_{14}{\mathop {\longrightarrow }\limits ^{P}}({\mathbf {1}}^\top W)\circ \mu ^\top \), \(S_{22}{\mathop {\longrightarrow }\limits ^{P}}\mu ^\top \mu +\text {tr}\{\varGamma (0)\}\), \(S_{23}{\mathop {\longrightarrow }\limits ^{P}}\mu ^\top \mathbb {Z}\), \(S_{34}{\mathop {\longrightarrow }\limits ^{P}}(\mathbb {Z}^\top W)\circ \mu ^\top \), \(S_{44}{\mathop {\longrightarrow }\limits ^{P}}\mu \mu ^\top +\varGamma (0)\), where \(\varGamma (0)\) is defined in Proposition 1.

For the convergence of \(S_{24}\), we use the basic property of the covariance of any n-dimensional random vector \(\mathbf {y}\), namely \(\mathbb {E}(\mathbf {y}\mathbf {y}^\top )=\text {cov}(\mathbf {y})+\mu _{\mathbf {y}}\mu _{\mathbf {y}}^\top \), where \(\mu _{\mathbf {y}}=\mathbb {E}(\mathbf {y})\). Entrywise, \((\mathbb {Y}_t^\top W)\circ \mathbb {Y}_t^\top =((w_{11}Y_{1t}+\ldots +w_{N1}Y_{Nt})Y_{1t},\cdots ,(w_{1N}Y_{1t}+\ldots +w_{NN}Y_{Nt})Y_{Nt})= \text {diag}(W^\top \mathbb {Y}_t\mathbb {Y}_t^\top )\), where \(\text {diag}(\cdot )\) collects the diagonal entries into a row vector. So \(S_{24}{\mathop {\longrightarrow }\limits ^{P}}\text {diag}[W^\top (\varGamma (0)+\mu \mu ^\top )]\). This completes the proof of (A.1).
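The Hadamard-product forms of \(S_{14}\), \(S_{24}\) and \(S_{44}\) rest on deterministic matrix identities, and the limit of \(S_{12}\) can be illustrated by simulation. The sketch below assumes \(\mathbb {X}_{t-1}=W\text {diag}(\mathbb {Y}_{t-1})\), which matches those forms, and reads \(G\) as \(W\text {diag}(\eta )+\beta _1I\). The weight matrix, the intercept vector `B0` and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 6, 20000                      # hypothetical sizes

# Hypothetical dense weighted network: zero diagonal, rows sum to one.
W = rng.random((N, N))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

# Deterministic identities for one draw y, with X = W diag(y).
y = rng.normal(size=N)
X = W @ np.diag(y)
ones = np.ones(N)
s14_ok = np.allclose(ones @ X, (ones @ W) * y)                    # 1'X
s24_ok = np.allclose((y @ W) * y, np.diag(W.T @ np.outer(y, y)))  # Y'X
s44_ok = np.allclose(X.T @ X, (W.T @ W) * np.outer(y, y))         # X'X

# Time average S_12 compared with its ergodic limit 1' mu.
beta1 = 0.3
eta = rng.uniform(-0.3, 0.3, size=N)
G = W @ np.diag(eta) + beta1 * np.eye(N)
B0 = np.ones(N)                                # hypothetical intercept vector
mu = np.linalg.solve(np.eye(N) - G, B0)        # mu = (I - G)^{-1} B0

y_t = mu.copy()                                # start at the stationary mean
s12 = 0.0
for _ in range(T):
    s12 += y_t.sum() / T                       # accumulates 1' Y_{t-1} / T
    y_t = B0 + G @ y_t + rng.normal(size=N)

gap = abs(s12 - mu.sum())
print(s14_ok, s24_ok, s44_ok, gap)
```

The three identities hold exactly for any `y`, while the gap between the time average and \({\mathbf {1}}^\top \mu \) shrinks as `T` grows.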

Step 2. Proof of (A.2). Set \(\sigma ^2=1\) in this step for simplicity. It suffices to show that \(T^{1/2}\delta ^\top {\hat{\varSigma }}_{xe}=T^{-1/2}\sum _t\delta ^\top (\mathbb {V}_{t-1}, \mathbb {X}_{t-1})^\top {\mathcal {E}}_t{\mathop {\longrightarrow }\limits ^{D}}N(0,\delta ^\top \varSigma \delta )\) for any \(\delta \in \mathbb {R}^{2+p+N}\). To this end, denote \(\xi _t=T^{-1/2}\delta ^\top (\mathbb {V}_{t-1}, \mathbb {X}_{t-1})^\top {\mathcal {E}}_t\), \(\mathbb {S}_t=\sum _{s=1}^t \xi _s\) and \({\mathscr {F}}_t=\sigma \{{\mathcal {E}}_s,-\infty <s\le t\}\). Then \(\{\mathbb {S}_t,{\mathscr {F}}_t,t>-\infty \}\) is a zero-mean martingale, and it is equivalent to prove \(\mathbb {S}_T{\mathop {\longrightarrow }\limits ^{D}}N(0,\delta ^\top \varSigma \delta )\). According to (A.1), we have

$$\begin{aligned} (\delta ^\top \varSigma \delta )^{-1} \sum _{t=1}^T\mathbb {E}(\xi _t^2|{\mathscr {F}}_{t-1}){\mathop {\longrightarrow }\limits ^{P}}1, \end{aligned}\tag{A.3}$$

and as \(T\rightarrow \infty \),

$$\begin{aligned}&(\delta ^\top \varSigma \delta )^{-1}\sum _{t=1}^T\mathbb {E}[\xi _t^2 I(|\xi _t|\ge \epsilon (\delta ^\top \varSigma \delta )^{1/2})] \\&\quad \le \frac{\epsilon ^{-2}}{(\delta ^\top \varSigma \delta )^2}\sum _{t=1}^T\mathbb {E}(\xi _t^4)\\&\quad \le \frac{\epsilon ^{-2}}{(\delta ^\top \varSigma \delta )^2T^2}\sum _{t=1}^T\mathbb {E}\Big (\delta ^\top (\mathbb {V}_{t-1}, \mathbb {X}_{t-1})^\top (\mathbb {V}_{t-1}, \mathbb {X}_{t-1})\delta \Big )^2 \rightarrow 0. \end{aligned}\tag{A.4}$$

The limit holds because, under strict stationarity and provided the relevant fourth moments are finite, \(\mathbb {E}\Big (\delta ^\top (\mathbb {V}_{t-1}, \mathbb {X}_{t-1})^\top (\mathbb {V}_{t-1}, \mathbb {X}_{t-1})\delta \Big )^2\) is bounded uniformly in t, so the sum is of order T and the whole bound is of order \(T^{-1}\). Therefore, by (A.3), (A.4) and the central limit theorem for martingale sequences (Hall and Heyde 1980, pp. 9–10), we have \(\mathbb {S}_T=T^{1/2}\delta ^\top {\hat{\varSigma }}_{xe}{\mathop {\longrightarrow }\limits ^{D}}N(0,\delta ^\top \varSigma \delta )\). This completes the proof.
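
For readers who want the first inequality in (A.4) spelled out, it is the usual truncation step. Write \(a=\epsilon (\delta ^\top \varSigma \delta )^{1/2}\), a shorthand introduced here only for this remark. On the event \(|\xi _t|\ge a\) we have \(1\le \xi _t^2/a^2\), so

$$\begin{aligned} \mathbb {E}[\xi _t^2 I(|\xi _t|\ge a)]\le \mathbb {E}\Big [\xi _t^2\cdot \frac{\xi _t^2}{a^2}\Big ]=\frac{\mathbb {E}(\xi _t^4)}{\epsilon ^{2}\,\delta ^\top \varSigma \delta }. \end{aligned}$$

Multiplying by \((\delta ^\top \varSigma \delta )^{-1}\) and summing over \(t=1,\ldots ,T\) gives the second line of (A.4).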

Cite this article

Tang, Y., Bai, Y. & Huang, T. Network vector autoregression with individual effects. Metrika 84, 875–893 (2021). https://doi.org/10.1007/s00184-020-00805-y

  • DOI: https://doi.org/10.1007/s00184-020-00805-y

