Abstract
As an important extension of the varying-coefficient model, the partially linear varying-coefficient model has been widely studied in the literature. For varying-coefficient models, it is vital to simultaneously eliminate redundant covariates and separate the varying coefficients from the nonzero constant ones. In this paper, we consider penalized composite quantile regression to explore the model structure of ultra-high-dimensional varying-coefficient models. Under some regularity conditions, we study the convergence rate and asymptotic normality of the oracle estimator and prove that, with probability approaching one, the oracle estimator is a local solution of the nonconvex penalized composite quantile regression. Simulation studies indicate that the proposed method, as well as the oracle method, performs well in both low-dimensional and high-dimensional cases. An environmental data set is also analyzed with the proposed procedure.
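For orientation, composite quantile regression combines the quantile check loss over K equally spaced levels; in generic notation (a standard sketch with assumed symbols \(Y_i\), \({\textbf{x}}_i\), not the paper's exact display), the unpenalized objective is

```latex
\min_{\{b_{\tau_k}\}_{k=1}^{K},\,\boldsymbol{\beta}}\;
\sum_{k=1}^{K}\sum_{i=1}^{n}
\rho_{\tau_k}\!\left( Y_i - b_{\tau_k} - \mathbf{x}_i^{\top}\boldsymbol{\beta} \right),
\qquad
\rho_{\tau}(u) = u\,\{\tau - I(u<0)\},
\quad \tau_k = \tfrac{k}{K+1},
```

so that all K quantile levels share one slope vector \(\boldsymbol{\beta}\) while each level keeps its own intercept \(b_{\tau_k}\); a folded-concave penalty on the coefficients then drives the structure recovery studied in the paper.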
References
Ahmad, I., Leelahanon, S., Li, Q.: Efficient estimation of a semiparametric partially linear varying coefficient model. Ann. Statist. 33, 258–283 (2005)
Chen, Y., Bai, Y., Fung, W.: Structural identification and variable selection in high-dimensional varying-coefficient models. J. Nonparametr. Stat. 29, 258–279 (2017)
Chen, J., Chen, Z.: Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95, 759–771 (2008)
Cheng, M., Honda, T., Li, J., Peng, H.: Nonparametric independence screening and structure identification for ultra-high dimensional longitudinal data. Ann. Statist. 42, 1819–1849 (2014)
De Boor, C.: A Practical Guide to Splines. Springer, New York (2001)
Eubank, R.L., Huang, C.F., Maldonado, Y.M., Wang, N., Wang, S., Buchanan, R.J.: Smoothing spline estimation in varying-coefficient models. J. R. Stat. Soc. Ser. B Stat. Methodol. 66, 653–667 (2004)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Statist. Assoc. 96, 1348–1360 (2001)
Fan, J., Lv, J.: Nonconcave penalized likelihood with NP-dimensionality. IEEE Trans. Inform. Theory 57, 5467–5484 (2011)
Fan, J., Huang, T.: Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11, 1031–1057 (2005)
Fan, J., Zhang, W.: Simultaneous confidence bands and hypothesis testing in varying-coefficient models. Scand. J. Stat. 27, 715–731 (2000)
Hastie, T., Tibshirani, R.: Varying-coefficient models. J. R. Stat. Soc. Ser. B Stat. Methodol. 55, 757–796 (1993)
Hu, T., Xia, Y.: Adaptive semi-varying coefficient model selection. Statist. Sinica 22, 575–599 (2012)
Huang, J., Wei, F., Ma, S.: Semiparametric regression pursuit. Statist. Sinica 22, 1403–1426 (2012)
Hunter, D., Lange, K.: Quantile regression via an MM algorithm. J. Comput. Graph. Statist. 9, 60–77 (2000)
Jiang, Q., Wang, H., Xia, Y., Jiang, G.: On a principal varying coefficient model. J. Am. Statist. Assoc. 108, 228–236 (2013)
Kai, B., Li, R., Zou, H.: Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression. J. R. Stat. Soc. Ser. B Stat. Methodol. 72, 49–69 (2010)
Kai, B., Li, R., Zou, H.: New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann. Statist. 39, 305–332 (2011)
Kim, M.O.: Quantile regression with varying coefficients. Ann. Statist. 35, 92–108 (2007)
Kim, Y., Choi, H., Oh, H.: Smoothly clipped absolute deviation on high dimensions. J. Am. Statist. Assoc. 103, 1665–1673 (2008)
Koenker, R.: Quantile Regression. Cambridge University Press, New York (2005)
Leng, C.: A simple approach for varying-coefficient model selection. J. Statist. Plann. Infer. 139, 2138–2146 (2009)
Li, D., Ke, Y., Zhang, W.: Model selection and structure specification in ultra-high dimensional generalised semi-varying coefficient models. Ann. Statist. 43, 2676–2705 (2015)
Li, G., Peng, H., Zhang, J., Zhu, L.: Robust rank correlation based screening. Ann. Statist. 40, 1846–1877 (2012)
Lian, H., Lai, P., Liang, H.: Partially linear structure selection in Cox models with varying coefficients. Biometrics 69, 348–357 (2013)
Lian, H.: Variable selection for high-dimensional generalized varying-coefficient models. Statist. Sinica 22, 1563–1588 (2012)
Lian, H., Liang, H., Ruppert, D.: Separation of covariates into nonparametric and parametric parts in high-dimensional partially linear additive models. Statist. Sinica 25, 591–607 (2015)
Ma, X., Zhang, J.: A new variable selection approach for varying coefficient models. Metrika 79, 59–72 (2016)
Noh, H., Van Keilegom, I.: Efficient model selection in semivarying coefficient models. Electron. J. Stat. 6, 2519–2534 (2012)
Park, B.U., Mammen, E., Lee, Y.K., Lee, E.R.: Varying coefficient regression models, a review and new developments. Intern. Statist. Rev. 83, 36–64 (2015)
Qin, G., Mao, J., Zhu, Z.: Joint mean-covariance model in generalized partially linear varying coefficient models for longitudinal data. J. Statist. Comput. Simulat. 86, 1166–1182 (2016)
Qu, A., Li, R.: Quadratic inference functions for varying-coefficient models with longitudinal data. Biometrics 62, 379–391 (2006)
Sherwood, B., Wang, L.: Partially linear additive quantile regression in ultra-high dimension. Ann. Statist. 44, 288–317 (2016)
Stone, C.J.: Additive regression and other nonparametric models. Ann. Statist. 13, 689–705 (1985)
Tang, Y., Wang, H.J., Zhu, Z., Song, X.: A unified variable selection approach for varying coefficient models. Statist. Sinica 22, 601–628 (2012)
Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 58, 267–288 (1996)
Wang, L., Kai, B., Li, R.: Local rank inference for varying coefficient models. J. Am. Statist. Assoc. 104, 1631–1645 (2009)
Wang, D., Kulasekera, K.B.: Parametric component detection and variable selection in varying-coefficient partially linear models. J. Multiv. Anal. 112, 117–129 (2012)
Wang, K., Lin, L.: Robust structure identification and variable selection in partial linear varying coefficient models. J. Statist. Plann. Infer. 174, 153–168 (2016)
Wang, K., Lin, L.: Robust and efficient estimator for simultaneous model structure identification and variable selection in generalized partial linear varying coefficient models with longitudinal data. Statist. Pap. 60, 1649–1676 (2019)
Wang, M., Zhao, P., Kang, X.: Structure identification for varying coefficient models with measurement errors based on kernel smoothing. Statist. Pap. 61, 1841–1857 (2020)
Wang, H.J., Zhu, Z., Zhou, J.: Quantile regression in partially linear varying coefficient models. Ann. Statist. 37, 3841–3866 (2009)
Wei, Y., He, X.: Conditional growth charts (with discussion). Ann. Statist. 34, 2069–2097 (2006)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B Stat. Methodol. 68, 49–67 (2006)
Zhang, C.H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 38, 894–942 (2010)
Zhang, H.H., Cheng, G., Liu, Y.: Linear or nonlinear? Automatic structure discovery for partially linear models. J. Am. Statist. Assoc. 106, 1099–1112 (2011)
Zhou, Y., Liang, H.: Statistical inference for semiparametric varying-coefficient partially linear models with error-prone linear covariates. Ann. Statist. 37, 427–458 (2009)
Zou, H., Yuan, M.: Composite quantile regression and the oracle model selection theory. Ann. Statist. 36, 1108–1126 (2008)
Acknowledgements
We sincerely thank the Editor in Field of the journal, Professor Niansheng Tang, and two anonymous referees for their constructive comments and very useful suggestions, which greatly improved the first version of the paper.
Additional information
Jing Yang’s research was supported by the Natural Science Foundation of Hunan Province (Grant 2022JJ30368), the Scientific Research Fund of Hunan Provincial Education Department (Grant 22A0040) and the National Natural Science Foundation of China (Grants 11801168, 12071124). Tian’s research was supported by the National Natural Science Foundation of China (Grant 12171225). Lu’s research was supported by the Discovery Grants (RGPIN-2018-06466) from Natural Sciences and Engineering Research Council (NSERC) of Canada. Wang’s research was supported by the National Natural Science Foundation of China (Grant 12271294).
Appendix
Let C denote a generic constant that might assume different values at different places. To facilitate the proof, we define
We first give some technical lemmas which will be frequently used in the subsequent proof.
Lemma 8.1
Under conditions (C1)–(C6), the following properties hold:
(a) \(\sup _{i}|r_{ni}|=O_p(\sqrt{p_{n2}}m_n^{-r})\);

(b) the eigenvalues of \(\frac{m_n}{n}\varPi ^{\top }\varPi \) and \(\frac{m_n}{n}H^{\top }H\) are bounded in probability;

(c) \(\max _{i}\Vert \widehat{{\textbf{B}}}(U_i,{\textbf{x}}^{v}_{i})\Vert =O_p(\sqrt{p_{n2}m_n/n})\);

(d) \(\sum _{i=1}^n{ w_i\tilde{{\textbf{x}}}^{c}_i\widehat{{\textbf{B}}}(U_i,{\textbf{x}}^{v}_{i})^{\top } }=0\).
Proof
(a) Note that \({\textbf{B}}(U_i,{\textbf{x}}_{i})=\left( X_{i1}{\textbf{B}}(U_i)^{\top },\ldots ,X_{ip_n}{\textbf{B}}(U_i)^{\top } \right) ^{\top }\). Based on the result (2.3) and condition (C4), we have
which implies the result of (a).
(b) This conclusion follows directly from Lemma A.4 of [18], so we omit its proof here.
(c) It is obvious that \(\Vert \varPi _W^{-1}\Vert =O_p(\sqrt{m_n/n})\) by result (b) and condition (C3). Moreover, from the definition of \({\textbf{B}}(U_i,{\textbf{x}}^{v}_{i})\), we can verify \(\Vert {\textbf{B}}(U_i,{\textbf{x}}^{v}_{i})\Vert =O_p(\sqrt{p_{n2}})\) by noting that \(E(B_j^2(U))=m_n^{-1}\), \(j=1,\ldots ,m_n\). Then, we have
(d) As \(W\varPi -P^{\top }W\varPi =W\varPi -W\varPi (\varPi ^{\top }W\varPi )^{-1}\varPi ^{\top }W\varPi =W\varPi -W\varPi =0\), then
\(\square \)
Lemma 8.2
Under conditions (C1)–(C7), we have
(a) The eigenvalues of \({\textbf{X}}^{c*\top }{\textbf{X}}^{c*}/n\) are bounded in probability;

(b) \(S_n^*=S_n+o_p(1)\) and \(\varLambda _n^*=\varLambda _n+o_p(1)\).
Proof
(a) Note that
Thus, (a) is derived by condition (C4) and the fact that \(I-P\) is a projection matrix.
(b) Recall that \({\textbf{X}}^c=\varvec{\varPhi }^*+\varvec{\varDelta }_n\), so \(n^{-1/2}{\textbf{X}}^{c*}=n^{-1/2}(I-P){\textbf{X}}^{c}=n^{-1/2}\varvec{\varDelta }_n+n^{-1/2}(\varvec{\varPhi }^*-P{\textbf{X}}^{c})\). For \(l=1,\ldots ,p_{n1}\), let \(\varvec{\gamma }_l^* \in R^{m_n}\) be the solution of the weighted least-squares problem \(\varvec{\gamma }_l^*=\arg \min _{\varvec{\gamma }}\sum _{i=1}^n{ w_i (X_{il}-{\textbf{B}}(U_i,{\textbf{x}}^{v}_{i})^{\top }\varvec{\gamma })^2 }\), and define \(\hat{\phi }_l(U_i,{\textbf{x}}^{v}_{i})={\textbf{B}}(U_i,{\textbf{x}}^{v}_{i})^{\top }\varvec{\gamma }_l^*\). Then the (i, l)th element of \(P{\textbf{X}}^{c}\) is exactly \(\hat{\phi }_l(U_i,{\textbf{x}}^{v}_{i})\). Taking conditions (C1), (C2) and (C4) into account, it follows from Theorem 1 of [33] that \((\phi _l^*(U_i, {\textbf{x}}^v_i)-\hat{\phi }_l(U_i,{\textbf{x}}^v_i))^2=O_p\left( p_{n2}n^{-2r/(2r+1)} \right) \). Therefore,
where the last equality holds due to condition (C7).
Consequently, we have \(n^{-1/2}{\textbf{X}}^{c*}=n^{-1/2}\varvec{\varDelta }_n+o_p(1)\) and
where the last equality holds because \(n^{-1/2}\varvec{\varDelta }_n^{\top }W=O_p(1)\) from conditions (C4) and (C5). Similarly, we can prove \(\varLambda _n^*=\varLambda _n+o_p(1)\). \(\square \)
Note that
where \(\varvec{\nu }_k=e_k/\sqrt{n}\), \(e_k=(0,\ldots ,0,1,0,\ldots ,0)^{\top } \in R^K\) is a unit vector with the kth element being 1. Define
Lemma 8.3
Let \(\widetilde{\varvec{\theta }}_1=\sqrt{n}({\textbf{X}}^{c*\top }W{\textbf{X}}^{c*})^{-1}{\textbf{X}}^{c*\top }{\textbf{w}}\). Under conditions (C1)–(C7), we have (a) \(\Vert \widetilde{\varvec{\theta }}_1\Vert =O_p(\sqrt{p_{n1}})\); (b) \(\Vert \widehat{\varvec{\theta }}_1-\widetilde{\varvec{\theta }}_1\Vert =o_p(1)\).
Proof
(a) From the proof of Lemma 8.2 (b), we have \(n^{-1/2}{\textbf{X}}^{c*}=n^{-1/2}\varvec{\varDelta }_n+o_p(1)\) and \(n^{-1}{\textbf{X}}^{c*\top }W{\textbf{X}}^{c*}=S_n+o_p(1)\). Then, \(\widetilde{\varvec{\theta }}_1=S_n^{*-1}(n^{-1/2}{\textbf{X}}^{c*\top }{\textbf{w}})=S_n^{*-1}(( n^{-1/2}\varvec{\varDelta }_n+o_p(1))^{\top }{\textbf{w}})\), which implies \(\Vert \widetilde{\varvec{\theta }}_1\Vert =O_p(\sqrt{p_{n1}})\) by conditions (C3) and (C5).
(b) Define
Let \(d_n=p_{n1}+p_{n2}m_n\). By the results of Lemmas 8.1–8.3 in [42] and the fact from Lemma 8.1 (d) that \({\textbf{X}}^{c*}\) is orthogonal to \(W\varPi \), we have, for any finite positive constant M,
where \(E(R_i(\varvec{\omega },\varvec{\theta }_1,\widetilde{\varvec{\theta }}_1,\varvec{\theta }_2))\) denotes the conditional expectation \(E(R_i(\varvec{\omega },\varvec{\theta }_1,\widetilde{\varvec{\theta }}_1,\varvec{\theta }_2)\mid {\textbf{x}}_i, U_i)\). Applying the triangle inequality to the above two expressions yields
In addition, based on previous arguments in the proof of (a), we have
which means
Therefore, it follows that
On the other hand, condition (C5) indicates \(\frac{1}{2}(\varvec{\theta }_1-\widetilde{\varvec{\theta }}_1)^{\top }S_n(\varvec{\theta }_1-\widetilde{\varvec{\theta }}_1)>CM^2\) for any \(\Vert \varvec{\theta }_1-\widetilde{\varvec{\theta }}_1\Vert >M\) and some finite constant \(C>0\). This means
By the definition of \(\widehat{\varvec{\theta }}_1\) and the convexity of the functions \(\rho _{\tau _k}(\cdot )\), \(k=1,\ldots ,K\), (8.1) implies that \(P(\Vert \widehat{\varvec{\theta }}_1-\widetilde{\varvec{\theta }}_1\Vert >M)\rightarrow 0\) for any finite \(M>0\) as \(n\rightarrow \infty \), that is, \(\Vert \widehat{\varvec{\theta }}_1-\widetilde{\varvec{\theta }}_1\Vert =o_p(1)\). This completes the proof. \(\square \)
Proof of Theorem 3.4
(i) This part follows directly from results (a) and (b) of Lemma 8.3.

(ii) We keep using the notation of Lemma 8.3 and introduce some further definitions. Let \(a_n\) be a sequence of positive numbers and \(\varvec{\vartheta }=(\varvec{\omega }^{\top },\varvec{\theta }_1^{\top },\varvec{\theta }_2^{\top })^{\top }\). Define
$$\begin{aligned} Q_i(\varvec{\vartheta },a_n)&= \sum _{k=1}^K \rho _{\tau _k}\left( \varepsilon _{ik}-a_n\varvec{\nu }_k^{\top }\varvec{\omega }-a_n\tilde{{\textbf{x}}}^{c\top }_{i}\varvec{\theta }_1-a_n\widehat{{\textbf{B}}}(U_i,{\textbf{x}}^v_{i})^{\top } \varvec{\theta }_2-r_{ni} \right) , \\ D_i(\varvec{\vartheta },a_n)&= Q_i(\varvec{\vartheta },a_n)-Q_i(\varvec{\vartheta },0)-E\left( Q_i(\varvec{\vartheta },a_n)-Q_i(\varvec{\vartheta },0)\right) \\ &\quad +a_n \sum _{k=1}^K (\varvec{\nu }_k^{\top }\varvec{\omega }+\tilde{{\textbf{x}}}^{c\top }_{i}\varvec{\theta }_1+\widehat{{\textbf{B}}}(U_i,{\textbf{x}}^v_{i})^{\top } \varvec{\theta }_2) \psi _{\tau _k}( \varepsilon _{ik}). \end{aligned}$$
Observing that \(\rho _{\tau }(u)=|u|/2+(\tau -1/2)u\), we have
Let
and \(D_i(\varvec{\vartheta },a_n)\) can be rewritten as
We first prove that for any given \(\varpi >0\), there exists a constant \(L>0\) satisfying
Since
Following similar arguments to those in the proof of Lemma B.1 in [32], we can verify
under conditions \(p_{n2}^3/m_n^{2(r-1)}\rightarrow 0\) and \(p_{n1}/(m_np_{n2})\rightarrow 0\). Thus \(\varXi _1=o_p(1)\).
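Incidentally, the decomposition \(\rho _{\tau }(u)=|u|/2+(\tau -1/2)u\) of the check loss used above is elementary; as a quick numerical sanity check (the helper names below are illustrative only, not from the paper):

```python
import numpy as np

def rho(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - I(u < 0))."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def rho_decomposed(u, tau):
    """Equivalent form used in the proof: |u|/2 + (tau - 1/2) * u."""
    u = np.asarray(u, dtype=float)
    return np.abs(u) / 2 + (tau - 0.5) * u

# The two expressions agree on a grid of u for several quantile levels.
u = np.linspace(-5.0, 5.0, 1001)
for tau in (0.1, 0.25, 0.5, 0.75, 0.9):
    assert np.allclose(rho(u, tau), rho_decomposed(u, tau))
```

The decomposition splits the check loss into a symmetric part \(|u|/2\) and a linear part \((\tau -1/2)u\), which is what allows \(D_i(\varvec{\vartheta },a_n)\) to be rewritten term by term.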
Let \(s_{ni}=\varvec{\nu }_k^{\top }\varvec{\omega }+\tilde{{\textbf{x}}}^{c\top }_{i}\varvec{\theta }_1+\widehat{{\textbf{B}}}(U_i,{\textbf{x}}^v_{i})^{\top } \varvec{\theta }_2\). Applying Knight’s identity to \(\varXi _2\) yields
where the fourth equality holds by Lemma 8.1 (d) that \(\sum _{i=1}^n{ w_i\tilde{{\textbf{x}}}^{c}_i\widehat{{\textbf{B}}}(U_i,{\textbf{x}}^{v}_{i})^{\top } }=0\). Based on conditions (C3) and (C7), Lemma 8.1 (c), Lemma 8.2 as well as the constraint \(\Vert \varvec{\vartheta }\Vert <L\), we can obtain that \(\varXi _{21}=O_p(\Vert \varvec{\omega }\Vert ^2)\), \(\varXi _{22}=O_p(\Vert \varvec{\theta }_1\Vert ^2)\), \(\varXi _{23}=O_p(\Vert \varvec{\theta }_2\Vert ^2)\) and \(\varXi _{24}=O_p(1)\).
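For reference, Knight’s identity applied above can be stated in its standard form (with \(\psi _{\tau }(u)=\tau -I(u<0)\); this is the classical statement, not the paper’s exact display):

```latex
\rho_{\tau}(u - v) - \rho_{\tau}(u)
= -v\,\psi_{\tau}(u)
  + \int_{0}^{v}\left\{ I(u \le s) - I(u \le 0) \right\}\mathrm{d}s ,
```

where the first term is linear in \(v\) and the integral term produces the quadratic behavior that yields the \(O_p(\Vert \cdot \Vert ^2)\) bounds above.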
For the term \(\varXi _{25}\), note that \(\Vert {\textbf{r}}_n\Vert =O_p(\sqrt{np_{n2}}m_n^{-r})\) by Lemma 8.1 (a), where \({\textbf{r}}_n=(r_{n1},\ldots ,r_{nn})^{\top }\). Obviously, \(d_n^{-1/2}\sum _{i=1}^n{ \sum _{k=1}^K{ f_i(b_{0\tau _k}) }r_{ni} \varvec{\nu }_k^{\top }\varvec{\omega }}=O_p(\Vert \varvec{\omega }\Vert )\). Moreover, it follows from conditions (C3) and (C4), Lemma 8.2 (a) and the Cauchy–Schwarz inequality that
where the last equality holds by the condition \(p_{n1}/(m_np_{n2})\rightarrow 0\) in condition (C7). Similarly,
Hence, \(\varXi _2=O_p(\Vert \varvec{\vartheta }\Vert )\).
Next, we consider \(\varXi _3\). Obviously, \(E(\varXi _3)=0\) holds. Meanwhile, condition (C3) implies that \(\varvec{\omega }^{\top } \sum _{i=1}^n{ \sum _{k=1}^K{\varvec{\nu }_k\varvec{\nu }_k^{\top }}\psi _{\tau _k}^2( \varepsilon _{ik}) } \varvec{\omega }=O_p(\Vert \varvec{\omega }\Vert ^2)\) and there exists a constant \(C>0\) such that
By the definition of \(\tilde{{\textbf{x}}}^{c}_{i}\) and Lemma 8.2 (a), we have
which means \(\varXi _3=O_p(d_n^{-1/2} \Vert \varvec{\vartheta }\Vert ) \).
Therefore, (8.2) holds as the quadratic term dominates for sufficiently large L. Further, by convexity, we have \(\Vert \hat{\varvec{\vartheta }}\Vert =O_p(\sqrt{d_n})\) and then \(\Vert \varPi _W(\widehat{\varvec{\gamma }}_v-\varvec{\gamma }_{0v})\Vert =O_p(\sqrt{d_n})\) from the definition of \(\hat{\varvec{\vartheta }}\). Consequently, it follows from Lemma 8.1 (a) and conditions (C3) and (C4) that
This completes the proof. \(\square \)
Proof of Theorem 3.5
We first demonstrate \(A_n\varSigma _n^{-1/2}\widetilde{\varvec{\theta }}_1~\mathop \rightarrow \limits ^D~ N({\varvec{0}},G)\). In fact, by the definition of \(\widetilde{\varvec{\theta }}_1\) and the proof of Lemma 8.2 (b), we have
Rewrite \(A_n\varSigma _n^{-1/2} S_n^{-1} (n^{-1/2}\varvec{\varDelta }_n^{\top }{\textbf{w}})=\sum _{i=1}^n{ H_{ni} }\), where \(H_{ni}=n^{-1/2}A_n\varSigma _n^{-1/2} S_n^{-1}\varvec{\delta }_{i}w_i\). Then, \(E(H_{ni})={\varvec{0}}\) and
Moreover, based on conditions (C3), (C4) and (C6), we can verify that for any \(\kappa >0\),
where the last inequality holds because \(\lambda _{\max }(A_n^{\top }A_n)=\lambda _{\max }(A_nA_n^{\top })\) and G is positive definite, and the last equality holds due to \(p_{n1}^2/n\rightarrow 0\) in condition (C7). Thus, the Lindeberg–Feller condition holds for \(\{H_{ni}\}\) and \(A_n\varSigma _n^{-1/2}\widetilde{\varvec{\theta }}_1~\mathop \rightarrow \limits ^D~ N(\varvec{0},G)\) is proved. Noting that \(\widetilde{\varvec{\theta }}_1=\widehat{\varvec{\theta }}_1+o_p(1)\) from Lemma 8.3 (b) and \(\widehat{\varvec{\theta }}_1=\sqrt{n}(\widehat{\varvec{\beta }}_c -\varvec{\beta }_{0c})\), the proof of Theorem 3.5 is complete. \(\square \)
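For completeness, the Lindeberg–Feller central limit theorem invoked in this argument can be stated in its standard textbook form, written here with the symbols of this proof:

```latex
\text{If } \{H_{ni}\}_{i=1}^{n} \text{ are independent with } E(H_{ni}) = \mathbf{0},\quad
\sum_{i=1}^{n}\operatorname{Cov}(H_{ni}) \rightarrow G,
```
```latex
\text{and } \sum_{i=1}^{n} E\left( \Vert H_{ni}\Vert^{2}\, I\{\Vert H_{ni}\Vert > \kappa \} \right) \rightarrow 0
\text{ for every } \kappa > 0,
\quad\text{then}\quad
\sum_{i=1}^{n} H_{ni} \xrightarrow{\;D\;} N(\mathbf{0}, G).
```

The displayed bound above verifies exactly the truncated second-moment (Lindeberg) condition for the triangular array \(\{H_{ni}\}\).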
Lemma 8.4
Under conditions (C1)–(C8), we have
Proof
Let \(\xi _{ik}({\hat{b}}_{\tau _k}^o,\widehat{\varvec{\beta }}^{1o},\widehat{\varvec{\gamma }}^{2o})={\hat{b}}_{\tau _k}^o-b_{0\tau _k}+{\textbf{x}}_{i}^{1\top }(\widehat{\varvec{\beta }}^{1o}-\varvec{\beta }_{0}^{1}) +{\textbf{z}}_{i}^{2\top }(\widehat{\varvec{\gamma }}^{2o}-\varvec{\gamma }_{0}^{2})\), then we have
where \(R(u)=\sum _{j\in S_v}X_{ij}(\eta _{0j}(u)-\widetilde{{\textbf{B}}}(u)^{\top }\varvec{\gamma }_{0j})\), \(T_{nj}=\sum _{k=1}^K\sum _{i=1}^n X_{ij} [\tau _k-I(\varepsilon _i< b_{0\tau _k})]\),
and
with
Note that \(ET_{nj}=0\) and
then using condition (C4),
By Bernstein’s inequality and condition (C8),
Using Lemma 8.5, we have
and
Hence, \(\Pr \left( \max _{j\in {\widetilde{S}}_z}|g^1_j(\widehat{{\textbf{b}}}^o, \widehat{\varvec{\beta }}^o, \widehat{\varvec{\gamma }}^o)|\ge n\lambda _1\right) \rightarrow 0\).
Similarly, we can prove that \( \Pr \left( \max _{j\notin S_v}\Vert {\textbf{g}}^2_j(\widehat{{\textbf{b}}}^o, \widehat{\varvec{\beta }}^o, \widehat{\varvec{\gamma }}^o)\Vert \ge n\sqrt{m_n}\lambda _2\right) \rightarrow 0. \) \(\square \)
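For reference, the Bernstein inequality invoked in the proof above takes the following classical form for independent, mean-zero, bounded random variables (a standard statement, not the paper’s exact version):

```latex
P\!\left( \Bigl| \sum_{i=1}^{n} Z_i \Bigr| > t \right)
\le 2\exp\!\left( -\frac{t^{2}/2}{\sum_{i=1}^{n} E(Z_i^{2}) + Mt/3} \right),
\qquad |Z_i| \le M \ \text{a.s.},\quad E(Z_i) = 0 .
```

Applied to the centered summands of \(T_{nj}\), which are bounded since the indicator terms \(\tau _k-I(\varepsilon _i< b_{0\tau _k})\) lie in \([-1,1]\), this yields the exponential tail bound that condition (C8) converts into the stated probability convergence.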
Lemma 8.5
Let \(k_n=\sqrt{q_{n}}(\sqrt{ m_n/n}+m_n^{-r})\). For any finite positive constant C, define
under conditions (C1)–(C8), we have
where \(D^k_{n1j}\) and \(D^k_{n2j}\) are defined in (8.3) and (8.4).
Proof
For any \((b_{\tau _k},\varvec{\beta }^{1},\varvec{\gamma }^{2})\in {\mathcal {B}}(b_{0\tau _k},\varvec{\beta }_{0}^{1},\varvec{\gamma }_{0}^{2})\), it follows from conditions (C3), (C4) and (C8) that
Using conditions (C3) and (C4),
Since \(E(D^k_{n2j})^2=E(D^k_{n3j})^2+E(D^k_{n1j})^2-2E(D^k_{n3j}D^k_{n1j})=E(D^k_{n3j})^2-E(D^k_{n1j})^2\), by condition (C8), we have
\(\square \)
Proof of Theorem 4.3
We only need to show that \((\widehat{{\textbf{b}}}^o, \widehat{\varvec{\beta }}^o, \widehat{\varvec{\gamma }}^o)\) satisfies Equations (4.1)–(4.5) of Lemma 4.2. By the definition of \((\widehat{{\textbf{b}}}^o, \widehat{\varvec{\beta }}^o, \widehat{\varvec{\gamma }}^o)\), it is easy to see that (4.1) holds. Note that
and
Hence, (4.2) and (4.3) hold based on Theorem 4.1 and conditions (C8)–(C9), and (4.4) and (4.5) follow from Lemma 8.4. \(\square \)
Yang, J., Tian, GL., Lu, X. et al. Robust Model Structure Recovery for Ultra-High-Dimensional Varying-Coefficient Models. Commun. Math. Stat. (2023). https://doi.org/10.1007/s40304-023-00336-8
Keywords
- Asymptotic properties
- Composite quantile regression
- Nonconvex penalty
- Robust structure recovery
- Ultra-high dimension