Poisson subsampling-based estimation for growing-dimensional expectile regression in massive data

  • Original Paper
  • Published: Statistics and Computing

Abstract

As an effective tool for data analysis, expectile regression is widely used in the fields of statistics, econometrics and finance. However, most studies focus on the case where the sample size is not massive and the dimension is low or fixed. This paper studies the parameter estimation and inference for large-scale expectile regression when the number of parameters grows to infinity. Specifically, an inverse probability weighted asymmetric least squares estimator based on Poisson subsampling (ALS-P) is proposed. Theoretically, the convergence rate and asymptotic normality for ALS-P are established. Furthermore, the optimal subsampling probabilities based on the L-optimality criterion are derived. Finally, extensive simulations and two real-world datasets are conducted to illustrate the effectiveness of the proposed methods.
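Concretely, the ALS-P idea can be sketched as follows: draw a Poisson subsample (each observation kept independently with probability \(\pi _i\)), then fit an inverse probability weighted asymmetric least squares regression on the retained rows. The sketch below is a minimal illustration only, with simulated data, uniform subsampling probabilities in place of the L-optimal ones derived in the paper, and an iteratively reweighted least squares solver; all names are ours, not the paper's code.

```python
import numpy as np

# Minimal sketch of Poisson-subsampled, inverse-probability-weighted
# expectile regression (NOT the paper's exact algorithm): uniform pi_i,
# simulated data, IRLS solver.
rng = np.random.default_rng(42)
N, d, tau = 100_000, 5, 0.7
X = np.column_stack([np.ones(N), rng.normal(size=(N, d))])  # intercept + d covariates
beta_true = np.arange(1.0, d + 2)                           # [1, 2, ..., d+1]
y = X @ beta_true + rng.normal(size=N)

n = 2_000                              # expected subsample size
pi = np.full(N, n / N)                 # uniform Poisson sampling probabilities
keep = rng.random(N) < pi              # each row kept independently
Xs, ys, ws = X[keep], y[keep], 1.0 / pi[keep]   # inverse-probability weights

beta = np.zeros(d + 1)
for _ in range(100):                   # IRLS for asymmetric least squares
    r = ys - Xs @ beta
    w = ws * np.abs(tau - (r <= 0))    # expectile weights times IPW weights
    beta_new = np.linalg.solve(Xs.T @ (w[:, None] * Xs), Xs.T @ (w * ys))
    if np.max(np.abs(beta_new - beta)) < 1e-10:
        beta = beta_new
        break
    beta = beta_new
```

With \(\tau \ne 0.5\) the fitted intercept absorbs the \(\tau \)-expectile of the noise, while the slope estimates target the slopes of \(\varvec{\beta }\) themselves.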


Data availability

Data is provided within the manuscript or supplementary information files.

References


Acknowledgements

The authors thank the Editor and anonymous reviewers for their constructive comments and suggestions, which greatly improved the quality of the current work. The research of **aochao **a was supported by the National Natural Science Foundation of China (Grant Number 11801202) and the Fundamental Research Funds for the Central Universities (Grant Number 2021CDJQY-047). The research of Zhimin Zhang was supported by the National Natural Science Foundation of China [Grant Numbers 12271066, 12171405, 11871121].

Author information

Contributions

**aoyan Li: Conceptualization, Methodology, Software, Validation, Writing - original draft. **aochao **a: Supervision, Writing - review & editing. Zhimin Zhang: Supervision, Writing - review & editing.

Corresponding author

Correspondence to **aochao **a.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proofs

We provide the proofs of Theorems 1, 2 and 5 for diverging dimension below; the fixed-dimension case is the special case \(p_n=p\). In the proofs, we use C to denote a generic positive constant independent of \((n,N,p_n)\), whose magnitude may change from line to line. Define \( \mathcal {L}(\varvec{\beta })=\frac{1}{N}\sum _{i \in \mathcal {I}_{\mathcal {S}}}\frac{1}{\pi _i}\ell _{\tau }(y_i-\varvec{x}_i^{T}\varvec{\beta })=\frac{1}{N}\sum _{i=1}^{N} \frac{\eta _i}{\pi _i}\ell _{\tau }(y_i-\varvec{x}_i^{T}\varvec{\beta })\) and \({Q}(\varvec{\beta }) =\frac{1}{N} \sum _{i=1}^{N} \frac{\eta _i}{\pi _i}\phi _{\tau }(y_i-\varvec{x}_i^{T}\varvec{\beta })\varvec{x}_i\).
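As an illustration of these definitions, the following sketch (simulated data, uniform \(\pi _i\); all names are ours) evaluates \(\mathcal {L}(\varvec{\beta })\) and \({Q}(\varvec{\beta })\) and confirms by central finite differences that \(Q\) equals \(-\tfrac{1}{2}\) times the gradient of \(\mathcal {L}\) (which holds almost everywhere, since \(\ell _{\tau }\) is continuously differentiable with \(\ell _{\tau }'=2\phi _{\tau }\)):

```python
import numpy as np

# Illustrative implementation of L(beta) and Q(beta) from the appendix
# notation, on simulated data with uniform Poisson probabilities.
rng = np.random.default_rng(1)
N, p, tau = 500, 4, 0.7
X = rng.normal(size=(N, p))
y = X @ np.ones(p) + rng.normal(size=N)
pi = np.full(N, 0.2)                        # illustrative uniform pi_i
eta = (rng.random(N) < pi).astype(float)    # inclusion indicators eta_i

def ell_tau(s):                             # ell_tau(s) = s^2 |tau - I(s<=0)|
    return s**2 * np.abs(tau - (s <= 0))

def phi_tau(s):                             # phi_tau(s) = s |tau - I(s<=0)|
    return s * np.abs(tau - (s <= 0))

def L(beta):                                # subsampled IPW objective
    r = y - X @ beta
    return np.sum(eta / pi * ell_tau(r)) / N

def Q(beta):                                # weighted score-type quantity
    r = y - X @ beta
    return (eta / pi * phi_tau(r)) @ X / N

# Central finite differences: grad L(beta) should equal -2 Q(beta).
beta = np.zeros(p)
h = 1e-6
grad = np.array([(L(beta + h * e) - L(beta - h * e)) / (2 * h)
                 for e in np.eye(p)])
assert np.allclose(grad, -2 * Q(beta), atol=1e-5)
```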

Lemma A.1

(Tu and Wang 2020, Lemma A.1) For \(\ell _{\tau }(s)=s^2|\tau -I(s\le 0)|, \phi _{\tau }(s)=s|\tau -I(s \le 0)|, \psi _{\tau }(s)=|\tau -I(s \le 0)|, \Gamma (s,t)= I(s<0)-I(s+t<0)\), we have

  1. (i)

    \(\ell _{\tau }(s+t)-\ell _{\tau }(s)=2\phi _{\tau }(s)t+\psi _{\tau }(s)t^2 + (2\tau -1)(s+t)^2\Gamma (s,t)\),

  2. (ii)

    \(\phi _{\tau }(s+t)-\phi _{\tau }(s)=t\psi _{\tau }(s)+(2\tau -1)(s+t)\Gamma (s,t)\).
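Both identities are straightforward case checks, and they can be verified numerically on random pairs \((s,t)\). A quick sketch (note the identities can fail on the measure-zero boundary \(s=0\) or \(s+t=0\), which random continuous draws avoid):

```python
import numpy as np

# Numerical spot-check of Lemma A.1 (i)-(ii) at tau = 0.7 on random (s, t).
rng = np.random.default_rng(0)
tau = 0.7
s, t = rng.normal(size=10_000), rng.normal(size=10_000)

ell = lambda u: u**2 * np.abs(tau - (u <= 0))   # ell_tau
phi = lambda u: u * np.abs(tau - (u <= 0))      # phi_tau
psi = lambda u: np.abs(tau - (u <= 0))          # psi_tau
Gam = (s < 0).astype(float) - (s + t < 0)       # Gamma(s, t)

# Identity (i)
lhs1 = ell(s + t) - ell(s)
rhs1 = 2 * phi(s) * t + psi(s) * t**2 + (2 * tau - 1) * (s + t) ** 2 * Gam
assert np.allclose(lhs1, rhs1)

# Identity (ii)
lhs2 = phi(s + t) - phi(s)
rhs2 = t * psi(s) + (2 * tau - 1) * (s + t) * Gam
assert np.allclose(lhs2, rhs2)
```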

Lemma A.2

Let \(a_n=\sqrt{p_n/n}, \Gamma (s,t)= I(s<0)-I(s+t<0),\varvec{u} \in \mathbb {R}^{p_n} \) such that \(\Vert \varvec{u}\Vert \le c\), where c is a constant. Under the conditions (C2)-(C4), for \(k=2,8\), we have

$$\begin{aligned}&\left| E\left[ (\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^k\Gamma (\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\right] \right| \\&\quad =O\left( a_n^{k+1}p_n^{\frac{k}{2}}\Vert \varvec{u}\Vert ^{k+1}\right) . \end{aligned}$$

Proof

It follows that

$$\begin{aligned}&\left| E\left[ (\varepsilon _i-a_n\varvec{x}_i^T\varvec{u})^k \Gamma (\varepsilon _i,-a_n\varvec{x}_i^T\varvec{u})\right] \right| \\ {}&=\left| E\left\{ E\left[ (\varepsilon _i-a_n\varvec{x}_i^T\varvec{u})^k \Gamma (\varepsilon _i,-a_n\varvec{x}_i^T\varvec{u})|\varvec{x}\right] \right\} \right| \\&\quad = \left| E \left[ \int _{a_n\varvec{x}_i^T\varvec{u}}^0 (\varepsilon _i-a_n\varvec{x}_i^T\varvec{u})^k f(\varepsilon _i|\varvec{x})d\varepsilon _i \right] \right| \\&\quad =\left| E \left[ \int _{0}^{-a_n\varvec{x}_i^T\varvec{u}} t_i^k f(t_i+a_n\varvec{x}_i^T\varvec{u}|\varvec{x})dt_i \right] \right| \\&\quad \le \frac{C}{k+1}a_n^{k+1}E|\varvec{x}_i^{T}\varvec{u}|^{k+1}\\&\quad \le Ca_n^{k+1}\left[ E\left( \varvec{x}_i^{T}\varvec{u}\right) ^{2k}\right] ^{\frac{1}{2}} \left[ E\left( \varvec{x}_i^{T}\varvec{u}\right) ^{2}\right] ^{\frac{1}{2}}\\&\quad \le Ca_n^{k+1}\Vert \varvec{u}\Vert ^{k+1}\left[ p_n^k \max _{m}E\left( {x}_{im}\right) ^{2k}\right] ^{\frac{1}{2}}\\&\qquad \times \left\{ \lambda _{max}\left[ E\left( \varvec{x}_i\varvec{x}_i^T \right) \right] \right\} ^{\frac{1}{2}} \\&\quad = O\left( a_n^{k+1}p_n^{\frac{k}{2}}\Vert \varvec{u}\Vert ^{k+1}\right) , \end{aligned}$$

where the first inequality invokes condition (C2), the second inequality applies the Cauchy-Schwarz inequality, the last inequality uses the Schwarz and Loève \(c_r\) inequalities together with the fact \(\varvec{u}^TE(\varvec{x}_i\varvec{x}_i^T)\varvec{u}\le \Vert \varvec{u}\Vert ^2 \lambda _{max}\left[ E\left( \varvec{x}_i\varvec{x}_i^T \right) \right] \), and the last line holds by conditions (C3)(ii) and (C4)(i). \(\square \)

Lemma A.3

Under the conditions (C1), (C3)-(C5), if \(p_n^3/n \rightarrow 0\), we have

$$\begin{aligned} \sqrt{n}A_nV^{-1/2}Q(\varvec{\beta }_0){\mathop {\rightarrow }\limits ^{d}} N(0,A_0). \end{aligned}$$

Proof

Write \(\sqrt{n}A_nV^{-1/2}Q(\varvec{\beta }_0)=\sum _{i=1}^{N}\frac{\sqrt{n}\eta _i}{N\pi _i}A_n \times V^{-1/2}\phi _{\tau }(\varepsilon _i)\varvec{x}_i=: \sum _{i=1}^{N}\varvec{\xi }_{i}\). Now, we check the condition of the Lindeberg-Feller central limit theorem (Proposition 2.27 in Van der Vaart (1998)). For any \(\epsilon >0\),

$$\begin{aligned}&\sum _{i=1}^{N}E(\Vert \varvec{\xi }_{i}\Vert ^2I(\Vert \varvec{\xi }_{i} \Vert \ge \epsilon )) \nonumber \\&\quad \le \frac{1}{\epsilon }\sum _{i=1}^{N} E\left( \Vert \varvec{\xi }_{i}\Vert ^3 \right) = \frac{1}{\epsilon }\sum _{i=1}^{N} E\left[ E\left( \Vert \varvec{\xi }_{i}\Vert ^3|\mathcal {F}_N\right) \right] \nonumber \\&\quad \le \frac{n^{3/2} |||A_n V^{-1/2}|||^3 }{N\epsilon }\sum _{i=1}^{N} E\left[ \frac{1}{N^2\pi _i^2}|\phi _{\tau }^3(\varepsilon _i)|\Vert \varvec{x}_i\Vert ^3 \right] \nonumber \\&\quad \le \frac{Cn^{3/2} }{N\epsilon }\sum _{i=1}^{N} E\left[ \frac{1}{N^2\pi _i^2}|\varepsilon _i|^3\Vert \varvec{x}_i\Vert ^3 \right] \nonumber \\&\quad = \frac{Cn^{3/2} }{N\epsilon }\sum _{i=1}^{N} E\left[ \frac{1}{N^2\pi _i^2}\Vert \varvec{x}_i\Vert ^3E\left( |\varepsilon _i^3|\Big |\varvec{x}_i\right) \right] \nonumber \\&\quad \le \frac{Cn^{3/2} }{N\epsilon }\sum _{i=1}^{N} E\left[ \frac{1}{N^2\pi _i^2}\Vert \varvec{x}_i\Vert ^3 \right] \nonumber \\&\quad \le \frac{Cn^{3/2} }{N\epsilon }\sum _{i=1}^{N} \left[ E\left( \frac{1}{N^4\pi _i^4}\right) \right] ^{\frac{1}{2}}\left[ E\left( \Vert \varvec{x}_i\Vert ^6\right) \right] ^{\frac{1}{2}} \nonumber \\&\quad \le \frac{Cn^{3/2} }{\epsilon } \left[ \max \limits _{ i }E\left( \frac{1}{N^4\pi _i^4}\right) \right] ^{\frac{1}{2}}\left[ p_n^3\max _m E\left( {x}_{im}^6\right) \right] ^{\frac{1}{2}} \nonumber \\&\quad \le O_P\left( \frac{p_n^{3/2}}{\sqrt{n}}\right) \nonumber \\&\quad =o_P(1), \end{aligned}$$
(A.1)

where the fourth line applies the fact \(|\phi _{\tau }(\varepsilon _i)|\le |\varepsilon _i|\) and the conclusion \(|||A_n V^{-1/2}|||=O(1)\), the sixth line is due to condition (C3)(iii), the seventh line holds by the Cauchy-Schwarz inequality, the eighth line uses Loève's \(c_r\) inequality, the last inequality invokes the conditions (C5) and (C3)(ii), and the last equation uses the condition \(p_n^3/n \rightarrow 0\). Next, we show \( |||A_nV^{-1/2} |||=O(1)\). For any \(A_n\), by condition (C4)(ii), we have

$$\begin{aligned} |||A_nV^{-1/2}|||&=\left[ \lambda _{\max }\left( V^{-1/2}A_n^TA_n V^{-1/2}\right) \right] ^{1/2} \nonumber \\&\le \left[ tr(V^{-1/2}A_n^TA_n V^{-1/2})\right] ^{1/2} \nonumber \\&= \left[ tr(V^{-1}A_n^TA_n )\right] ^{1/2} \nonumber \\&\le \left[ \lambda _{\max }(V^{-1})tr(A_n^TA_n) \right] ^{1/2} \nonumber \\&=\left[ \lambda _{\max }(V^{-1}) \right] ^{1/2} \left[ tr(A_nA_n^T) \right] ^{1/2} \nonumber \\&=O(1), \end{aligned}$$
(A.2)

where the second inequality applies the conclusion \(tr(UW)\le \lambda _{\max }(U)tr(W)\) for any symmetric matrix U and positive semidefinite matrix W (Bernstein 2005). Thus the condition of Lindeberg-Feller central limit theorem is satisfied. Note that

$$\begin{aligned}&\sum _{i=1}^{N}E\left( \varvec{\xi }_i\right) \begin{aligned}{}[t]&=A_nV^{-1/2}\sum _{i=1}^{N}E\left[ E\left( \varvec{\xi }_i|\mathcal {F}_N\right) \right] \\&=\sqrt{n}A_nV^{-1/2}E\left[ \phi _{\tau }(\varepsilon _i)\varvec{x}_i\right] =0, \end{aligned}\\&\sum _{i=1}^{N}Var(\varvec{\xi }_{i}) =\sum _{i=1}^{N}E\left( \varvec{\xi }_{i}\varvec{\xi }_{i}^T\right) =\sum _{i=1}^{N}E \left[ E\left( \varvec{\xi }_{i}\varvec{\xi }_{i}^T \Big |\mathcal {F}_N\right) \right] \nonumber \\&= A_nV^{-1/2} \cdot E\left[ \frac{1}{N}\sum _{i=1}^{N}\frac{n}{N\pi _i}\phi _{\tau }^2(\varepsilon _i)\varvec{x}_i\varvec{x}_i^T \right] \cdot V^{-1/2}A_n^T \\&\rightarrow A_nA_n^T \rightarrow A_0, \end{aligned}$$

where the last line invokes condition (C4)(ii). Then, the desired result holds by Lindeberg–Feller central limit theorem. \(\square \)

Lemma A.4

Let \(Q_1(\Delta )=\sum _{i=1}^{N}\frac{\sqrt{n}\eta _i}{N\pi _i}\psi _{\tau }(\varepsilon _i)A_n \times V^{-1/2}\varvec{x}_i\varvec{x}_i^T\Delta \), where \(\Delta =\varvec{\beta }-\varvec{\beta }_0\). Under the conditions (C3) and (C5), if \(p_n^5/n \rightarrow 0\), we have

$$\begin{aligned} \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \left\| Q_1(\Delta )-\sqrt{n}A_nV^{-1/2}D\Delta \right\| =o_{P}(1). \end{aligned}$$

Proof

Note that \(Q_1(\Delta )=\sqrt{n}A_nV^{-1/2}\mathcal {H}({\varvec{\beta }} _0)\Delta \), where \(\mathcal {H}({\varvec{\beta }} _0)=\frac{1}{N}\sum _{i=1}^{N}\frac{\eta _i}{\pi _i}\psi _{\tau }(\varepsilon _i)\varvec{x}_i\varvec{x}_i^T\), then

$$\begin{aligned}&\left\| Q_1(\Delta )-\sqrt{n}A_nV^{-1/2}D\Delta \right\| \nonumber \\&\quad \le \sqrt{n} |||A_nV^{-1/2}||||||\mathcal {H}({\varvec{\beta }} _0)-D|||\Vert \Delta \Vert . \end{aligned}$$
(A.3)

Denote \({\mathcal {D}^*} = \mathcal {H}({\varvec{\beta }} _0)-D=(\mathcal {D}_{lm}^*), 1\le l,m \le p_n\). Using the fact that \(|||U |||\le q {\mathop {\max }\nolimits _{1 \le i,j \le q} |U_{ij}|}\) for a \(q \times q\) matrix U, we have, \(\forall \epsilon >0\),

$$\begin{aligned}&P\left\{ |||\mathcal {H}({\varvec{\beta }} _0)-D |||> \epsilon \right\} \nonumber \\&\quad \le P\left\{ \mathop {\max }\limits _{l,m} \left| {\mathcal {D}_{lm}^*} \right|> \frac{\epsilon }{p_n} \right\} \nonumber \\&\quad \le \sum \limits _{l = 1}^{p_n} \sum \limits _{m = 1}^{p_n} P\left\{ \left| \mathcal {D}_{lm}^* \right|> \frac{\epsilon }{p_n} \right\} \nonumber \\&\quad \le p_n^2\mathop {\max }\limits _{1\le l,m \le p_n} P\left\{ \left| \mathcal {D}_{lm}^* \right| > \frac{\epsilon }{p_n} \right\} \nonumber \\&\quad \le \frac{p_n^4}{\epsilon ^2} \mathop {\max } \limits _{1\le l,m \le p_n} E\left[ \left| \mathcal {D}_{lm}^* \right| ^2 \right] \nonumber \\&\quad = \frac{p_n^4}{\epsilon ^2N^2}\mathop {\max }\limits _{1\le l,m \le p_n} \sum \limits _{i = 1}^N E \Big \{ \frac{\eta _i}{\pi _i} \psi _{\tau }(\varepsilon _i){x}_{il}{x}_{im} \nonumber \\&\qquad -E\left[ \psi _{\tau }(\varepsilon _i){x}_{il}{x}_{im}\right] \Big \} ^2 \nonumber \\&\quad \le \frac{p_n^4}{\epsilon ^2N^2}\mathop {\max }\limits _{1\le l,m \le p_n} \sum \limits _{i = 1}^N E\left\{ \frac{\eta _i}{\pi _i} \psi _{\tau }(\varepsilon _i){x}_{il}{x}_{im}\right\} ^2 \nonumber \\&\quad = \frac{p_n^4}{\epsilon ^2N}\mathop {\max }\limits _{1\le l,m \le p_n} \sum \limits _{i = 1}^N E\left[ \frac{1}{N\pi _i} \psi _{\tau }^2(\varepsilon _i){x}_{il}^2{x}_{im}^2\right] \nonumber \\&\quad \le \frac{p_n^4}{\epsilon ^2N}\mathop {\max }\limits _{1\le l,m \le p_n} \sum \limits _{i = 1}^N E\left[ \frac{1}{N\pi _i} {x}_{il}^2{x}_{im}^2\right] \nonumber \\&\quad \le \frac{p_n^4}{\epsilon ^2N}\mathop {\max }\limits _{1\le l,m \le p_n} \sum \limits _{i = 1}^N \left[ E\left( \frac{1}{N^2\pi _i^2}\right) \right] ^{\frac{1}{2}}\nonumber \\&\qquad \times \left[ E\left( x_{il}^8\right) \right] ^{\frac{1}{4}} \left[ E\left( x_{im}^8\right) \right] ^{\frac{1}{4}} \nonumber \\&\quad \le \frac{Cp_n^4}{\epsilon ^{2}n} , \end{aligned}$$
(A.4)

where the second inequality applies Boole's inequality, the fourth inequality uses Markov's inequality, the first equation is due to the fact \(E(\mathcal {D}_{lm}^{*})=0\), the sixth inequality holds by the fact \(\psi _{\tau }(\varepsilon _i) \le 1\), the seventh inequality uses Hölder's inequality, and the last line invokes conditions (C3)(ii) and (C5). (A.4) implies

$$\begin{aligned} |||\mathcal {H}({\varvec{\beta }} _0)-D |||= {O_P}(p_n^2/{\sqrt{n}}). \end{aligned}$$
(A.5)

By (A.2), (A.3), (A.5) and the condition \(p_n^5/n \rightarrow 0\), we have

$$\begin{aligned}&\sup _{\Vert \varvec{\beta }-\varvec{\beta }_0\Vert \le C\sqrt{p_n/n}} \left\| Q_1(\Delta )-\sqrt{n}A_nV^{-1/2}D\Delta \right\| \\&\quad \le \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \sqrt{n} |||A_nV^{-1/2}||||||\mathcal {H}({\varvec{\beta }} _0)-D|||\\&\qquad \times \Vert \Delta \Vert \\&\quad \le O_P(p_n^{5/2}/\sqrt{n})=o_P(1). \end{aligned}$$

\(\square \)

Lemma A.5

Let \(g_i(s)=(\varepsilon _i-s)[I(\varepsilon _i<0) -I(\varepsilon _i <s)]\), \(Q_2(\Delta ) =\sum _{i=1}^{N}\frac{\sqrt{n}\eta _i}{N\pi _i}g_i\left( \varvec{x}_i^T\Delta \right) A_n V^{-1/2}\varvec{x}_i\), where \(\Delta =\varvec{\beta }-\varvec{\beta }_0\). Under the conditions (C2)-(C6), if \(p_n^4/n \rightarrow 0\), we have

$$\begin{aligned} \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \left\| Q_2(\Delta )\right\| =o_{P}(1). \end{aligned}$$

Proof

Let \(\varvec{b}_i=A_n V^{-1/2}\varvec{x}_i=\varvec{b}_i^{+}-\varvec{b}_i^{-}\), where \({b}_{ij}^{+}=\max \{ {b}_{ij},0\}\) and \({b}_{ij}^{-}=\max \{ -{b}_{ij},0\}\), and \({b}_{ij}^{+}, {b}_{ij}^{-}\), and \({b}_{ij}\) denote the jth components of \(\varvec{b}_{i}^{+}, \varvec{b}_{i}^{-}\), and \(\varvec{b}_{i}\), respectively (\(j=1,\ldots ,q\)). Note that,

$$\begin{aligned}&\sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \left\| Q_2(\Delta )\right\| \\&\quad \le \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}}\left\| Q_2(\Delta )-E\left[ Q_2(\Delta )\right] \right\| \\&\qquad + \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}}\left\| E\left[ Q_2(\Delta )\right] \right\| \\&\quad =:J_1 +J_2. \end{aligned}$$

Firstly, we show \(J_1=o_{P}(1)\). By the triangle inequality,

$$\begin{aligned} J_1&=\sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}}\left\| Q_2(\Delta )-E\left[ Q_2(\Delta )\right] \right\| \\&\le \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}}\left\| Q_2^{+}(\Delta )-E\left[ Q_2^{+}(\Delta )\right] \right\| \\&~~~+ \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}}\left\| Q_2^{-}(\Delta )-E\left[ Q_2^{-}(\Delta )\right] \right\| \\&=:J_{11}+J_{12}, \end{aligned}$$

where \(Q_2^{+}(\Delta )=\sum _{i=1}^{N}\frac{\sqrt{n}\eta _i}{N\pi _i}g_i\left( \varvec{x}_i^T\Delta \right) \varvec{b}_i^{+}\) and \(Q_2^{-}(\Delta )=\sum _{i=1}^{N}\frac{\sqrt{n}\eta _i}{N\pi _i}g_i\left( \varvec{x}_i^T\Delta \right) \varvec{b}_i^{-}\). It suffices to prove that \(J_{1l}=o_{P}(1)\) for \( l=1,2\). We only show \(J_{11}=o_{P}(1)\); the second term can be proved by the same token. Let \(\mathbb {D} = \left\{ \Delta \in \mathbb {R}^{p_n} \Big | \Vert \Delta \Vert \le C\sqrt{p_n/n} \right\} \); then, by selecting \(N_0=n^{2p_n}\) grid points \(\{ \Delta _t \}_{t=1}^{N_0}\), \(\mathbb {D}\) can be covered by \(\bigcup _{t=1}^{N_0}\mathbb {D}_t\), where \(\mathbb {D}_t = \left\{ \Delta \in \mathbb {R}^{p_n} \Big | \Vert \Delta - \Delta _t\Vert _{\infty } \le \delta _n \right\} \) with \(\delta _n= Cp_n^{1/2}n^{-5/2}\). Define \(w_{it}(s)=g_i(\varvec{x}_i^T \Delta _t - s\Vert \varvec{x}_i\Vert )\). Since \(g_i(s)\) is monotone, by the triangle inequality, we have

$$\begin{aligned} J_{11}&=\sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}}\left\| Q_2^{+}(\Delta )-E\left[ Q_2^{+}(\Delta )\right] \right\| \\&\le \max _{1\le t \le N_0} \sup _{\Delta \in \mathbb {D}_t } \left\| Q_2^{+}(\Delta )-E\left[ Q_2^{+}(\Delta )\right] \right\| \\&\le \max _{1\le t \le N_0} \left\| Q_2^{+}(\Delta _t)-E\left[ Q_2^{+}(\Delta _t) \right] \right\| \\&\quad + \max _{1\le t \le N_0} \Bigg \Vert \frac{1}{N}\sum _{i=1}^{N}\Bigg \{E\left[ \frac{\sqrt{n}\eta _i}{\pi _i}w_{it}(\delta _n)\varvec{b}_i^{+}\right] \\&\quad -E\left[ \frac{\sqrt{n}\eta _i}{\pi _i}w_{it}(-\delta _n)\varvec{b}_i^{+}\right] \Bigg \} \Bigg \Vert \\&\quad +\max _{1\le t \le N_0} \Bigg \Vert \frac{1}{N} \sum _{i=1}^{N}\Bigg \{\frac{\sqrt{n}\eta _i}{\pi _i}[w_{it}(\delta _n)I(\varepsilon _i\ge 0)\\&\quad +w_{it}(-\delta _n)I(\varepsilon _i< 0) - w_{it}(0) ]\varvec{b}_i^{+}\\&\quad -E\Big [\frac{\sqrt{n}\eta _i}{\pi _i}[w_{it}(\delta _n)I(\varepsilon _i\ge 0)\\&\quad +w_{it}(-\delta _n)I(\varepsilon _i < 0) - w_{it}(0) ]\varvec{b}_i^{+}\Big ]\Bigg \} \Bigg \Vert \\&=: J_{111}+ J_{112}+ J_{113}. \end{aligned}$$

Next, we consider \(J_{111}\). Let \(d_i =\frac{\sqrt{n}\eta _i}{\pi _i} \varvec{x}_i^T\Delta _t \varvec{b}_i^{+}\), \(\zeta _{it}=\frac{\sqrt{n}\eta _i}{\pi _i}g_i\left( \varvec{x}_i^T\Delta _t \right) \varvec{b}_i^{+}\), \(\zeta _{it}^*=\zeta _{it} - E(\zeta _{it})\) and \(e_n= Nn^{-1/2}p_n^{3/2}\). Then

$$\begin{aligned}&Q_2^{+}(\Delta _t)-E\left[ Q_2^{+}(\Delta _t) \right] \\&\quad =\frac{1}{N}\sum _{i=1}^{N}\Bigg \{\frac{\sqrt{n}\eta _i}{\pi _i}g_i\left( \varvec{x}_i^T\Delta _t \right) \varvec{b}_i^{+} \\&\qquad - E\left[ \frac{\sqrt{n}\eta _i}{\pi _i}g_i\left( \varvec{x}_i^T\Delta _t \right) \varvec{b}_i^{+}\right] \Bigg \} \\&\quad =\frac{1}{N}\sum _{i=1}^{N}\left[ \zeta _{it} - E(\zeta _{it}) \right] =\frac{1}{N}\sum _{i=1}^{N}\zeta _{it}^* \\&\quad = \frac{1}{N}\sum _{i=1}^{N}\zeta _{it}^* I(\Vert d_i \Vert \le e_{n}) + \frac{1}{N}\sum _{i=1}^{N}\zeta _{it}^* I(\Vert d_i \Vert> e_{n})\\&\quad = \frac{1}{N}\sum _{i=1}^{N}\left\{ \zeta _{it}^* I(\Vert d_i \Vert \le e_{n}) - E\left[ \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right] \right\} \\&\qquad + \frac{1}{N}\sum _{i=1}^{N}E\left[ \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right] \\&\qquad + \frac{1}{N}\sum _{i=1}^{N}\zeta _{it}^* I(\Vert d_i\Vert > e_{n})\\&\quad =: J_{1t}^{*} + J_{2t}^{*} +J_{3t}^{*}. \end{aligned}$$

It is sufficient to show \(J_{111}=o_{P}(1)\) by demonstrating that \(\max _{1\le t \le N_0}\Vert J_{lt}^{*}\Vert =o_{P}(1)\) for \(l=1,2,3\). First, consider \(J_{1t}^{*}\); note that \( E\Bigg \{\zeta _{it}^* I(\Vert d_i \Vert \le e_{n})-E\left[ \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right] \Bigg \}=\varvec{0} \) and

$$\begin{aligned}&\quad \frac{1}{N}\sum _{i=1}^{N}E\left\| \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})-E\left[ \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right] \right\| ^2 \\&\quad \le \frac{1}{N}\sum _{i=1}^{N}E\left\| \zeta _{it}^{*2} I(\Vert d_i \Vert \le e_{n}) \right\| ^2 \\&\quad \le \frac{1}{N} \sum _{i=1}^{N}\Bigg \{E\left[ \Vert \zeta _{it}\Vert ^{2} I(\Vert d_i \Vert \le e_{n})\right] +2 E(\Vert \zeta _{it}\Vert )\\&\qquad \times E\left\| \zeta _{it} I(\Vert d_i \Vert \le e_{n})\right\| \\&\qquad + \left( E\Vert \zeta _{it}\Vert \right) ^2E\left[ I(\Vert d_i \Vert \le e_{n})\right] \Bigg \}, \end{aligned}$$

where

$$\begin{aligned}&\quad E\left\| \zeta _{it}^{2} I(\Vert d_i \Vert \le e_{n})\right\| \\&\quad = E\left[ \frac{n\eta _i^2}{\pi _i^2}g_i^2\left( \varvec{x}_i^T\Delta _t \right) \Vert \varvec{b}_i^{+}\Vert ^2I(\Vert d_i \Vert \le e_{n})\right] \\&\quad = E\left\{ \frac{n\eta _i^2}{\pi _i^2}\Vert \varvec{b}_i^{+}\Vert ^2I(\Vert d_i \Vert \le e_{n})E\left[ g_i^2(\varvec{x}_i^T \Delta _t) |\varvec{x}_i\right] \right\} \\&\quad = E\Bigg \{\frac{n\eta _i^2}{\pi _i^2}\Vert \varvec{b}_i^{+}\Vert ^2I(\Vert d_i \Vert \le e_{n}) \\&\qquad \times \int _{\varvec{x}_i^T \Delta _t}^{0} (\varepsilon _i-\varvec{x}_i^T \Delta _t)^2 f(\varepsilon _i|\varvec{x})d{\varepsilon _i} \Bigg \} \\&\quad \le {C} E\left[ \frac{n\eta _i^2}{\pi _i^2}|\varvec{x}_i^T \Delta _t|^3 \Vert \varvec{b}_i^{+}\Vert ^2I(\Vert d_i \Vert \le e_{n}) \right] \\&\quad = C E\left[ \Vert d_i\Vert ^{2}I(\Vert d_i \Vert \le e_{n}) |\varvec{x}_i^T \Delta _t| \right] \\&\quad \le C e_n^2 \left[ E(\varvec{x}_i^T \Delta _t)^2\right] ^{1/2} \\&\quad \le Ce_n^2 \left\{ \lambda _{\max }\left[ E\left( \varvec{x}_i \varvec{x}_i^T\right) \right] \Vert \Delta _t\Vert ^2\right\} ^{1/2}\\&\quad = O(e_n^2 \sqrt{p_n/n}), \end{aligned}$$

where the second inequality applies Jensen’s inequality, and the last line holds by condition (C4)(i). Similarly, we have \(E(\Vert \zeta _{it}\Vert )=O(p_n^{3/2}/\sqrt{n})\) and \(E\Vert \zeta _{it}I(\Vert d_i \Vert \le e_{n})\Vert =O(e_n\sqrt{p_n/n})\). Thus

$$\begin{aligned}&\quad \frac{1}{N}\sum _{i=1}^{N}E\left\| \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})-E\left[ \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right] \right\| ^2\\&\quad = O(N^{2}n^{-3/2}p_n^{7/2}). \end{aligned}$$

By the fact \(|g_i(s)|\le |s|\),

$$\begin{aligned}&\left\| \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right\| _{\infty }\\&\quad \le \left\| \zeta _{it} I(\Vert d_i \Vert \le e_{n})\right\| _{\infty }+\left\| E\left( \zeta _{it}\right) I(\Vert d_i \Vert \le e_{n})\right\| _{\infty }\\&\quad =\left\| \frac{\sqrt{n}\eta _i}{\pi _i}g_i\left( \varvec{x}_i^T\Delta _t \right) \varvec{b}_i^{+}I(\Vert d_i \Vert \le e_{n})\right\| _{\infty } \\&\qquad + \left\| E\left[ \frac{\sqrt{n}\eta _i}{\pi _i}g_i\left( \varvec{x}_i^T\Delta _t \right) \varvec{b}_i^{+}\right] I(\Vert d_i \Vert \le e_{n})\right\| _{\infty }\\&\quad \le \left\| \frac{\sqrt{n}\eta _i}{\pi _i}\varvec{x}_i^T\Delta _t \varvec{b}_i^{+}I(\Vert d_i \Vert \le e_{n})\right\| _{\infty } \\&\qquad + E\left\| \left[ \frac{\sqrt{n}\eta _i}{\pi _i}\varvec{x}_i^T\Delta _t \varvec{b}_i^{+}\right] I(\Vert d_i \Vert \le e_{n})\right\| _{\infty }\\&\quad =\left\| d_i I(\Vert d_i \Vert \le e_{n})\right\| _{\infty } + \left\| E(d_i)I(\Vert d_i \Vert \le e_{n})\right\| _{\infty }\\&\quad \le \left\| d_i I(\Vert d_i \Vert \le e_{n})\right\| + \left\| E(d_i)I(\Vert d_i \Vert \le e_{n})\right\| \le 2e_n. \end{aligned}$$

Then

$$\begin{aligned}&\left\| \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})-E\left[ \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right] \right\| _{\infty } \\&\quad \le \left\| \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right\| _{\infty }+\left\| E\left[ \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right] \right\| _{\infty } \\&\quad \le 4Nn^{-1/2}p_n^{3/2}. \end{aligned}$$

Thus, applying Boole's and Bernstein's inequalities (Serfling 1980), for any \(\epsilon >0\), we have

$$\begin{aligned}&P\left\{ \max _{1\le t \le N_0}\Vert J_{1t}^{*}\Vert \ge \epsilon \right\} \\&\quad \le q N_0 \max _{1\le t \le N_0}P\Bigg \{\Bigg \Vert \frac{1}{N}\sum _{i=1}^{N}\{\zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\\&\qquad - E\left[ \zeta _{it}^* I(\Vert d_i \Vert \le e_{n})\right] \} \Bigg \Vert \ge \epsilon \Bigg \} \\&\quad \le 2qN_0 \exp \left( \frac{-N\epsilon ^2}{2CN^{2} n^{-3/2} p_n^{7/2} + 2C \epsilon Nn^{-1/2}p_n^{3/2}/3 } \right) \\&\quad \le 2q\exp (2p_n \log n) \cdot \exp (-3p_n \log n) =o(1), \end{aligned}$$

where the last inequality is due to \(n^{1/2}p_n^{-3/2} \gg n^{3/2}N^{-1}p_n^{-7/2}\) and \(n^{3/2}N^{-1}p_n^{-7/2} \gg p_n\log n \) by condition (C6). This implies

$$\begin{aligned} \max _{1\le t \le N_0}\Vert J_{1t}^{*}\Vert =o_{P}(1). \end{aligned}$$
(A.6)

Note that

$$\begin{aligned}&E\left( \left\| \zeta _{it}\right\| ^2\right) \nonumber \\&\quad = E\left[ \frac{n\eta _i^2}{\pi _i^2}g_i^2\left( \varvec{x}_i^T\Delta _t \right) \Vert \varvec{b}_i^{+}\Vert ^2\right] \nonumber \\&\quad = E\left[ \frac{n}{\pi _i}g_i^2\left( \varvec{x}_i^T\Delta _t \right) \Vert \varvec{b}_i^{+}\Vert ^2\right] \nonumber \\&\quad = E\left\{ \frac{n}{\pi _i}\Vert \varvec{b}_i^{+}\Vert ^2E\left[ g_i^2(\varvec{x}_i^T \Delta _t) |\varvec{x}_i\right] \right\} \nonumber \\&\quad = E\left\{ \frac{n}{\pi _i}\Vert \varvec{b}_i^{+}\Vert ^2 \int _{\varvec{x}_i^T \Delta _t}^{0} (\varepsilon _i-\varvec{x}_i^T \Delta _t)^2 f(\varepsilon _i|\varvec{x})d{\varepsilon _i} \right\} \nonumber \\&\quad \le {C} E\left[ \frac{n}{\pi _i}|\varvec{x}_i^T \Delta _t|^3 \Vert A_nV^{-1/2}\varvec{x}_i\Vert ^2 \right] \nonumber \\&\quad \le C Nn\Vert \Delta _t\Vert ^3|||A_nV^{-1/2}|||^2 E\left[ \frac{1}{N\pi _i} \Vert \varvec{x}_i\Vert ^5 \right] \nonumber \\&\quad \le \frac{CNp_n^{3/2}}{\sqrt{n}} \left[ E\left( \frac{1}{N^2\pi _i^2}\right) \right] ^{\frac{1}{2}} \left[ E\left( \Vert \varvec{x}_i\Vert ^{10}\right) \right] ^{\frac{1}{2}} \nonumber \\&\quad \le \frac{CNp_n^{3/2}}{\sqrt{n}} \left[ \max _i E\left( \frac{1}{N^2\pi _i^2}\right) \right] ^{\frac{1}{2}} \left[ p_n^5 \max _mE\left( {x}_{im}^{10}\right) \right] ^{\frac{1}{2}} \nonumber \\&\quad = O(Nn^{-3/2}p_n^{4}), \end{aligned}$$
(A.7)

where the first inequality invokes the condition (C2), the third inequality is due to (A.2) and the Cauchy-Schwarz inequality, the last inequality holds by Loève's \(c_r\) inequality, and the last line invokes conditions (C5) and (C3)(ii). Similarly, we have

$$\begin{aligned}&E(\Vert d_i\Vert ^2) \nonumber \\&\quad = E\left[ \frac{n\eta _i^2}{\pi _i^2}(\varvec{x}_i^T \Delta _t)^2\Vert A_nV^{-1/2}\varvec{x}_i\Vert ^2\right] \nonumber \\&\quad = E\left[ \frac{n}{\pi _i}(\varvec{x}_i^T \Delta _t)^2\Vert A_nV^{-1/2}\varvec{x}_i\Vert ^2\right] \nonumber \\&\quad \le Nn |||A_nV^{-1/2} |||^2\Vert \Delta _t\Vert ^2 E\left[ \frac{1}{N\pi _i}\Vert \varvec{x}_i\Vert ^4\right] \nonumber \\&\quad \le CNp_n \left[ E\left( \frac{1}{N^2\pi _i^2}\right) \right] ^{\frac{1}{2}}\left[ E\left( \Vert \varvec{x}_i\Vert ^{8}\right) \right] ^{\frac{1}{2}} \nonumber \\&\quad = O(Nn^{-1}p_n^{3}). \end{aligned}$$
(A.8)

Thus, by (A.7) and (A.8)

$$\begin{aligned}&\Vert J_{2t}^{*}\Vert \le \frac{1}{N}\sum _{i=1}^{N}\left\| E\left[ \zeta _{it}^*I(\Vert d_i \Vert \le e_{n}) \right] \right\| \nonumber \\&\quad \le \frac{1}{N}\sum _{i=1}^{N}E\left\| \left[ \zeta _{it}^*I(\Vert d_i \Vert \le e_{n}) \right] \right\| \nonumber \\&\quad \le \frac{1}{N}\sum _{i=1}^{N}\left[ E\left( \Vert \zeta _{it}^*\Vert ^{2}\right) \right] ^{\frac{1}{2}} \left\{ E\left[ I(\Vert d_i\Vert \le e_n)\right] \right\} ^{\frac{1}{2}} \nonumber \\ {}&=\frac{1}{N}\sum _{i=1}^{N}\left[ E\left( \Vert \zeta _{it}^*\Vert ^{2}\right) \right] ^{\frac{1}{2}} \left[ P\left( \Vert d_i\Vert \le e_n \right) \right] ^{\frac{1}{2}} \nonumber \\&\quad \le \frac{1}{N}\sum _{i=1}^{N} \left[ E\left( \Vert \zeta _{it}\Vert ^{2}\right) \right] ^{\frac{1}{2}} \left[ e_n^{-2}E(\Vert d_i\Vert ^2)\right] ^{\frac{1}{2}}\nonumber \\ {}&=O(p_n^{2}/n^{3/4})\nonumber \\&\quad =o(1), \end{aligned}$$
(A.9)

where the second line applies the Cauchy-Schwarz inequality, and the last line holds by the condition \(p_n^4/n \rightarrow 0\). For \(J_{3t}^{*}\), let \(d_i^{*}=n^{1/2}N^{-1/2}p_n^{-3/2}d_i\); then \(E\Vert d_i^{*}\Vert ^2=O(1)\) by (A.8). By the monotonicity of probability, Boole's inequality, and the Lebesgue dominated convergence theorem, we can derive that

$$\begin{aligned}&P\left\{ \max _{1\le t \le N_0}\Vert J_{3t}^{*}\Vert \ge \epsilon \right\} \\&\quad \le P\left\{ \max _{1\le i \le N} \Vert d_i\Vert> e_n \right\} \le \sum _{i=1}^{N} P\left\{ \Vert d_i\Vert> e_n \right\} \nonumber \\&\quad \le \sum _{i=1}^{N} P\left\{ \Vert d_i^*\Vert ^2> nN^{-1}p_n^{-3}e_n^2 \right\} \nonumber \\&\quad = \sum _{i=1}^{N} E\left[ I\left( \Vert d_i^*\Vert ^2>nN^{-1}p_n^{-3}e_n^2\right) \right] \nonumber \\&\quad \le \frac{N^2p_n^3}{ne_n^2} E\left[ \Vert d_i^*\Vert ^{2}I\left( \Vert d_i^*\Vert ^2>nN^{-1}p_n^{-3}e_n^2\right) \right] \\&\quad =o(1), \end{aligned}$$

which results in

$$\begin{aligned} \max _{1\le t \le N_0}\Vert J_{3t}^{*}\Vert =o_{P}(1). \end{aligned}$$
(A.10)

(A.6), (A.9) and (A.10) imply \(J_{111}=o_{P}(1)\).

For \(J_{112}\), we have

$$\begin{aligned}&~~~J_{112}\\ {}&\le \max _{1\le t \le N_0} \frac{1}{N}\sum _{i=1}^{N} E\left\| \frac{\sqrt{n}\eta _i}{\pi _i} \left[ w_{it}(\delta _n)-w_{it}(-\delta _n)\right] \varvec{b}_i^{+} \right\| \\&\le \sqrt{n} \max _{1\le t \le N_0} \frac{1}{N}\sum _{i=1}^{N} E\left\| \left[ w_{it}(\delta _n)-w_{it}(-\delta _n)\right] \varvec{b}_i^{+} \right\| \\&\le \sqrt{n} \max _{1\le t \le N_0} \frac{1}{N}\sum _{i=1}^{N} E\Big \Vert [g_i(\varvec{x}_i^T \Delta _t - \delta _n\Vert \varvec{x}_i\Vert )\\&~~~-g_i(\varvec{x}_i^T \Delta _t + \delta _n\Vert \varvec{x}_i\Vert ) ]\varvec{b}_i^{+}\Big \Vert \\&\le 2\sqrt{n} \delta _n E \left( \Vert \varvec{x}_i\Vert \Vert \varvec{b}_i^{+}\Vert \right) \\&\le 2\sqrt{n} \delta _n |||A_n V^{-1/2}|||E \left[ \Vert \varvec{x}_i\Vert ^2\right] \\&=O(p_n^{3/2}n^{-2}) \\&=o(1), \end{aligned}$$

where the first equation holds by (A.2), condition (C3)(ii) and Loève’s \(c_r\) inequality.

By similar arguments, we can obtain \(J_{113}=o_{P}(1)\). Thus \(J_{1}=o_{P}(1)\).

Finally, we show \(J_{2}=o_{P}(1)\). Note that

$$\begin{aligned} J_2&= \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}}\left\| E\left[ Q_2(\Delta )\right] \right\| \\&= \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \left\| \frac{\sqrt{n}}{N} \sum _{i=1}^{N} E\left[ \frac{\eta _i}{\pi _i}g_i\left( \varvec{x}_i^T\Delta \right) \varvec{b}_i\right] \right\| \\&= \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \left\| \frac{\sqrt{n}}{N} \sum _{i=1}^{N} E\left[ g_i\left( \varvec{x}_i^T\Delta \right) \varvec{b}_i\right] \right\| \\&= \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \left\| \frac{\sqrt{n}}{N} \sum _{i=1}^{N} E\left\{ \varvec{b}_iE\left[ g_i(\varvec{x}_i^T \Delta _t) |\varvec{x}_i\right] \right\} \right\| \\&= \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \Bigg \Vert \frac{\sqrt{n}}{N} \sum _{i=1}^{N} E\Bigg \{\varvec{b}_i \int _{\varvec{x}_i^T \Delta _t}^{0} (\varepsilon _i-\varvec{x}_i^T \Delta _t)^2\\&\quad \times f(\varepsilon _i|\varvec{x})d{\varepsilon _i} \Bigg \} \Bigg \Vert \nonumber \\&\le C\sqrt{n} \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \left\| E\left[ (\varvec{x}_i^T \Delta )^2 A_n V^{-1/2} \varvec{x}_i\right] \right\| \\&\le C\sqrt{n} \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} E\left\| (\varvec{x}_i^T \Delta )^2 A_n V^{-1/2} \varvec{x}_i \right\| \\&\le C\sqrt{n} |||A_n V^{-1/2}|||\sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \Vert \Delta \Vert \\&\quad \times E\left[ |\varvec{x}_i^T \Delta |\Vert \varvec{x}_i\Vert ^2 \right] \\&\le C\sqrt{p_n} \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \left[ E(\varvec{x}_i^T \Delta )^2\right] ^{\frac{1}{2}} \left[ E\left( \Vert \varvec{x}_i\Vert ^4\right) \right] ^{\frac{1}{2}} \\&\le C\sqrt{p_n} \sup _{\Vert \Delta \Vert \le C\sqrt{p_n/n}} \left\{ \lambda _{\max }\left[ E\left( \varvec{x}_i \varvec{x}_i^T\right) \right] \Vert \Delta \Vert ^2\right\} ^{\frac{1}{2}}\\&\quad \times \left[ p_n^2 \max \limits _m E(x_{im}^{4})\right] ^{\frac{1}{2}} \\&= O(p_n^{2}/\sqrt{n})\\&=o(1), \end{aligned}$$

where the last line invokes condition \(p_n^4/n \rightarrow 0\).

This completes the proof of Lemma A.5. \(\square \)
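The elementary bound \(|g_i(s)|\le |s|\) used in the proof above follows since \(g_i(s)=(\varepsilon _i-s)[I(\varepsilon _i<0) -I(\varepsilon _i <s)]\) vanishes unless \(\varepsilon _i\) lies between 0 and s, in which case \(|\varepsilon _i-s|\le |s|\). A quick numerical spot-check:

```python
import numpy as np

# Spot-check of |g(s)| <= |s| for g(s) = (e - s)[I(e < 0) - I(e < s)]:
# g is nonzero only when e lies between 0 and s, where |e - s| <= |s|.
rng = np.random.default_rng(3)
e = rng.normal(size=200_000)   # plays the role of epsilon_i
s = rng.normal(size=200_000)
g = (e - s) * ((e < 0).astype(float) - (e < s))
assert np.all(np.abs(g) <= np.abs(s))
```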

Proof of Theorem 1

Let \(\mathscr {B}=\{\varvec{\beta }: \varvec{\beta }=\varvec{\beta }_0+\varvec{u} a_n, \varvec{u} \in \mathbb {R}^{p_n}, \Vert \varvec{u}\Vert \le c\}\) for \(a_n=\sqrt{{p_n/}{n}}\) and some constant c. By Fan and Li (2001), it suffices to show that for any \(\epsilon >0\), there exists a sufficiently large constant c such that

$$\begin{aligned} P\left\{ \inf _{\Vert \varvec{u}\Vert =c} {\mathcal {L}}\left( \varvec{\beta }_0+{\varvec{u} }a_n\right)>{\mathcal {L}} \left( \varvec{\beta }_0\right) \right\} >1-\epsilon \end{aligned}$$
(A.11)

for large enough n. This implies that there is a local minimizer \(\widetilde{\varvec{\beta }}_{\mathcal {S}}\) of \({\mathcal {L}}(\varvec{\beta })\) in \(\mathscr {B}\) that satisfies \(\Vert \widetilde{\varvec{\beta }}_{\mathcal {S}}- {\varvec{\beta }_0}\Vert =O_{p}(a_n)\). This local minimizer is also the global minimizer by the convexity of \({\mathcal {L}}(\varvec{\beta })\). By Lemma A.1 (i), we have

$$\begin{aligned}&{\mathcal {L}}\left( {\varvec{\beta }}_{0}+{\varvec{u}}a_n\right) - {\mathcal {L}}\left( \varvec{\beta }_0\right) \nonumber \\&\quad =\frac{1}{N}\sum _{i=1}^{N}\frac{\eta _i}{\pi _i}\left[ \ell _{\tau }(\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})-\ell _{\tau }(\varepsilon _i)\right] \nonumber \\&\quad =\frac{1}{N}\sum _{i=1}^{N}\frac{\eta _i}{\pi _i}\Big \{-2a_n\phi _{\tau }(\varepsilon _i)\varvec{x_i}^T\varvec{u} + a_n^2\psi _{\tau }(\varepsilon _i)(\varvec{x_i}^T\varvec{u})^2 \nonumber \\&\qquad + (2\tau -1)(\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^2\Gamma (\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u}) \Big \} \nonumber \\&=\frac{-2a_n}{N}\sum _{i=1}^{N}\frac{\eta _i}{\pi _i}\phi _{\tau }(\varepsilon _i)\varvec{x_i}^T\varvec{u} \nonumber \\&\qquad + \frac{a_n^2}{N}\sum _{i=1}^{N}\frac{\eta _i}{\pi _i}\psi _{\tau }(\varepsilon _i)\varvec{u}^T\varvec{x_i}\varvec{x_i}^T\varvec{u} \nonumber \\&\qquad + \sum _{i=1}^{N}\frac{\eta _i}{N\pi _i}(2\tau -1)(\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^2\Gamma (\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u}) \nonumber \\&\quad =:I_1+I_2+I_3. \end{aligned}$$
(A.12)

Firstly, we consider the term \(I_1\). Note that

$$\begin{aligned} E(I_1^2)&=\frac{4a_n^2}{N^2}E\left[ \sum _{i=1}^{N}\frac{\eta _i}{\pi _i}\phi _{\tau }(\varepsilon _i)\varvec{x_i}^T\varvec{u}\right] ^2 \nonumber \\&=\frac{4a_n^2}{N^2}\sum _{i=1}^{N}E\left[ \frac{\eta _i^2}{\pi _i^2}\phi _{\tau }^2(\varepsilon _i)(\varvec{x_i}^T\varvec{u})^2\right] \nonumber \\&=\frac{4a_n^2}{N^2}\sum _{i=1}^{N} E\left\{ E\left[ \frac{\eta _i^2}{\pi _i^2}\phi _{\tau }^2(\varepsilon _i)(\varvec{x_i}^T\varvec{u})^2\Bigg |\mathcal {F}_N\right] \right\} \nonumber \\&=\frac{4a_n^2}{N}\sum _{i=1}^{N} E\left[ \frac{1}{N\pi _i}\phi _{\tau }^2(\varepsilon _i)(\varvec{x_i}^T\varvec{u})^2 \right] \nonumber \\&\le \frac{4a_n^2}{N}\sum _{i=1}^{N} E\left[ \frac{1}{N\pi _i}\varepsilon _i^2(\varvec{x_i}^T\varvec{u})^2 \right] \nonumber \\&= \frac{4a_n^2}{N}\sum _{i=1}^{N} E\left[ \frac{1}{N\pi _i}(\varvec{x_i}^T\varvec{u})^2 E\left( \varepsilon _i^2|\varvec{x}_i\right) \right] \nonumber \\&\le \frac{4Ca_n^2\Vert \varvec{u}\Vert ^2}{N}\sum _{i=1}^{N} E\left[ \frac{1}{N\pi _i}\Vert \varvec{x_i}\Vert ^2 \right] \nonumber \\&\le \frac{4Ca_n^2\Vert \varvec{u}\Vert ^2}{N}\sum _{i=1}^{N} \left[ E\left( \frac{1}{N^2\pi _i^2}\right) \right] ^{\frac{1}{2}}\left[ E\left( \Vert \varvec{x_i}\Vert ^4\right) \right] ^{\frac{1}{2}} \nonumber \\&\le 4Ca_n^2 \Vert \varvec{u}\Vert ^2\left[ \max \limits _{1\le i \le N}E\left( \frac{1}{N^2\pi _i^2}\right) \right] ^{\frac{1}{2}} \nonumber \\&\quad \times \left[ p_n^2 \max \limits _m E({x}_{im}^4) \right] ^{\frac{1}{2}} \nonumber \\&=O(a_n^4\Vert \varvec{u}\Vert ^2), \end{aligned}$$
(A.13)

where the second equality is due to the fact that \(E\left[ \frac{\eta _i}{\pi _i}\phi _{\tau }(\varepsilon _i)\varvec{x_i}^T\varvec{u}\right] =0\), the first inequality uses the fact that \(|\phi _{\tau }(\varepsilon _i)|\le |\varepsilon _i|\), the fourth line applies condition (C3)(iii) and the Cauchy–Schwarz inequality, the last inequality holds by Loève’s \(c_r\) inequality, and the last line invokes conditions (C5) and (C3)(ii). Then, by Markov’s inequality and (A.13), we have

$$\begin{aligned} |I_1|=O_P(a_n^2\Vert \varvec{u}\Vert ). \end{aligned}$$
(A.14)

The second term of (A.12), \(I_2\), can be decomposed as follows

$$\begin{aligned} I_2=\frac{a_n^2}{2} \varvec{u}^T D \varvec{u} +\frac{a_n^2}{2} \varvec{u}^T [\mathcal {H}({\varvec{\beta }} _0)-D]\varvec{u}, \end{aligned}$$
(A.15)

where \(\mathcal {H}({\varvec{\beta }} _0)=\frac{1}{N}\sum _{i=1}^{N}\frac{\eta _i}{\pi _i}\psi _{\tau }(\varepsilon _i)\varvec{x}_i\varvec{x}_i^T\). Combining (A.5), (A.15), conditions (C4)(i) and \(p_n^4/n \rightarrow 0\), we have

$$\begin{aligned} I_{2} \ge \frac{c_1}{2}a_n^2\Vert \varvec{u}\Vert ^2 + o_P(1)a_n^2 \Vert \varvec{u}\Vert ^2. \end{aligned}$$
(A.16)

For \(I_3\), by Lemma A.2,

$$\begin{aligned} E(I_3)&=\frac{1}{N}\sum _{i=1}^{N}E\left[ \frac{\eta _i}{\pi _i}(2\tau -1)(\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^2\Gamma (\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\right] \nonumber \\&=\frac{2\tau -1}{N}\sum _{i=1}^{N}E\left[ (\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^2\Gamma (\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\right] \nonumber \\&=O\left( a_n^3p_n \Vert \varvec{u}\Vert ^3\right) , \end{aligned}$$
(A.17)

and

$$\begin{aligned} E(I_3^2)&=CE\left[ \sum _{i=1}^{N}\frac{\eta _i}{N\pi _i}(\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^2\Gamma (\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\right] ^2 \nonumber \\&=\frac{C}{N^2}\sum _{i=1}^{N}E\left[ \frac{\eta _i^2}{\pi _i^2}(\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^4\Gamma ^2(\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\right] \nonumber \\&\quad + \frac{C}{N^2}\sum _{i=1}^{N}\sum _{j \ne i }E\Bigg [\frac{\eta _i}{\pi _i}(\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^2 \nonumber \\&\quad \times \Gamma (\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\Bigg ]\nonumber \\&\quad \times E\left[ \frac{\eta _j}{\pi _j}(\varepsilon _j-a_n\varvec{x_j}^T\varvec{u})^2\Gamma (\varepsilon _j,-a_n\varvec{x_j}^T\varvec{u})\right] \nonumber \\&\le \frac{C}{N^2}\sum _{i=1}^{N}E\left[ \frac{\eta _i^2}{\pi _i^2}(\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^4\Gamma ^2(\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\right] \nonumber \\&\quad + \left[ E(I_3)\right] ^2 \nonumber \\&=\frac{C}{N}\sum _{i=1}^{N}E\left[ \frac{1}{N\pi _i}(\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^4\Gamma ^2(\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\right] \nonumber \\&\quad + \left[ E(I_3)\right] ^2 \nonumber \\&\le \frac{C}{N}\sum _{i=1}^{N}\left[ E\left( \frac{1}{N^2\pi _i^2}\right) \right] ^{\frac{1}{2}} \nonumber \\&\quad \times \left\{ E\left[ (\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^8\Gamma ^4(\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\right] \right\} ^{\frac{1}{2}} \nonumber \\&\quad + \left[ E(I_3)\right] ^2 \nonumber \\&\le C\left[ \max \limits _{1\le i \le N}E\left( \frac{1}{N^2\pi _i^2}\right) \right] ^{\frac{1}{2}}\nonumber \\&\quad \times \left\{ E\left[ (\varepsilon _i-a_n\varvec{x_i}^T\varvec{u})^8\Gamma ^4(\varepsilon _i,-a_n\varvec{x_i}^T\varvec{u})\right] \right\} ^{\frac{1}{2}} \nonumber \\&\quad + \left[ E(I_3)\right] ^2 \nonumber \\&=O\left( n^{-1}a_n^{9/2}p_n^{2}\Vert \varvec{u}\Vert ^{9/2}\right) + O\left( a_n^6p_n^{2} \Vert \varvec{u}\Vert ^6\right) \nonumber \\&=O\left( a_n^6p_n^{2} \Vert \varvec{u}\Vert ^6\right) , \end{aligned}$$
(A.18)

where \(C=(2\tau -1)^2\). Thus

$$\begin{aligned} I_{3}=O_P(a_n^3p_n \Vert \varvec{u}\Vert ^3)=o_P(1)a_n^2 \Vert \varvec{u}\Vert ^2 \end{aligned}$$
(A.19)

by Chebyshev’s inequality and the condition \(p_n^4/n\rightarrow 0\). Combining (A.14), (A.16) and (A.19), the second term \(I_2\), which is positive, dominates the other two terms for a sufficiently large c with probability approaching one. Thus, (A.11) holds. \(\square \)

Proof of Theorem 2

Let \(\Delta =\varvec{\beta }- \varvec{\beta }_0 \) and \(\widetilde{\Delta }=\widetilde{\varvec{\beta }}_{\mathcal {S}}- \varvec{\beta }_0 \). By Lemma A.1 (ii),

$$\begin{aligned}&\sqrt{n}A_n V^{-1/2} \left[ Q(\varvec{\beta })-Q(\varvec{\beta }_0)\right] \\&\quad =\sum _{i=1}^{N}\frac{\sqrt{n}\eta _i}{N \pi _i} \left[ \phi _{\tau } \left( \varepsilon _i-\varvec{x}_i^T\Delta \right) - \phi _{\tau }(\varepsilon _i) \right] A_n V^{-1/2}\varvec{x}_i \\&\quad =\sum _{i=1}^{N}\frac{\sqrt{n}\eta _i}{N \pi _i} \{-\varvec{x}_i^T\Delta \psi _{\tau }(\varepsilon _i) + (2\tau - 1) \left( \varepsilon _i-\varvec{x}_i^T\Delta \right) \\&\qquad \times \Gamma (\varepsilon _i,-\varvec{x}_i^T\Delta ) \} A_n V^{-1/2}\varvec{x}_i \\&\quad =-\sum _{i=1}^{N}\frac{\sqrt{n}\eta _i}{N\pi _i}\psi _{\tau }(\varepsilon _i)A_nV^{-1/2}\varvec{x}_i\varvec{x}_i^T\Delta \\&\qquad +(2\tau - 1) \sum _{i=1}^{N}\frac{\sqrt{n}\eta _i}{N\pi _i}\left( \varepsilon _i-\varvec{x}_i^T\Delta \right) \Gamma (\varepsilon _i,-\varvec{x}_i^T\Delta )\\&\qquad \times A_n V^{-1/2}\varvec{x}_i \\&\quad =-Q_1(\Delta ) + (2\tau - 1) Q_2(\Delta ). \end{aligned}$$

Then, we have \(\sqrt{n}A_n V^{-1/2} \left[ Q(\widetilde{\varvec{\beta }}_{\mathcal {S}})-Q(\varvec{\beta }_0) \right] =-Q_1(\widetilde{\Delta }) + (2\tau - 1) Q_2(\widetilde{\Delta })\). By Lemmas A.4 and A.5 and the fact that \(Q(\widetilde{\varvec{\beta }}_{\mathcal {S}})=0\),

$$\begin{aligned}&\sqrt{n}A_n V^{-1/2} Q(\varvec{\beta }_0)\\&\quad =Q_1(\widetilde{\Delta }) + (1-2\tau ) Q_2(\widetilde{\Delta })\\&\quad =Q_1(\widetilde{\Delta })-\sqrt{n}A_n V^{-1/2}D\widetilde{\Delta } + (1-2\tau ) Q_2(\widetilde{\Delta }) \\&\qquad + \sqrt{n}A_n V^{-1/2}D(\widetilde{\varvec{\beta }}_{\mathcal {S}}-\varvec{\beta }_0) \\&\quad =\sqrt{n}A_n V^{-1/2}D(\widetilde{\varvec{\beta }}_{\mathcal {S}}-\varvec{\beta }_0) +o_{P}(1). \end{aligned}$$

Then the desired result holds by Lemma A.3 and Slutsky’s theorem. \(\square \)

Table 1 Mean squared error (MSE) for the ALS-LP estimator versus different H values with \(p_n=60\) and \(\widehat{\varvec{\beta }}\) replaced by \(\widehat{\varvec{\beta }}_{\mathcal {P}}\) in Experiment 1, where \(H=E\) denotes ALS-LP with the exact H value and \(H=\infty \) denotes ALS-LP with \(H=\infty \)

Table 2 Empirical coverage probabilities and average lengths (in parentheses) of 95% confidence intervals for the second component \((\Lambda _n{\beta }_{t})_2\) with \(p_n=20\) and \(\Lambda _n=I_{p_n\times p_n}\) in Experiment 1

Table 3 Variable description for the Beijing multi-site air-quality dataset

Fig. 1 Mean squared error (MSE) versus shrinkage parameter \(\rho \) with \(p_n=60\), \(n=1000\) and \(\Lambda _n=(1,\ldots ,1)_{1\times p_n}\) (Case 2) in Experiment 1

Fig. 2 Mean squared error (MSE) versus subsample size n with \(p_n=60\) and \(\Lambda _n=(1,\ldots ,1)_{1 \times p_n}\) (Case 2) in Experiment 1

Fig. 3 Mean squared error (MSE) versus subsample size n with \({\varepsilon } \sim \mathcal {N}(0,1)\) and \(\tau =0.1\) in Experiment 1

Fig. 4 Mean squared error (MSE) versus subsample size n with \({\varepsilon } \sim \mathcal {N}(0,1)\) and \(\tau =0.5\) in Experiment 1

Fig. 5 Mean squared error (MSE) versus subsample size n with \({\varepsilon } \sim \mathcal {N}(0,1)\) and \(\tau =0.9\) in Experiment 1

Fig. 6 Mean squared error (MSE) versus subsample size n with \(p_n=60\) and \(\Lambda _n=I_{p_n \times p_n}\) in Experiment 2, where WALS-AP uses the average of 100 pilot estimators as the initial value and ALS-AP uses one pilot estimator as the initial value

Fig. 7 Average calculation time (ACT, in seconds) versus subsample size n with \(p_n=60\) and \(\Lambda _n=I_{p_n \times p_n}\) in Experiment 2, where WALS-AP uses the average of 100 pilot estimators as the initial value and ALS-AP uses one pilot estimator as the initial value

Fig. 8 Data exploration for the superconductivity data and the Beijing multi-site air-quality data, where the curves in the histograms are the corresponding density curves

Fig. 9 Average prediction error (APE) of one run based on 1000 random partitions for the superconductivity dataset

Fig. 10 Average prediction error (APE) of one run based on 1000 random partitions for the air-quality dataset

Proof of Theorem 5

Without loss of generality, we assume \(h_i>0\) for \(i=1,\ldots ,N\), set \(h_{N+1}=+\infty \), and order \(h_1 \le h_2 \le \dots \le h_N\). According to the L-optimality criterion, minimizing the empirical AMSE of \(\Lambda _n\widetilde{\varvec{\beta }}\) is equivalent to minimizing \(tr(\Lambda _nD_N^{-1}V_ND_N^{-1}\Lambda _n^{T})\). Thus the optimization problem can be described as follows:

$$\begin{aligned} \begin{aligned} \min G (\varvec{\pi })&:= \min \Bigg \{ tr\Bigg [\frac{n}{N^2}\sum \limits _{i=1}^{N}\frac{1}{\pi _i}\phi _{\tau }^2(\varepsilon _i)\Lambda _nD_N^{-1}\\&\quad \times ~\varvec{x}_i\varvec{x}_i^TD_N^{-1}\Lambda _n^{T}\Bigg ] \Bigg \},\\&s.t. \sum \limits _{i=1}^{N}\pi _i=n,0\le \pi _i \le 1,i=1,\ldots ,N. \end{aligned} \end{aligned}$$
(A.20)
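Before treating the caps \(0\le \pi _i\le 1\), the key fact derived next is that the objective, written with \(h_i=|\phi _{\tau }(\varepsilon _i)|\Vert \Lambda _nD_N^{-1}\varvec{x}_i\Vert \), is minimized by the proportional allocation \(\pi _i\propto h_i\). A minimal numerical sketch of this (hypothetical randomly drawn scores \(h_i\); the caps happen to be inactive for this draw):

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 200, 10.0
h = rng.uniform(0.1, 1.0, size=N)  # hypothetical scores h_i

def G(pi):
    # objective of (A.20), rewritten as (n / N^2) * sum_i h_i^2 / pi_i
    return (n / N**2) * np.sum(h**2 / pi)

lower = (h.sum() / N) ** 2         # Cauchy-Schwarz bound: (1/N^2)(sum_i h_i)^2

pi_prop = n * h / h.sum()          # proportional allocation: attains the bound
pi_unif = np.full(N, n / N)        # uniform allocation, for contrast
```

Here `G(pi_prop)` equals the lower bound exactly, while `G(pi_unif)` exceeds it whenever the \(h_i\) are not all equal.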

By Cauchy–Schwarz inequality,

$$\begin{aligned} G(\varvec{\pi })&= tr \left[ \frac{n}{N^2}\sum \limits _{i=1}^{N}\frac{1}{\pi _i}\phi _{\tau }^2(\varepsilon _i)\Lambda _nD_N^{-1}\varvec{x}_i\varvec{x}_i^TD_N^{-1}\Lambda _n^{T}\right] \nonumber \\&=\frac{n}{N^2}\sum \limits _{i=1}^{N}\frac{1}{\pi _i}\phi _{\tau }^2(\varepsilon _i)\Vert \Lambda _nD_N^{-1}\varvec{x}_i\Vert ^2 \nonumber \\&=\frac{n}{N^2}\sum \limits _{i=1}^{N}\frac{1}{\pi _i}h_i^2 \nonumber \\&=\frac{n}{N^2}\frac{1}{n}\left( \sum \limits _{i=1}^{N}\pi _i\right) \left( \sum \limits _{i=1}^{N}\frac{1}{\pi _i}h_i^2\right) \nonumber \\&\ge \frac{1}{N^2}\left( \sum \limits _{i=1}^{N}h_i\right) ^2, \end{aligned}$$
(A.21)

where equality in the last line holds if and only if \(\pi _i\propto h_i\); that is, when \(\pi _i\propto h_i\), \(tr(\Lambda _nD_N^{-1}V_ND_N^{-1}\Lambda _n^{T})\) attains its minimum. Note that all the \(\pi _i\) need to satisfy \(0\le \pi _i \le 1\). We consider two scenarios:

(1) If \(nh_i/(\sum _{j=1}^{N}h_j)\le 1\) for \(i=1,\ldots ,N\), then \(\pi _i^{opt}= \frac{nh_i }{\sum _{j=1}^{N}h_j}\).

(2) If there exist indices i with \(nh_i/(\sum _{j=1}^{N}h_j)> 1\), suppose there are k such indices; they are exactly the k largest. To be specific, if \(nh_i>\sum _{j=1}^{N}h_j\), then by the definition of k, \(nh_i>\sum _{j=1}^{N}h_j=\sum _{j=1}^{N-k}h_j+ \sum _{j=N-k+1}^{N}h_j > (n-k)h_{N-k}+kh_{N-k} = nh_{N-k}\), so \(h_i>h_{N-k}\), which yields \(i>N-k\). In this case, the original optimization problem (A.20) is equivalent to

$$\begin{aligned} \begin{aligned}&\min \left\{ \frac{n}{N^2}\sum \limits _{i=1}^{N-k}\frac{1}{\pi _i} \phi _{\tau }^2(\varepsilon _i)\Vert \Lambda _nD_N^{-1}\varvec{x}_i\Vert ^2\right\} ,\\&s.t. \sum \limits _{i=1}^{N-k}\pi _i=n-k,0\le \pi _i \le 1,i=1,\ldots ,N-k,\\&~~~~~~~~ \pi _{N-k+1}=\cdots =\pi _N=1. \end{aligned} \end{aligned}$$
(A.22)

Similarly, applying the Cauchy–Schwarz inequality,

$$\begin{aligned}&\frac{n}{N^2}\sum \limits _{i=1}^{N-k}\frac{1}{\pi _i}\phi _{\tau }^2(\varepsilon _i) \Vert \Lambda _nD_N^{-1}\varvec{x}_i\Vert ^2 \nonumber \\&=\frac{n}{N^2}\frac{1}{n-k}\left( \sum \limits _{i=1}^{N-k}\pi _i\right) \left( \sum \limits _{i=1}^{N-k}\frac{1}{\pi _i}(h_i)^2\right) \nonumber \\&\ge \frac{n}{N^2(n-k)}\left( \sum \limits _{i=1}^{N-k}h_i\right) ^2, \end{aligned}$$
(A.23)

where equality in the last line holds if and only if \(\pi _i\propto h_i\) for \(i=1,\ldots ,N-k\), namely, when

$$\begin{aligned} \pi _i= \left\{ \begin{aligned}&(n-k)h_i/(\sum _{j=1}^{N-k}h_j),i=1,\ldots ,N-k \\&1, i=N-k+1,\ldots ,N \end{aligned} \right. , \end{aligned}$$

\(tr(\Lambda _nD_N^{-1}V_ND_N^{-1}\Lambda _n^{T})\) attains its minimum. Next, we unify the two expressions for \(\pi _i\). Suppose there exists an H such that

$$\begin{aligned} \max \limits _{i=1,\ldots ,N}n \frac{h_i\wedge H}{\sum _{j=1}^{N}(h_j\wedge H)}=1, \end{aligned}$$

and \(h_{N-k}<H<h_{N-k+1}\), then it follows that

$$\begin{aligned} \sum _{j=1}^{N-k}h_j=(n-k)H. \end{aligned}$$
(A.24)

By (A.21) and (A.24), it follows that

$$\begin{aligned}&G_{\min }(\varvec{\pi })\\&\quad =\frac{n}{N^2}\sum \limits _{i=1}^{N}\frac{1}{\pi _i}(h_i)^2\\&=\frac{n}{N^2}\sum \limits _{i=1}^{N-k}\frac{1}{\pi _i}(h_i)^2 +\frac{n}{N^2}\sum \limits _{i=N-k+1}^{N}\frac{1}{\pi _i}(h_i)^2 \\&\quad =\frac{n}{N^2(n-k)}\left( \sum \limits _{i=1}^{N-k}h_i\right) ^2+\frac{n}{N^2}\sum \limits _{i=N-k+1}^{N}(h_i)^2\\&\quad =\frac{H^2n(n-k)}{N^2}+\frac{n}{N^2}\sum \limits _{i=N-k+1}^{N}(h_i)^2. \end{aligned}$$

Let \(\pi _i^{opt}=n \frac{h_i\wedge H}{\sum _{j=1}^{N}(h_j\wedge H)}\) and \(\varvec{\pi }^{opt}=(\pi _1^{opt},\ldots ,\pi _N^{opt})\). Substituting \(\varvec{\pi }^{opt}\) into (A.20), we can get

$$\begin{aligned}&G(\varvec{\pi }^{opt})\\&\quad =tr\left[ \frac{n}{N^2}\sum \limits _{i=1}^{N}\frac{1}{\pi _i^{opt}}\phi _{\tau }^2(\varepsilon _i)\Lambda _nD_N^{-1}\varvec{x}_i\varvec{x}_i^TD_N^{-1}\Lambda _n^T\right] \nonumber \\&\quad =\frac{n}{N^2}\sum \limits _{i=1}^{N}\frac{1}{\pi _i^{opt}}h_i^2 \nonumber \\&\quad =\frac{n}{N^2}\sum \limits _{i=1}^{N-k}\frac{1}{\pi _i^{opt}}h_i^2 +\frac{n}{N^2}\sum \limits _{i=N-k+1}^{N}\frac{1}{\pi _i^{opt}}h_i^2 \\&\quad =\frac{1}{N^2}\sum \limits _{i=1}^{N-k} \frac{\sum _{j=1}^{N}(h_j\wedge H)}{h_i\wedge H}h_i^2 +\frac{n}{N^2}\sum \limits _{i=N-k+1}^{N}h_i^2\\&\quad =\frac{1}{N^2}\sum \limits _{i=1}^{N-k} \frac{\sum _{j=1}^{N-k}h_j+kH}{h_i }h_i^2 +\frac{n}{N^2}\sum \limits _{i=N-k+1}^{N}h_i^2\\&\quad =\frac{H^2n(n-k)}{N^2}+\frac{n}{N^2}\sum \limits _{i=N-k+1}^{N}h_i^2\\&\quad =G_{\min }(\varvec{\pi }), \end{aligned}$$

which implies that \(\varvec{\pi }^{opt}\) is the optimal solution of (A.20).

Finally, we verify that such an H exists and satisfies \(h_{N-k}<H<h_{N-k+1}\). The definition of k implies that

$$\begin{aligned} \frac{(n-k+1)h_{N-k+1}}{\sum _{i=1}^{N-k+1}h_{i}}\ge 1 ~~~~ \text{ and } ~~~~ \frac{(n-k)h_{N-k}}{\sum _{i=1}^{N-k}h_{i}}< 1. \end{aligned}$$

Let \(H_1=h_{N-k+1}, H_2=h_{N-k}\), then

$$\begin{aligned} \frac{(n-k+1)h_{N-k+1}+(k-1)H_1}{\sum _{i=1}^{N-k+1}h_{i}+(k-1)H_1}\ge 1 \end{aligned}$$

and

$$\begin{aligned} \frac{(n-k)h_{N-k}+kH_2}{\sum _{i=1}^{N-k}h_{i}+kH_2}< 1. \end{aligned}$$

As a result,

$$\begin{aligned} n\frac{h_{N}\wedge H_1}{\sum _{j=1}^{N}(h_{j}\wedge H_1)}\ge 1 ~~~~ \text{ and } ~~~~ n\frac{h_{N}\wedge H_2}{\sum _{j=1}^{N}(h_{j}\wedge H_2)}< 1. \end{aligned}$$

Note that \(\max \limits _{i=1,\ldots ,N}n \frac{h_i\wedge H}{\sum _{j=1}^{N}(h_j\wedge H)}\) is continuous with respect to H, then there must exist \(h_{N-k}<H<h_{N-k+1}\) such that \(\max \limits _{i=1,\ldots ,N}n \frac{h_i\wedge H}{\sum _{j=1}^{N}(h_j\wedge H)}=1\).

Scenario (1) is the special case of scenario (2) with \(k=0\). The proof of Theorem 5 is completed. \(\square \)
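The two scenarios combine into the capped-proportional rule \(\pi _i^{opt}=n(h_i\wedge H)/\sum _{j=1}^{N}(h_j\wedge H)\), which can be computed without solving for H explicitly by iteratively capping and redistributing. A minimal NumPy sketch (function name hypothetical; assumes \(h_i>0\) and \(n<N\)):

```python
import numpy as np

def optimal_poisson_probs(h, n):
    """Capped-proportional probabilities pi_i = n (h_i ^ H) / sum_j (h_j ^ H),
    computed by iteratively capping entries at 1 and redistributing the
    remaining budget (so H never has to be found explicitly)."""
    h = np.asarray(h, dtype=float)
    pi = np.zeros_like(h)
    free = np.ones(h.size, dtype=bool)   # indices not yet capped at 1
    budget = float(n)
    while True:
        prop = budget * h[free] / h[free].sum()
        if prop.max() <= 1.0:            # scenario (1): no caps bind
            pi[free] = prop
            return pi
        over = np.flatnonzero(free)[prop > 1.0]
        pi[over] = 1.0                   # scenario (2): these hit the cap
        free[over] = False
        budget -= over.size

# one dominant score: it is capped at 1 and the rest stay proportional
pi = optimal_poisson_probs([1.0, 2.0, 3.0, 100.0], n=2)
```

For these values the threshold solves \(2H/(6+H)=1\), i.e. \(H=6\), so the closed form gives \(\varvec{\pi }^{opt}=(1/6,1/3,1/2,1)\); the iteration reproduces this, and the probabilities sum to \(n=2\).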

Cite this article

Li, X., Xia, X. & Zhang, Z. Poisson subsampling-based estimation for growing-dimensional expectile regression in massive data. Stat Comput 34, 133 (2024). https://doi.org/10.1007/s11222-024-10449-x
