Abstract
The paper proposes a novel difference-based method for testing the hypothesis of no relationship between the dependent and independent variables. We construct three test statistics for nonparametric regression with Gaussian and non-Gaussian random errors. These test statistics have the standard normal as the asymptotic null distribution. Furthermore, we show that these tests can detect local alternatives that converge to the null hypothesis at a rate close to \(n^{-1/2}\), a rate previously achieved only by residual-based tests. We also propose a permutation test as a flexible alternative. Our difference-based method does not require estimating the mean function or its first derivative, making it easy to implement and computationally efficient. Simulation results demonstrate that our new tests are more powerful than existing methods, especially when the sample size is small. The usefulness of the proposed tests is also illustrated using two real data examples.
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-024-01479-0/MediaObjects/180_2024_1479_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00180-024-01479-0/MediaObjects/180_2024_1479_Fig2_HTML.png)
References
Barry D, Hartigan JA (1990) An omnibus test for departures from constant mean. Ann Stat 18:1340–1357
Bliznyuk N, Carroll R, Genton M et al (2012) Variogram estimation in the presence of trend. Stat Interface 5(2):159–168
De Brabanter K, De Brabanter J, De Moor B et al (2013) Derivative estimation with local polynomial fitting. J Mach Learn Res 14:281–301
Chen JC (1994) Testing for no effect in nonparametric regression via spline smoothing techniques. Ann Inst Stat Math 46:251–265
Cox D, Koh E (1989) A smoothing spline based test of model adequacy in polynomial regression. Ann Inst Stat Math 41:383–400
Cox D, Koh E, Wahba G et al (1988) Testing the (parametric) null model hypothesis in (semiparametric) partial and generalized spline models. Ann Stat 16:113–119
Cui Y, Levine M, Zhou Z (2021) Estimation and inference of time-varying auto-covariance under complex trend: a difference-based approach. Electron J Stat 15(2):4264–4294
Dai W, Tong T, Genton M (2016) Optimal estimation of derivatives in nonparametric regression. J Mach Learn Res 17:1–25
Dai W, Tong T, Zhu L (2017) On the choice of difference sequence in a unified framework for variance estimation in nonparametric regression. Stat Sci 32:455–468
Einmahl JH, Van Keilegom I (2008) Tests for independence in nonparametric regression. Stat Sin 18:601–615
Eubank RL (2000) Testing for no effect by cosine series methods. Scand J Stat 27:747–763
Eubank RL, Li CS, Wang S (2005) Testing lack-of-fit of parametric regression models using nonparametric regression techniques. Stat Sin 15:135–152
Evans D, Jones AJ (2008) Nonparametric estimation of residual moments and covariance. Proc Royal Soc A 464:2831–2846
Gasser T, Sroka L, Jennen-Steinmetz C (1986) Residual variance and residual pattern in nonlinear regression. Biometrika 73:625–633
González-Manteiga W, Crujeiras RM (2013) An updated review of goodness-of-fit tests for regression models. TEST 22:361–411
Hall P, Kay JW, Titterington DM (1990) Asymptotically optimal difference-based estimation of variance in nonparametric regression. Biometrika 77:521–528
Lauer SA, Grantz KH, Bi Q et al (2020) The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application. Ann Intern Med 172:577–582
Li CS (2012) Testing for no effect via splines. Comput Stat 27:343–357
Liu A, Wang Y (2004) Hypothesis testing in smoothing spline models. J Stat Comput Simul 74:581–597
Liu X, He Y, Ma X et al (2020) Statistical data analysis on the incubation and suspected period of COVID-19 based on 2172 confirmed cases outside Hubei province. Acta Math Appl Sin 43:278–294
McManus DA (1991) Who invented local power analysis? Econom Theory 7:265–268
Neumeyer N, Dette H (2003) Nonparametric comparison of regression curves: an empirical process approach. Ann Stat 31:880–920
Raz J (1990) Testing for no effect when estimating a smooth function by nonparametric regression: a randomization approach. J Am Stat Assoc 85:132–138
Rice J (1984) Bandwidth choice for nonparametric regression. Ann Stat 12:1215–1230
Storey JD, Xiao W, Leek JT et al (2005) Significance analysis of time course microarray experiments. Proc Natl Acad Sci 102(36):12837–12842
Tan WYT, Wong LY, Leo YS et al (2020) Does incubation period of COVID-19 vary with age? A study of epidemiologically linked cases in Singapore. Epidemiol Infection 148:e197
Tong T, Wang Y (2005) Estimating residual variance in nonparametric regression using least squares. Biometrika 92:821–830
Tong T, Ma Y, Wang Y (2013) Optimal variance estimation without estimating the mean function. Bernoulli 19:1839–1854
Van Keilegom I, González Manteiga W, Sánchez Sellero C (2008) Goodness-of-fit tests in parametric regression based on the estimation of the error distribution. TEST 17:401–415
Wang W, Lin L (2015) Derivative estimation based on difference sequence via locally weighted least squares regression. J Mach Learn Res 16:2617–2641
Wang W, Yu P, Lin L et al (2019) Robust estimation of derivatives using locally weighted least absolute deviation regression. J Mach Learn Res 20:1–49
Wang Y (2011) Smoothing splines: methods and applications. Chapman and Hall/CRC, New York, pp 12–45
Whittle P (1962) On the convergence to normality of quadratic forms in independent variables. Theory Probab Appl 9:103–108
Yatchew A (1999) An elementary nonparametric differencing test of equality of regression functions. Econom Lett 62:271–278
Yatchew A (2003) Semiparametric regression for the applied econometrician. Cambridge University Press, Cambridge, pp 10–25
Zhang M, Dai W (2023) On difference-based gradient estimation in nonparametric regression. Stat Anal Data Min. https://doi.org/10.1002/sam.11644
Zhang X, Zhong H, Li Y et al (2021) Sex- and age-related trajectories of the adult human gut microbiota shared across populations of different ethnicities. Nature Aging 1:87–100
Acknowledgements
Zhijian Li was supported in part by the Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College, project code 2022B1212010006 and UIC research grant R0400001-22, and UIC Start-up Research Fund/Grant UICR0700024-22. Tiejun Tong was supported in part by the General Research Fund of Hong Kong (HKBU12300123, HKBU12303421) and the National Natural Science Foundation of China (12071305). The authors thank the editor, the associate editor and two reviewers for their constructive comments and suggestions that have led to a substantial improvement in the paper and thank the authors of Zhang et al (2021) and Liu et al (2020) for providing the real data sets.
Appendices
Appendix 1: Some lemmas and their proofs
Lemma 1
Assume that \(m \rightarrow \infty\) and \(m = o(n)\). We have
- (a) \(\sum _{k=1}^{m}h_k= \frac{15}{16}n+o(n)\);
- (b) \(\sum _{k=1}^{m} k^2 h_k = n^2m +o(n^2m)\);
- (c) \(\sum _{k=1}^{i-1}h_k= \frac{15n^2}{4m^4}(i^3 - m^2i) + O\big (\frac{n^2}{m^2}\big ) + o\big (\frac{n^2i}{m^2}\big )\);
- (d) \(\sum _{k=i}^{m}k h_k= O(n^2)\);
- (e) \(\sum _{k=1}^{i-1} k^2 h_k = O\big (\frac{n^2i^3}{m^2}\big )\);
- (f) \(\sum _{k=1}^{m}h_k^2 = \frac{45n^4}{4m^3} +o\big (\frac{n^4}{m^3}\big )\);
- (g) \(\sum _{k=1}^{m}k h_k^2 = \frac{225n^4}{32m^2} +o\big (\frac{n^4}{m^2}\big )\).
Proof
Following the Appendix in Tong and Wang (2005), we have
(a)
$$\begin{aligned} \sum _{k=1}^{m} h_k&= \frac{\sum _{k=1}^{m} (d_k -\bar{d}_w)}{\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2} = \frac{\frac{m^4}{12n^3} + o(\frac{m^4}{n^3})}{\frac{4m^4}{45n^4} + o(\frac{m^4}{n^4})} = \frac{15n}{16} +o(n). \end{aligned}$$
(b)
$$\begin{aligned} \sum _{k=1}^{m} k^2 h_k&= \frac{\sum _{k=1}^{m} k^2(d_k -\bar{d}_w)}{\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2} = \frac{\frac{4m^5}{45n^2} + o\big (\frac{m^5}{n^2}\big )}{\frac{4m^4}{45n^4} + o(\frac{m^4}{n^4})} = n^2m +o(n^2m), \end{aligned}$$
where
$$\begin{aligned} \sum _{k=1}^{m} k^2(d_k -\bar{d}_w)&= \frac{1}{n^2} \Big (\frac{m^5}{5} + O(m^4)\Big ) - \Big (\frac{m^3}{3} +O(m^2)\Big )\Big [\frac{m^2}{3n^2} + o\Big (\frac{m^2}{n^2}\Big )\Big ] \\&= \frac{4m^5}{45n^2} + o\Big (\frac{m^5}{n^2}\Big ). \end{aligned}$$
(c)
For \(1 \le i \le m\), by (A2) and (A3) we have
$$\begin{aligned} \sum _{k=1}^{i-1}h_k&= \frac{\sum _{k=1}^{i-1}(d_k -\bar{d}_w)}{\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2} \\&= \frac{ \frac{1}{3n^2}(i^3 - m^2i) +O\big (\frac{m^2}{n^2}\big ) +o\big (\frac{m^2i}{n^2}\big )}{\frac{4m^4}{45n^4} + o(\frac{m^4}{n^4})} \\&= \frac{15n^2}{4m^4}(i^3 - m^2i) + O\Big (\frac{n^2}{m^2}\Big ) + o\Big (\frac{n^2i}{m^2}\Big ), \end{aligned}$$where
$$\begin{aligned} \sum _{k=1}^{i-1}(d_k -\bar{d}_w) = \sum _{k=1}^{i-1} (\frac{k}{n})^2 - (i-1)\bar{d}_w = \frac{1}{3n^2}(i^3 - m^2i) +O\Big (\frac{m^2}{n^2}\Big ) +o\Big (\frac{m^2i}{n^2}\Big ). \end{aligned}$$
(d)
For \(1\le i \le m\), by (A2) we have
$$\begin{aligned} \sum _{k=i}^{m}k h_k&= \frac{\sum _{k=i}^{m} k(d_k -\bar{d}_w)}{\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2} =\frac{\sum _{k=i}^{m} \frac{k^3}{n^2} - \bar{d}_w \sum _{k=i}^{m}k}{\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2} = O(n^2). \end{aligned}$$
(e)
For \(1\le i \le m\), by (A2) we have
$$\begin{aligned} \sum _{k=1}^{i-1} k^2 h_k&= \frac{\sum _{k=1}^{i-1} k^2(d_k -\bar{d}_w)}{\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2} \\&= \frac{\frac{1}{n^2}(\frac{i^5}{5}+O(i^4)) - [\frac{m^2}{3n^2} + o(\frac{m^2}{n^2})][\frac{i^3}{3}+O(i^2)] }{\frac{4m^4}{45n^4} + o(\frac{m^4}{n^4})}\\&= O\Big (\frac{n^2i^3}{m^2}\Big ). \end{aligned}$$
(f)
By (A2), we have
$$\begin{aligned} \sum _{k=1}^{m} h_k^2&=\frac{\sum _{k=1}^{m} (d_k -\bar{d}_w)^2}{(\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2)^2}\\&= \frac{\sum _{k=1}^{m}d_k^2 -2\bar{d}_w\sum _{k=1}^{m}d_k+m(\bar{d}_w) ^2}{(\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2)^2}\\&= \frac{\frac{m^5}{n^4}(\frac{1}{5} -\frac{2}{9}+\frac{1}{9} ) +o(\frac{m^5}{n^4})}{[\frac{4m^4}{45n^4} + o(\frac{m^4}{n^4})]^2}\\&= \frac{45n^4}{4m^3} + o\Big (\frac{n^4}{m^3}\Big ). \end{aligned}$$
(g)
By (A2), we have
$$\begin{aligned} \sum _{k=1}^{m} k h_k^2&=\frac{\sum _{k=1}^{m} k(d_k -\bar{d}_w)^2}{(\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2)^2}\\&= \frac{\sum _{k=1}^{m}kd_k^2 -2\bar{d}_w\sum _{k=1}^{m}kd_k+(\bar{d}_w) ^2\sum _{k=1}^{m}k}{(\sum _{k=1}^{m} w_k (d_k - \bar{d}_w)^2)^2}\\&= \frac{\frac{m^6}{n^4}(\frac{1}{6} -\frac{1}{6}+\frac{1}{18} ) +o(\frac{m^6}{n^4})}{[\frac{4m^4}{45n^4} + o(\frac{m^4}{n^4})]^2}\\&= \frac{225n^4}{32m^2} + o\Big (\frac{n^4}{m^2}\Big ). \end{aligned}$$
\(\square\)
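The leading constants in Lemma 1 lend themselves to a quick numerical sanity check. The snippet below is a sketch under reconstructed definitions: following the calculations above and Tong and Wang (2005), it takes \(d_k=(k/n)^2\), \(w_k=(n-k)/N\) with \(N=\sum_{k=1}^{m}(n-k)\), and \(h_k=(d_k-\bar{d}_w)/\sum_{k=1}^{m} w_k(d_k-\bar{d}_w)^2\); these are inferred from the displayed ratios, not quoted from the paper's equation (7).

```python
import numpy as np

# Numerical sanity check of the leading constants in Lemma 1. The weight
# definitions below are reconstructed from the displayed calculations and
# Tong and Wang (2005); they are assumptions, not the paper's Eq. (7) verbatim.
def h_weights(n, m):
    k = np.arange(1, m + 1)
    d = (k / n) ** 2                      # d_k = (k/n)^2
    N = n * m - m * (m + 1) // 2          # N = sum_{k=1}^m (n - k)
    w = (n - k) / N                       # weights w_k (summing to 1)
    dbar = np.sum(w * d)                  # weighted mean, ~ m^2/(3n^2)
    D = np.sum(w * (d - dbar) ** 2)       # ~ 4m^4/(45n^4)
    return (d - dbar) / D                 # h_k

n = 10_000_000
m = int(np.ceil(n ** 0.7))                # m = ceil(n^r) with r = 0.7
k = np.arange(1, m + 1)
h = h_weights(n, m)

print(np.sum(h) / n)                      # Lemma 1(a): approaches 15/16
print(np.sum(k**2 * h) / (n**2 * m))      # Lemma 1(b): approaches 1
print(np.sum(h**2) * m**3 / n**4)         # Lemma 1(f): approaches 45/4
print(np.sum(k * h**2) * m**2 / n**4)     # Lemma 1(g): approaches 225/32
```

With \(n=10^7\) and \(r=0.7\) the four printed ratios settle within a few percent of \(15/16\), \(1\), \(45/4\) and \(225/32\), consistent with relative errors of order \(m/n+1/m\).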
Lemma 2
Assume that \(m \rightarrow \infty\) and \(m=o(n)\), and let \(\textbf{g} = (g(x_1), \dots , g(x_n))^T\). We have
- (a) \(\textbf{g}^T \varvec{B} \textbf{g} = 2\beta mn+O(m^2)\);
- (b) \(\textbf{g}^T \varvec{B}^2\textbf{g} = O(n^2m)\).
Proof
(a)
Let \(\varvec{A}= (a_{ij})_{n\times n}\) be a symmetric matrix with \(a_{ij}\) of the same form as \(b_{ij}\) in (7), but with \(h_0 = 0\) and \(h_k = 1\) for \(k = 1, \dots , m\), and let \(\varvec{D} = (d_{ij})_{n\times n}\) be the matrix defined in Theorem 1 of Tong and Wang (2005). Then,
$$\begin{aligned} {\textbf {g}}^T \varvec{B} {\textbf {g}} = \frac{{\textbf {g}}^T (\varvec{A}-\varvec{D}) {\textbf {g}}}{\bar{d}_w}. \end{aligned}$$To simplify the notation, we let \(g_i=g(x_i)\). We can show that
$$\begin{aligned} {\textbf {g}}^T \varvec{A} {\textbf {g}}&= \sum _{k=1}^{m}\sum _{i=1}^{n-k} (g_{i+k} - g_i)^2\\&=\sum _{k=1}^{m}\sum _{i=1}^{n-k} \Big [\frac{k^2}{n^2}(g_i')^2 + O\Big (\frac{k^3}{n^3}\Big ) \Big ]\\&=\sum _{k=1}^{m} \frac{k^2}{n^2} \sum _{i=1}^{n-k}(g_i')^2 + \sum _{k=1}^{m} O\Big (\frac{(n-k)k^3}{n^3}\Big )\\&=\sum _{k=1}^{m} \frac{k^2}{n} \Big [\frac{1}{n}\sum _{i=1}^{n}(g_i')^2 - \frac{1}{n}\sum _{i=n-k+1}^{n}(g_i')^2\Big ] + O\Big (\frac{m^4}{n^2}\Big )\\&=\sum _{k=1}^{m} \frac{k^2}{n} \Big [2\beta +O\Big (\frac{k}{n}\Big )\Big ] +O\Big (\frac{m^4}{n^2}\Big )\\&= \frac{2\beta m^3}{3n} +O\Big (\frac{m^4}{n^2}\Big ), \end{aligned}$$where \(\beta = \int _{0}^{1} (g'(x))^2 \, dx/2\). Note also that \({\textbf {g}}^T \varvec{D} {\textbf {g}} = O(m^4/n^2)\) by Lemma 2 in Tong et al (2013). Then by (A3), we have
$$\begin{aligned} {\textbf {g}}^T \varvec{B} {\textbf {g}} = \frac{\frac{2\beta m^3}{3n}+ O(\frac{m^4}{n^2})}{\frac{m^2}{3n^2} + o(\frac{m^2}{n^2})} = 2\beta mn+O(m^2). \end{aligned}$$
(b)
Noting that \(\varvec{B}\) is a symmetric matrix, we let \({\textbf {g}}^T \varvec{B}^2 {\textbf {g}} = (\varvec{B}{} {\textbf {g}})^T (\varvec{B}{} {\textbf {g}}) = \varvec{q}^T \varvec{q}\), where \(\varvec{q} = \varvec{B}{} {\textbf {g}} = (q_1, \dots , q_n)^T\). For \(i \in [1,m]\), by parts (b), (d) and (e) of Lemma 1, we have
$$\begin{aligned} q_i&= \sum _{k=1}^{i-1}h_k (g_i - g_{i-k}) - \sum _{k=1}^{m}h_k (g_{i+k} - g_{i})\\&=\sum _{k=1}^{i-1}h_k \Big (\frac{k}{n}g_i' -\frac{k^2}{2n^2}g_i''+o\big (\frac{k^2}{n^2}\big )\Big ) - \sum _{k=1}^{m}h_k \Big (\frac{k}{n}g_i' +\frac{k^2}{2n^2}g_i''+o\big (\frac{k^2}{n^2}\big )\Big )\\&= -\frac{g_i'}{n}\sum _{k=i}^{m}k h_k - \Big [\frac{g_i''}{2n^2}\big (\sum _{k=1}^{i-1}k^2h_k+ \sum _{k=1}^{m}k^2h_k\big )\Big ]+o\Big (\frac{1}{n}\sum _{k=i}^{m}k^2h_k \Big )\\&= O(n)+O(m)+O\Big (\frac{i^3}{m^2}\Big ) +o(m)\\&= O(n). \end{aligned}$$Similarly, we can show that \(q_i=O(n)\) for \(i \in [n-m+1,n]\). For \(i \in [m+1,n-m]\), by Lemma 1(b) we have
$$\begin{aligned} q_i&= \sum _{k=1}^{m}h_k (g_i - g_{i-k}) - \sum _{k=1}^{m}h_k (g_{i+k} - g_{i})\\&= \sum _{k=1}^{m}h_k \Big (\frac{k}{n}g_i' -\frac{k^2}{2n^2}g_i''+o\Big (\frac{k^2}{n^2}\Big )\Big ) - \sum _{k=1}^{m}h_k \Big (\frac{k}{n}g_i' +\frac{k^2}{2n^2}g_i''+o\Big (\frac{k^2}{n^2}\Big )\Big )\\&= -\frac{1}{n^2}g_i''\sum _{k=1}^{m}k^2h_k+o\Big (\frac{\sum _{k=1}^{m}k^2h_k}{n^2}\Big )\\&= O(m). \end{aligned}$$Combining the above results, we obtain
$$\begin{aligned} {\textbf {g}}^T\varvec{B}^2{\textbf {g}} = \sum _{i=1}^{m} q_{i}^{2}+\sum _{i=m+1}^{n-m} q_{i}^{2}+\sum _{i=n-m+1}^{n} q_{i}^{2} = O(n^2m). \end{aligned}$$
\(\square\)
Lemma 3
Assume that \(m \rightarrow \infty\) and \(m = o(n)\). We have
- (a) \(\sum _{i=1}^{n}b_{ii}^2= \frac{15n^4}{14m}+ o(\frac{n^4}{m})\);
- (b) \(\sum _{i=1}^{n}\sum _{j=1,j\ne i}^{n} b_{ij}^2=\frac{45n^5}{2m^3} + o(\frac{n^5}{m^3})\).
Proof
(a)
By parts (a) and (c) of Lemma 1, we have
$$\begin{aligned} \sum _{i=1}^{n}b_{ii}^2&= 2\sum _{i=1}^{m}\big (\sum _{k=1}^{m}h_k +\sum _{k=1}^{i-1}h_k\big )^2 + \sum _{i=m+1}^{n-m}\big (2 \sum _{k=1}^{m} h_k\big )^2 \\&=2\sum _{i=1}^{m}\Big [\frac{15}{16}n+o(n)+ \frac{15n^2}{4m^4}(i^3 - m^2i) + O\Big (\frac{n^2}{m^2}\Big ) + o\Big (\frac{n^2i}{m^2}\Big )\Big ]^2 \\&\qquad + 4(n-2m)\Big [\frac{15}{16}n+o(n)\Big ]^2\\&= 2\Big (\frac{15}{16}\Big )^2n^2m + \Big (\frac{15n^2}{4m^4}\Big )^2\Big (\sum _{i=1}^{m} i^6 + m^4\sum _{i=1}^{m}i^2-2m^2\sum _{i=1}^{m}i^4\Big ) \\&\qquad + \frac{15^2n^3}{32m^4}\Big (\sum _{i=1}^{m}i^3 - m^2\sum _{i=1}^{m}i\Big ) + o\Big [\frac{n^4}{m^6}\Big (\sum _{i=1}^{m}i^4-m^2\sum _{i=1}^{m}i^2\Big ) \Big ]\\&\qquad +4\Big [\Big (\frac{15}{16}\Big )^2 n^3+o(n^3) \Big ]\\&= \frac{15n^4}{14m}+o\Big (\frac{n^4}{m}\Big ). \end{aligned}$$
(b)
By parts (f) and (g) of Lemma 1, we have
$$\begin{aligned} \sum _{i=1}^{n}\sum _{j=1,j\ne i}^{n} b_{ij}^2&= 2 \sum _{k=1}^{m}(n-k)h_k^2\\&=2n\Big [ \frac{45n^4}{4m^3} +o\Big (\frac{n^4}{m^3}\Big ) \Big ] -2\Big [\frac{225n^4}{32m^2} +o\Big (\frac{n^4}{m^2}\Big )\Big ]\\&= \frac{45n^5}{2m^3}+o\Big (\frac{n^5}{m^3}\Big ). \end{aligned}$$
\(\square\)
Appendix 2: Proof of Theorem 1
Proof
Let \({\textbf {g}} = (g(x_1), \dots , g(x_n))^T\) and \(\varvec{\epsilon } = (\epsilon _1, \dots , \epsilon _n)^T\). By (1) and (6), we have
From Lemma 2(a) we have
Using Lemma 2(b), we have \({{E}}({\textbf {g}}^T \varvec{B} \varvec{\epsilon } /N)^2 = \sigma ^2 {\textbf {g}}^T \varvec{B}^2{\textbf {g}}/N^2 = O(1/m)\). This leads to
Let \(\varvec{\epsilon }^T \varvec{B} \varvec{\epsilon }/(2N) = \varvec{\epsilon }^T \varvec{C} \varvec{\epsilon } - \varvec{\epsilon }^T \varvec{U} \varvec{\epsilon }\), where the elements of matrix \(\varvec{C}\) are
and \(\varvec{U}=\textrm{diag}(u_1, \cdots , u_n)\) with \(u_i = \sum _{k=\min \{i,n+1-i,m+1\}}^{m+1} h_k/(2N)\), for \(i = 1, \dots , n\) and \(h_{m+1} = 0\). Let \(c_0 = \sum _{k=1}^{m}h_k/N\), \(c_{i-j} = c_{j-i} = -h_{|i-j|}/(2N)\) for \(1 \le |i-j| \le m\), and \(c_{i-j} = c_{j-i} = 0\) for \(|i-j| >m\). Then \(\varvec{\epsilon }^T \varvec{C} \varvec{\epsilon } = \sum _{i=1}^{n}\sum _{j=1}^{n} c_{i-j} \epsilon _i \epsilon _j\), where \(\epsilon _i\) are i.i.d. with mean zero. Thus by parts (a) and (f) of Lemma 1,
as \(m = \lceil n^r\rceil\) with \(2/5 \le r<1\). Assuming \(E(\epsilon ^6) < \infty\), by Theorem 2 in Whittle (1962), \(\varvec{\epsilon }^T \varvec{C} \varvec{\epsilon }\) is asymptotically normally distributed.
We have \(\varvec{\epsilon }^T \varvec{U} \varvec{\epsilon } = \sum _{i=1}^{n} u_i \epsilon _i^2\). Let \(X_i = u_i \epsilon _i^2\); then \(X_1, X_2, \dots , X_n\) are independent random variables, where \(X_i = \sum _{k=i}^{m} h_k \epsilon _i^2/(2N)\) for \(1 \le i \le m\), \(X_i = \sum _{k=n-i+1}^{m} h_k \epsilon _i^2/(2N)\) for \(n-m+1 \le i \le n\), and \(X_i = 0\) for \(m+1 \le i \le n-m\). For \(1 \le i \le m\), using parts (a) and (c) of Lemma 1 we have
as \(m = \lceil n^r\rceil\) with \(1/2< r<1\). The results for \(n-m+1 \le i \le n\) are similar. It is straightforward to show that for \(1 \le i \le m\), the variance of \(X_i\) is
as \(n \rightarrow \infty\) and \(m = \lceil n^r\rceil\) with \(1/2< r<1\). We have similar results for \(n-m+1 \le i \le n\), and \(\text {Var}(X_i) = 0\) for \(m+1 \le i \le n-m\). Noting also that \(\sum _{i=1}^{m} \text {Var}(X_i) = \sum _{i=n-m+1}^{n} \text {Var}(X_i)\), we can derive the sum of variances as
Thus \(s_n^2\) is finite as \(m = \lceil n^r\rceil\) with \(2/3 \le r<1\). Moreover, we have
and
where \(\tau _0\) and \(\tau _1\) are some constants and \(m \rightarrow \infty\) with \(n \rightarrow \infty\). Thus
By the Lyapunov CLT, \(\varvec{\epsilon }^T \varvec{U} \varvec{\epsilon }\) is asymptotically normally distributed. Therefore, \(\varvec{\epsilon }^T \varvec{B} \varvec{\epsilon }/(2N)\) is asymptotically normally distributed. The mean of \(\varvec{\epsilon }^T \varvec{B} \varvec{\epsilon }/(2N)\) can be shown to be
and the variance is
Using parts (a) and (b) of Lemma 3 and combining the above results, we have
where \(m = \lceil n^r\rceil\) with \(2/3< r<1\). This then leads to
as \(n \rightarrow \infty\), where \(\sigma _{b}=\sqrt{15(\gamma _4-1)\sigma ^4/56}\). \(\square\)
Appendix 3: Proofs of Theorem 2 and Theorem 3
Proof of Theorem 2
The estimated error variance of \(\hat{\beta }\) given in (9) can be written as \(\tilde{\sigma }_{\beta }^2 = \tau _n \hat{\sigma }^4\). As \(n \rightarrow \infty\), \(\tau _n \rightarrow (15/28)n^{2-3r}\) with \(m = \lceil n^r\rceil\) in (9). Let \(\hat{\sigma }^2\) be a consistent estimator of \(\sigma ^2\), and let \(\sigma _\beta ^2=(15/28)n^{2-3r}\sigma ^4\). By Theorem 1, under the null hypothesis \(H_0\) in (4) we have \(\hat{\beta }/\sigma _{\beta } \xrightarrow []{D} N(0,1)\) when the random errors are normally distributed. In addition, we have \(\sigma _{\beta }/\tilde{\sigma }_{\beta } \rightarrow 1\) as \(n\rightarrow \infty\). Thus by Slutsky’s theorem,
\(\square\)
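Since equations (6) and (9) are not reproduced in this appendix, the following sketch reassembles the Theorem 2 statistic from the ingredients visible in the proofs: \(\hat{\beta}\) is taken to be the weighted least-squares slope of the lag-\(k\) averaged squared half-differences \(s_k\) on \(d_k=(k/n)^2\) (the Tong–Wang regression), with intercept \(\hat{\sigma}^2\) and normalization \(\tilde{\sigma}_{\beta}^2=(15/28)n^{2-3r}\hat{\sigma}^4\) as in the proof above. This is a hedged reconstruction for illustration, not the paper's exact implementation.

```python
import numpy as np

# Sketch of the difference-based no-effect test under Gaussian errors.
# Construction reconstructed from the proofs of Theorems 1-2 and Tong-Wang
# (2005); the exact forms of Eqs. (6) and (9) are assumptions here.
def no_effect_stat(y, r=0.7):
    n = len(y)
    m = int(np.ceil(n ** r))            # bandwidth m = ceil(n^r), 2/3 < r < 1
    k = np.arange(1, m + 1)
    # s_k = sum_i (y_{i+k} - y_i)^2 / (2(n-k)): noise variance plus signal
    s = np.array([np.sum((y[j:] - y[:-j]) ** 2) / (2.0 * (n - j)) for j in k])
    d = (k / n) ** 2
    N = n * m - m * (m + 1) // 2
    w = (n - k) / N                     # regression weights (summing to 1)
    dbar = np.sum(w * d)
    D = np.sum(w * (d - dbar) ** 2)
    beta_hat = np.sum(w * (d - dbar) * s) / D      # WLS slope, estimates beta
    sigma2_hat = np.sum(w * s) - beta_hat * dbar   # WLS intercept: sigma^2
    sigma_beta = np.sqrt((15 / 28) * n ** (2 - 3 * r)) * sigma2_hat
    return beta_hat / sigma_beta        # approx N(0,1) under H0

rng = np.random.default_rng(1)
n = 2000
x = np.arange(1, n + 1) / n
print(no_effect_stat(0.5 * rng.standard_normal(n)))   # roughly standard normal under H0
print(no_effect_stat(np.sin(2 * np.pi * x)
                     + 0.5 * rng.standard_normal(n))) # far in the right tail
```

Because \(\beta = \int_0^1 (g'(x))^2\,dx/2 \ge 0\), a one-sided test rejecting \(H_0\) for large positive values of the statistic is the natural choice in this sketch.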
Proof of Theorem 3
Given that \(\hat{\kappa }\) and \(\hat{\sigma }^2\) are consistent estimators of \(\kappa\) and \(\sigma ^2\), respectively, \(\check{\sigma }_{\beta g}^2\) in (12) is also a consistent estimator of \(\sigma _{\beta }^2= (15/56)n^{2-3r}(\kappa -(\sigma ^2)^2)\). Therefore, by Theorem 1 and Slutsky’s theorem, under the null hypothesis \(H_0\) in (4) we have
\(\square\)
About this article
Cite this article
Li, Z., Tong, T. & Wang, Y. A difference-based method for testing no effect in nonparametric regression. Comput Stat (2024). https://doi.org/10.1007/s00180-024-01479-0