Abstract
In some data sets, a portion of the extreme observations may be missing, either because they are simply unavailable or because they are imprecisely measured. For example, in the study of human lifetimes, a topic of recent interest, birth certificates of centenarians may not exist, and many such individuals may be absent from the data sets currently available. In essence, one does not have a clear record of the largest lifetimes of human populations. If extreme observations are missing, then risk can be severely underestimated, so that rare events occur more often than originally thought. In concrete terms, this may mean a 500-year flood is in fact a 100-year (or even a 20-year) flood. In this paper, we present methods for jointly estimating the number of missing extremes and the tail index associated with the tail heaviness of the data; ignoring either one can severely impact the estimation of risk. Our estimates are based on the HEWE (Hill estimate without extremes) of the tail index, which adjusts for missing extremes. Based on a functional convergence of this process to a limit process, we consider an asymptotic likelihood-based procedure for estimating both the number of missing extremes and the tail index, and we derive the asymptotic distribution of the resulting estimates. By artificially removing segments of extremes in the data, this methodology can also be used to assess the reliability of the underlying assumptions imposed on the data.
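The effect that motivates the paper is easy to reproduce with the classical Hill estimator of Hill (1975). The sketch below is illustrative only (the function name and the trimming scheme are ours, and the HEWE itself, which corrects for the missing extremes, is not implemented here): removing the largest observations from a heavy-tailed sample inflates the estimated tail index, i.e. makes the data look lighter-tailed than they are, so tail risk is underestimated.

```python
import numpy as np

def hill_tail_index(x, k):
    """Classical Hill estimate of the tail index alpha, based on the k
    largest order statistics of a positive, heavy-tailed sample x."""
    xs = np.sort(np.asarray(x))
    gamma = np.mean(np.log(xs[-k:]) - np.log(xs[-k - 1]))  # estimates 1/alpha
    return 1.0 / gamma

rng = np.random.default_rng(0)
alpha = 2.0
full = rng.pareto(alpha, size=100_000) + 1.0   # exact Pareto(2) sample

# Full sample: the Hill estimate recovers alpha.
print(hill_tail_index(full, k=1000))           # ~ 2.0

# Drop the 100 largest observations: the sample looks lighter-tailed
# and the tail index is overestimated, so tail risk is underestimated.
trimmed = np.sort(full)[:-100]
print(hill_tail_index(trimmed, k=1000))        # noticeably above 2.0
```

Note that `numpy`'s `pareto` draws from the Lomax distribution, so the shift by 1 is needed to obtain a classical Pareto tail \(\bar{F}(x)=x^{-\alpha }\).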
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig7_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig8_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig9_HTML.png)
Data availability
The Danish fire data and the Natural and Climate Disasters in the U.S. data are publicly available as indicated in the text. The Google+ data are not publicly available.
Notes
The general form of the GPD, which combines the light- and heavy-tailed cases, is similar; see de Haan and Ferreira (2006).
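For reference (a standard statement from de Haan and Ferreira (2006), not quoted from this paper), the unified form of the generalized Pareto family is

$$H_{\gamma }(x) = 1-(1+\gamma x)^{-1/\gamma }, \qquad 1+\gamma x>0,$$

with \(\gamma >0\) corresponding to the heavy-tailed case, \(\gamma <0\) to distributions with a finite right endpoint, and \(\gamma =0\) interpreted as the limit \(H_0(x)=1-e^{-x}\).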
References
Aban, I., Meerschaert, M., Panorska, A.: Parameter estimation for the truncated Pareto distribution. J. Am. Stat. Assoc. 101, 270–277 (2006)
Beirlant, J., Alves, I., Gomes, I.: Tail fitting for truncated and non-truncated Pareto-type distributions. Extremes 19, 429–462 (2016)
Beirlant, J., Fraga Alves, I., Reynkens, T.: Fitting tails affected by truncation. Electron. J. Stat. 11, 2026–2065 (2017)
Benchaira, S., Meraghmi, D., Necir, A.: Tail product-limit process for truncated data with application to extreme value index estimation. Extremes 19, 219–251 (2016)
Bhattacharya, S., Kallitsis, M., Stoev, S.: Data-adaptive trimming of the Hill estimator and detection of outliers in the extremes of heavy-tailed data. Electron. J. Stat. 13, 1872–1925 (2019)
de Haan, L., Ferreira, A.: Extreme Value Theory: An Introduction. Springer, New York (2006)
Drees, H.: On smooth statistical tail functionals. Scand. J. Stat. 25, 187–210 (1998)
Drees, H., de Haan, L., Resnick, S.: How to make a Hill plot. Ann. Stat. 28, 254–274 (2000)
Einmahl, J.H., Fils-Villetard, A., Guillou, A.: Statistics of extremes under random censoring. Bernoulli 14, 207–227 (2008)
Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin (1997)
Hill, B.: A simple general approach to inference about the tail of a distribution. Ann. Stat. 3, 1163–1174 (1975)
Newman, M.: Networks: An Introduction. Oxford University Press, Oxford (2010)
Reiss, R.-D.: Asymptotic Distribution of Order Statistics. Springer, New York (1989)
Resnick, S.: Discussion of the Danish data on large fire insurance losses. Astin Bull. 27, 139–151 (1997)
Resnick, S.: Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York (2007)
Smith, A., Katz, R.: US billion-dollar weather and climate disasters: data sources, trends, accuracy and biases. Nat. Hazards 67, 387–410 (2013)
Stupfler, G.: Estimating the conditional extreme-value index under random right-censoring. J. Multivar. Anal. 144, 1–24 (2016)
Worms, J., Worms, R.: New estimators of the extreme value index under random right censoring, for heavy-tailed distributions. Extremes 17, 337–358 (2014)
Zou, J., Davis, R., Samorodnitsky, G.: Extreme value analysis without the largest values: what can be done? Probab. Eng. Inf. Sci. 1–21 (2019). https://doi.org/10.1017/S0269964818000542
Acknowledgements
We would like to thank the two anonymous referees for their useful comments, which prompted changes that were truly necessary.
Richard Davis: This research was supported in part by NSF Grant DMS-2015379. Gennady Samorodnitsky: This research was partially supported by the ARO Grant W911NF-18-10318 at Cornell University.
Appendix
1.1 Second-order regular variation
Second-order regular variation can be thought of as a way to quantify the rate at which the difference between the left hand side and the right hand side of (1.1) vanishes. It assumes that there is \(\rho \le 0\) and a positive or negative function A that is regularly varying with exponent \(\rho\) and \(\lim _{t\rightarrow \infty } A(t) = 0\), such that for \(x > 0\),

$$\lim _{t\rightarrow \infty }\frac{U(tx)/U(t)-x^{1/\alpha }}{A(t)}=x^{1/\alpha }\,\frac{x^{\rho }-1}{\rho }, \qquad (7.1)$$

(with \((x^{\rho }-1)/\rho\) interpreted as \(\log x\) when \(\rho =0\)), where \(U(t) = F^{\leftarrow }(1- 1/t)\) and \(F^{\leftarrow }\) is the generalized inverse of F; see e.g. de Haan and Ferreira (2006).
The results of this paper assume that the sequence \((k_n)\) used to define our estimators satisfies

$$\sqrt{k_n}\,A(n/k_n)\rightarrow \lambda \quad \text{as } n\rightarrow \infty \qquad (7.2)$$

for some \(\lambda \in {{\mathbb {R}}}\). Since \(k_n\rightarrow \infty\), condition (7.2) implies that \(n / k_n \rightarrow \infty\).
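As an illustrative computation (ours, not taken from the paper): if \(A(t)=ct^{\rho }\) exactly, with \(c\ne 0\) and \(\rho <0\), then the choice \(k_n=\lfloor n^{\beta }\rfloor\) with \(\beta =-2\rho /(1-2\rho )\in (0,1)\) gives

$$\sqrt{k_n}\,A(n/k_n)\approx c\,n^{\beta /2}\,n^{\rho (1-\beta )}=c\,n^{\beta /2+\rho (1-\beta )}=c,$$

since the exponent \(\beta /2+\rho (1-\beta )=\beta (1/2-\rho )+\rho\) vanishes for this \(\beta\); any slower-growing choice of \(k_n\) yields \(\lambda =0\).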
Distributions that satisfy the second-order condition include the Student’s \(t_\nu\), stable, and Fréchet distributions; see, e.g. Drees (1998) and Drees et al. (2000). In fact, any distribution with \(\bar{F}(x) = c_1 x^{-\alpha } + c_2 x^{-\alpha + \alpha \rho } (1 + o(1))\) as \(x \rightarrow \infty\), where \(c_1 > 0\), \(c_2 \ne 0\), \(\alpha > 0\) and \(\rho < 0\), satisfies the second-order condition with the indicated values of \(\alpha\) and \(\rho\) (de Haan and Ferreira 2006).
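This decay rate can be checked numerically. The sketch below assumes the particular tail \(\bar{F}(x)=x^{-2}+x^{-3}\) (so \(c_1=c_2=1\), \(\alpha =2\), \(\rho =-1/2\), all chosen by us for illustration) and verifies that the discrepancy \(U(tx)/U(t)-x^{1/\alpha }\) shrinks at the rate \(t^{-1/2}\), i.e. scales by a factor of 4 when t grows by a factor of 16:

```python
import numpy as np

def tail(x):
    # Pareto-type tail with c1 = c2 = 1, alpha = 2, rho = -1/2:
    # F_bar(x) = x**-2 + x**-3
    return x**-2 + x**-3

def U(t, lo=1.0, hi=1e9, iters=200):
    """Quantile function U(t) = F^{<-}(1 - 1/t), by bisection
    (tail is strictly decreasing on (0, infinity))."""
    target = 1.0 / t
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tail(mid) > target:   # mid is still too small
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma, x, t = 0.5, 2.0, 1e6
d1 = U(t * x) / U(t) - x**gamma            # discrepancy at scale t
d2 = U(16 * t * x) / U(16 * t) - x**gamma  # discrepancy at scale 16t
print(d1 / d2)  # ~ 16**0.5 = 4, consistent with A(t) ~ const * t**(-1/2)
```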
1.2 Proofs
In this section we present the proofs of the results in the earlier parts of the paper.
Proof of Lemma 3.1
Since
and
the claim of the lemma follows from (3.4). \(\square\)
Proof of Lemma 3.2
We proceed as in the proof of Lemma 3.1, except now one needs to take second derivatives. For example, elementary calculations give us
Using (3.4) and the fact that \(({\tilde{\gamma }},{\tilde{\delta }}){\mathop {\rightarrow }\limits ^{P}}(\gamma _0,\delta _0)\) we see that
The other terms of the Hessian matrix can be handled in a similar manner. \(\square\)
Proof of Lemma 3.3
Denote
Since we can write
we have
by (3.4), since we know that, by assumption, \(\gamma , \omega _{i,\delta }\) and \(h_{\delta ,i}\) are bounded away from 0 and infinity on \(\Theta\).
Clearly, the point \((\gamma _0,\delta _0)\) is a minimizer of the function \(\gamma ^2L\). Furthermore, it is elementary to check that the Hessian matrix of \(\gamma ^2L\) at that point is equal to \(2\gamma _0^2\Gamma _m\). We will see in the proof of Proposition 3.1 below that the matrix \(\Gamma _m\) is invertible, hence the point \((\gamma _0,\delta _0)\) is the unique minimizer of the function \(\gamma ^2L\), hence also of the function L. The uniform convergence in probability of the function \(L_n/k_n\) to the function L implies that any minimizer of the former function converges in probability to the unique minimizer of the limit function. Hence the statement of the lemma. \(\square\)
Proof of Proposition 3.1
Introduce functions of \(x>0\)
so that
Therefore we can write
We now show that the matrix \(\Gamma _m\) is invertible. A direct computation shows that
It is easy to check that the functions \(l_{\delta _0}\) and \(m_{\delta _0}\) are increasing on \((0,\infty )\), so that for any \(i\ge 1\), \(l_{\delta _0}(\theta _{i})-l_{\delta _0}(\theta _{i-1})>0\) and \(m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})>0\). Further, by the Cauchy-Schwarz inequality,
and the equality holds if and only if
The latter requirement is equivalent to
so invertibility of \(\Gamma _m\) will follow once we show that (7.3) cannot hold. If we put
then \(Q(0)=0\) and
which implies that
for any \(x>0\). Since \(l_{\delta _0}^{\prime }(x)>0\), we conclude that the function \(l_{\delta _0}(x)/m_{\delta _0}(x)\) is strictly decreasing on the positive half line, and so (7.3) cannot hold. Hence the matrix \(\Gamma _m\) is invertible.
It is elementary to check that, as \(\delta _0\rightarrow 0\),
Substituting this into (3.10) shows convergence of the variance in (3.12).
Similarly, it is elementary to check that, as \(\delta _0\rightarrow 0\),
Substituting (7.4) and (7.5) into (3.10) and (3.11) proves convergence of the mean in (3.12). \(\square\)
Proof of Lemma 4.1
By (4.3),
and
Since
the claim of the lemma will follow once we show that
where \(\Gamma _0\) is the second matrix in the right hand side of (4.4). By (4.1) we only need to prove that
Since the covariance matrix of the random vector in the left hand side of (7.6) converges to \(\Gamma _0\), only the Lyapunov condition needs to be checked for an application of the central limit theorem. The latter can be performed component-wise and is elementary when taking, for instance, the 4th powers of the terms. \(\square\)
Proof of Lemma 4.2
Once again, computing the second derivatives, we obtain, for example,
Clearly, the second and the third terms in the right hand side are \(o_p(1)\) as \(n\rightarrow \infty\). Furthermore,
and by computing the mean and the variance we see that
in probability. Therefore,
in probability, and the limit is the appropriate entry in the matrix \(2\Gamma _\infty\). The other terms of the Hessian matrix can be handled in a similar manner. \(\square\)
Proof of Lemma 4.3
We proceed as in the proof of Lemma 3.3. Denote now
where \({{\tilde{\omega }}}_{1,\delta }\) is defined as \(\omega _{1,\delta }\) and \({\tilde{g}}_{\delta ,1}\) is defined as \(g_{\delta ,1}\), both with \(\theta _1=\varepsilon\). Since we can write
it follows that
It is clear that the first three terms in the right hand side vanish as \(n\rightarrow \infty\). The same is true for the last term in the right hand side because we can bound the latter by
It is clear that both suprema are finite, while by computing once again the means and the variances we see that the two differences converge to zero in probability.
Clearly, the point \((\gamma _0,\delta _0)\) is a minimizer of the function \({{\tilde{\omega }}}_{1,\delta }\gamma ^{-2}(\gamma {\tilde{g}}_{\delta ,1}-\gamma _0{\tilde{g}}_{\delta _0,1})^2\). Let us denote the remaining part of the function \(L(\gamma ,\delta )\) by \(L_1(\gamma ,\delta )\). To check that the point \((\gamma _0,\delta _0)\) is the unique minimizer of the latter function, note that for a fixed value of \(\delta\) the unique minimizer of \(L_1(\cdot ,\delta )\) is the point
Since, up to \(\delta\)-independent terms,
which vanishes for \(\delta =\delta _0\) and, by Jensen's inequality, is strictly positive for \(\delta \not =\delta _0\), we see that \(\delta =\delta _0\) and \(\gamma =\gamma (\delta _0)=\gamma _0\) is the unique minimizer of \(L_1\) and, hence, also of L.
As before, the uniform convergence of \(L_n/k_n\) to L now implies that any minimizer of the former function converges in probability to \((\gamma _0,\delta _0)\). Lemma 4.2 and the fact that \(\Gamma _\infty\) is invertible imply that, with probability converging to 1, the minimizer of \(L_n\) is unique. \(\square\)
Xu, H., Davis, R. & Samorodnitsky, G. Handling missing extremes in tail estimation. Extremes 25, 199–227 (2022). https://doi.org/10.1007/s10687-021-00429-z