
Handling missing extremes in tail estimation


Abstract

In some data sets, a portion of the extreme observations may be missing. This can happen when the extreme observations are simply not available or are imprecisely measured. For example, in the study of human lifetimes, a topic of recent interest, birth certificates of centenarians may not even exist, and many such individuals may not be included in the data sets currently available. In essence, one does not have a clear record of the largest lifetimes of human populations. If extreme observations are missing, then risk can be severely underestimated, resulting in rare events occurring more often than originally thought. In concrete terms, this may mean that a 500-year flood is in fact a 100-year (or even a 20-year) flood. In this paper, we present methods for estimating the number of missing extremes together with the tail index associated with the tail heaviness of the data. Ignoring one or the other can severely impact the estimation of risk. Our estimates are based on the HEWE (Hill estimate without extremes) of the tail index, which adjusts for missing extremes. Based on the functional convergence of this process to a limit process, we develop an asymptotic likelihood-based procedure for estimating both the number of missing extremes and the tail index. We derive the asymptotic distribution of the resulting estimates. By artificially removing segments of extremes from the data, this methodology can also be used to assess the reliability of the underlying assumptions imposed on the data.
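To fix ideas, here is a minimal simulation sketch (an editor's addition, not the authors' code) of the phenomenon described above: when the largest d observations of a heavy-tailed sample are missing and the classical Hill estimator is applied naively to what remains, the tail index is underestimated, which is the kind of bias the HEWE-based likelihood procedure of the paper is designed to correct. The sample, the function name, and the choices of k and d below are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only (NOT the paper's HEWE statistic): a naive Hill-type
# estimate of the tail index gamma when the d largest observations are missing.
rng = np.random.default_rng(0)
n, gamma_true = 100_000, 0.5                              # standard Pareto sample, tail index 0.5
x = np.sort(1.0 + rng.pareto(1.0 / gamma_true, n))[::-1]  # order statistics, largest first

def naive_hill_without_top(x_desc, k, d):
    """Hill estimate built from order statistics d+1, ..., d+k (the top d are unavailable)."""
    return np.mean(np.log(x_desc[d:d + k]) - np.log(x_desc[d + k]))

print("full sample    :", naive_hill_without_top(x, k=1000, d=0))   # close to gamma_true
print("top 50 missing :", naive_hill_without_top(x, k=1000, d=50))  # systematically smaller
```

The downward shift of the second estimate is the underestimation of risk referred to above; the paper's procedure instead estimates the number of missing extremes and the tail index jointly rather than ignoring the missing observations.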


Data availability

The Danish fire data and the Natural and Climate Disasters in the U.S. data are publicly available as indicated in the text. The Google+ data are not publicly available.

Notes

  1. The general form of the GPD, which combines the light- and heavy-tailed cases, is similar; see de Haan and Ferreira (2006).

References

  • Aban, I., Meerschaert, M., Panorska, A.: Parameter estimation for the truncated Pareto distribution. J. Am. Stat. Assoc. 101, 270–277 (2006)

  • Beirlant, J., Alves, I., Gomes, I.: Tail fitting for truncated and non-truncated Pareto-type distributions. Extremes 19, 429–462 (2016)

  • Beirlant, J., Fraga Alves, I., Reynkens, T.: Fitting tails affected by truncation. Electron. J. Stat. 11, 2026–2065 (2017)

  • Benchaira, S., Meraghmi, D., Necir, A.: Tail product-limit process for truncated data with application to extreme value index estimation. Extremes 19, 219–251 (2016)

  • Bhattacharya, S., Kallitsis, M., Stoev, S.: Data-adaptive trimming of the Hill estimator and detection of outliers in the extremes of heavy-tailed data. Electron. J. Stat. 13, 1872–1925 (2019)

  • de Haan, L., Ferreira, A.: Extreme Value Theory: An Introduction. Springer, New York (2006)

  • Drees, H.: On smooth statistical tail functionals. Scand. J. Stat. 25, 187–210 (1998)

  • Drees, H., de Haan, L., Resnick, S.: How to make a Hill plot. Ann. Stat. 28, 254–274 (2000)

  • Einmahl, J.H., Fils-Villetard, A., Guillou, A., et al.: Statistics of extremes under random censoring. Bernoulli 14, 207–227 (2008)

  • Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin (1997)

  • Hill, B.: A simple general approach to inference about the tail of a distribution. Ann. Stat. 3, 1163–1174 (1975)

  • Newman, M.: Networks: An Introduction. Oxford University Press, Oxford (2010)

  • Reiss, R.-D.: Asymptotic Distribution of Order Statistics. Springer, New York (1989)

  • Resnick, S.: Discussion of the Danish data on large fire insurance losses. Astin Bull. 27, 139–151 (1997)

  • Resnick, S.: Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York (2007)

  • Smith, A., Katz, R.: US billion-dollar weather and climate disasters: data sources, trends, accuracy and biases. Nat. Hazards 67, 387–410 (2013)

  • Stupfler, G.: Estimating the conditional extreme-value index under random right-censoring. J. Multivar. Anal. 144, 1–24 (2016)

  • Worms, J., Worms, R.: New estimators of the extreme value index under random right censoring, for heavy-tailed distributions. Extremes 17, 337–358 (2014)

  • Zou, J., Davis, R., Samorodnitsky, G.: Extreme value analysis without the largest values: what can be done? Probab. Eng. Inf. Sci. 1–21 (2019). https://doi.org/10.1017/S0269964818000542


Acknowledgements

We would like to thank the two anonymous referees for their useful comments that forced us to make changes that were truly necessary.

Author information

Corresponding author

Correspondence to Richard Davis.

Additional information

Richard Davis: This research was supported in part by NSF Grant DMS-2015379. Gennady Samorodnitsky: This research was partially supported by the ARO Grant W911NF-18-10318 at Cornell University.

Appendix

1.1 Second-order regular variation

Second-order regular variation can be thought of as a way of quantifying the rate at which the difference between the left hand side and the right hand side of (1.1) vanishes. It assumes that there is \(\rho \le 0\) and a function A of constant sign, regularly varying with exponent \(\rho\) and satisfying \(\lim _{t\rightarrow \infty } A(t) = 0\), such that for \(x > 0\),

$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{\log U(tx) - \log U(t) - \gamma \log x}{A(t)} = \left\{ \begin{array}{ll} \frac{x^\rho - 1}{\rho } &{} \rho <0,\\ \log x &{} \rho =0, \end{array}\right. \end{aligned}$$
(7.1)

where \(U(t) = F^{\leftarrow }(1- 1/t)\) and \(F^{\leftarrow }\) is the generalized inverse of F; see e.g. de Haan and Ferreira (2006).

The results of this paper assume that the sequence \((k_n)\) used to define our estimators satisfies

$$\begin{aligned} \lim _{n \rightarrow \infty } \sqrt{k_n}A(n/k_n) = \lambda \end{aligned}$$
(7.2)

for some \(\lambda \in {{\mathbb {R}}}\). Since \(k_n\rightarrow \infty\), condition (7.2) implies that \(n / k_n \rightarrow \infty\).
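For orientation, an editor's aside (under the simplifying assumption that \(A(t)=c\,t^{\rho }\) exactly, with \(\rho <0\) and \(\lambda \ne 0\)): solving (7.2) for \(k_n\) shows that it must grow like \(n^{-2\rho /(1-2\rho )}\). A short numerical check:

```python
import numpy as np

# Editor's illustration, assuming A(t) = c * t**rho exactly with rho < 0.
# Solving sqrt(k) * A(n/k) = lam gives k = (lam/c)**(2/(1-2*rho)) * n**(-2*rho/(1-2*rho)).
c, rho, lam = 1.0, -1.0, 1.0
for n in (10**4, 10**6, 10**8):
    k = (lam / c) ** (2 / (1 - 2 * rho)) * n ** (-2 * rho / (1 - 2 * rho))
    print(f"n = {n:>9d}:  k_n = {k:12.1f},  sqrt(k_n)*A(n/k_n) = {np.sqrt(k) * c * (n / k) ** rho:.6f}")
```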

Distributions that satisfy the second-order condition include the Student’s \(t_\nu\), stable, and Fréchet distributions; see, e.g. Drees (1998) and Drees et al. (2000). In fact, any distribution with \(\bar{F}(x) = c_1 x^{-\alpha } + c_2 x^{-\alpha + \alpha \rho } (1 + o(1))\) as \(x \rightarrow \infty\), where \(c_1 > 0\), \(c_2 \ne 0\), \(\alpha > 0\) and \(\rho < 0\), satisfies the second-order condition with the indicated values of \(\alpha\) and \(\rho\) (de Haan and Ferreira 2006).
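As a concrete instance of this (an editor's addition, not taken from the paper): for \(F(x)=1-(1+x)^{-\alpha }\), \(x>0\), one has \(U(t)=t^{1/\alpha }-1\), and a Taylor expansion suggests that (7.1) holds with \(\gamma =1/\alpha\), \(\rho =-1/\alpha\) and \(A(t)=\gamma t^{\rho }\). The short script below checks this numerically; all names and values are the editor's.

```python
import numpy as np

# Numerical check of the second-order condition (7.1) for F(x) = 1 - (1 + x)^(-alpha),
# with the editor's identifications gamma = 1/alpha, rho = -1/alpha, A(t) = gamma * t**rho.
alpha = 2.0
gamma, rho = 1.0 / alpha, -1.0 / alpha
U = lambda t: t ** (1.0 / alpha) - 1.0
A = lambda t: gamma * t ** rho

for t in (1e2, 1e4, 1e6):
    for x in (0.5, 2.0, 10.0):
        lhs = (np.log(U(t * x)) - np.log(U(t)) - gamma * np.log(x)) / A(t)
        rhs = (x ** rho - 1.0) / rho
        print(f"t = {t:.0e}  x = {x:4.1f}   lhs = {lhs:.4f}   limit = {rhs:.4f}")
```

The two printed columns approach each other as t grows, as (7.1) predicts.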

1.2 Proofs

In this section we present the proofs of the results in the earlier parts of the paper.

Proof of Lemma 3.1

Since

$$\begin{aligned} \partial _1 L_{n}(\gamma _0, \delta _0)= & {} \frac{2m}{\gamma _0}-\frac{2}{\gamma _0}\sum _{i=1}^{m}\omega _{i,\delta _0}Y_{ni}^2- \frac{2\sqrt{k_n}}{\gamma _0}\sum _{i=1}^{m}\omega _{i,\delta _0}h_{\delta _0,i}Y_{ni} \end{aligned}$$

and

$$\begin{aligned} \partial _2 L_{n}(\gamma _0,\delta _0)= & {} -\sum _{i=1}^{m}\frac{\omega _{i,\delta _0}^{\prime }}{\omega _{i,\delta _0}} +\sum _{i=1}^{m}\omega _{i,\delta _0}^{\prime }Y_{ni}^2 -2\sqrt{k_n}\sum _{i=1}^{m}\omega _{i,\delta _0}h_{\delta _0,i}^{\prime }Y_{ni}\,, \end{aligned}$$

the claim of the lemma follows from (3.4). \(\square\)

Proof of Lemma 3.2

We proceed as in the proof of Lemma 3.1, except that now we need to compute second derivatives. For example, elementary calculations give us

$$\begin{aligned} \frac{\partial _1^2 L_n ({\tilde{\gamma }},{\tilde{\delta }})}{k_n}= & {} -\frac{2m}{{\tilde{\gamma }}^2k_n}+\frac{6\gamma _0^2}{{\tilde{\gamma }}^4k_n} \sum _{i=1}^{m}\omega _{i,{\tilde{\delta }}}Y_{ni}^2-\frac{12\gamma _0}{{\tilde{\gamma }}^4\sqrt{k_n}} \sum _{i=1}^{m}\omega _{i,{\tilde{\delta }}}({\tilde{\gamma }} h_{{\tilde{\delta }},i}-\gamma _0 h_{\delta _0,i})Y_{ni}\nonumber \\&+\frac{6}{{\tilde{\gamma }}^4}\sum _{i=1}^{m}\omega _{i,{\tilde{\delta }}} ({\tilde{\gamma }} h_{{\tilde{\delta }},i}-\gamma _0 h_{\delta _0,i})^2+\frac{8\gamma _0}{{\tilde{\gamma }}^3\sqrt{k_n}} \sum _{i=1}^{m}\omega _{i,{\tilde{\delta }}}h_{{\tilde{\delta }},i}Y_{ni}\nonumber \\&-\frac{8}{{\tilde{\gamma }}^3} \sum _{i=1}^{m}\omega _{i,{\tilde{\delta }}}h_{{\tilde{\delta }},i} ({\tilde{\gamma }} h_{{\tilde{\delta }},i}-\gamma _0h_{\delta _0,i})+ \frac{2}{{\tilde{\gamma }}^2}\sum _{i=1}^{m}\omega _{i,{\tilde{\delta }}}h_{{\tilde{\delta }},i}^2. \end{aligned}$$

Using (3.4) and the fact that \(({\tilde{\gamma }},{\tilde{\delta }}){\mathop {\rightarrow }\limits ^{P}}(\gamma _0,\delta _0)\) we see that

$$\begin{aligned} \frac{\partial _1^2 L_n ({\tilde{\gamma }},{\tilde{\delta }})}{k_n}{\mathop {\rightarrow }\limits ^{P}}\frac{2}{\gamma _0^2}\sum _{i=1}^{m}\omega _{i,\delta _0}h_{\delta _0,i}^2=\frac{2b_m}{\gamma _0^2}\,.\end{aligned}$$

The other terms of the Hessian matrix can be handled in a similar manner. \(\square\)

Proof of Lemma 3.3

Denote

$$\begin{aligned} L(\gamma ,\delta )=\gamma ^{-2}\sum _{i=1}^{m}\omega _{i,\delta } (\gamma h_{\delta ,i}-\gamma _0 h_{\delta _0,i})^2, \ (\gamma ,\delta )\in \Theta \,. \end{aligned}$$

Since we can write

$$\begin{aligned} L_n(\gamma , \delta )/k_n= & {} \frac{2m\log \gamma }{k_n}-\frac{1}{k_n}\sum _{i=1}^{m}\log \omega _{i,\delta } +\frac{\gamma _0^2}{\gamma ^2k_n}\sum _{i=1}^{m}\omega _{i,\delta }Y_{ni}^2\\&-\frac{2\gamma _0}{\gamma ^2\sqrt{k_n}}\sum _{i=1}^{m}\omega _{i,\delta } (\gamma h_{\delta ,i}-\gamma _0 h_{\delta _0,i})Y_{ni} +\frac{1}{\gamma ^2}\sum _{i=1}^{m}\omega _{i,\delta }(\gamma h_{\delta ,i}-\gamma _0 h_{\delta _0,i})^2\,, \end{aligned}$$

we have

$$\begin{aligned}&\sup _{(\gamma ,\delta )\in \Theta }\bigg |\frac{L_n(\gamma ,\delta )}{k_n}-L(\gamma ,\delta )\bigg |\\&\quad \le \sup _{(\gamma ,\delta )\in \Theta }\bigg |\frac{2m\log \gamma }{k_n}-\frac{1}{k_n} \sum _{i=1}^{m}\log \omega _{i,\delta }\bigg |+\sup _{(\gamma ,\delta )\in \Theta }\bigg | \frac{\gamma _0^2}{\gamma ^2k_n}\sum _{i=1}^{m}\omega _{i,\delta }Y_{ni}^2\bigg |\\&\qquad +\sup _{(\gamma ,\delta )\in \Theta }\bigg |\frac{2\gamma _0}{\gamma ^2\sqrt{k_n}} \sum _{i=1}^{m}\omega _{i,\delta }(\gamma h_{\delta ,i}-\gamma _0 h_{\delta _0,i})Y_{ni}\bigg | {\mathop {\rightarrow }\limits ^{P}}0, \ \ n\rightarrow \infty \,, \end{aligned}$$

by (3.4), since we know that, by assumption, \(\gamma , \omega _{i,\delta }\) and \(h_{\delta ,i}\) are bounded away from 0 and infinity on \(\Theta\).

Clearly, the point \((\gamma _0,\delta _0)\) is a minimizer of the function \(\gamma ^2L\). Furthermore, it is elementary to check that the Hessian matrix of \(\gamma ^2L\) at that point is equal to \(2\gamma _0^2\Gamma _m\). We will see in the proof of Proposition 3.1 below that the matrix \(\Gamma _m\) is invertible, hence the point \((\gamma _0,\delta _0)\) is the unique minimizer of the function \(\gamma ^2L\), hence also of the function L. The uniform convergence in probability of the function \(L_n/k_n\) to the function L implies that any minimizer of the former function converges in probability to the unique minimizer of the limit function. Hence the statement of the lemma. \(\square\)

Proof of Proposition 3.1

Introduce functions of \(x>0\)

$$\begin{aligned} l_{\delta }(x)=x^2/(x+\delta ), \ \ m_{\delta }(x)=x^2v(x/\delta )/\delta =x-2\delta \log (1+x/\delta )+\delta x/(x+\delta )\,, \end{aligned}$$

so that

$$\begin{aligned} \omega _{i,\delta _0}&= \theta _{i}^2/\big (m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})\big ), \ g_{\delta _0,i}= \big (m_{\delta _0}(\theta _{i})+l_{\delta _0}(\theta _{i})\big )/2\theta _{i}, \\ g_{\delta _0,i}^{\prime }&= \big (m_{\delta _0}(\theta _{i})-l_{\delta _0}(\theta _{i})\big )/2\delta _0\theta _{i}, \ \ i=1,\ldots , m\,. \end{aligned}$$

Therefore we can write

$$\begin{aligned} b_{m}= & {} \frac{m_{\delta _0}(\theta _{m})+2l_{\delta _0}(\theta _{m})}{4} +\frac{1}{4}\sum _{i=1}^{m}\frac{\big (l_{\delta _0}(\theta _{i}) -l_{\delta _0}(\theta _{i-1})\big )^2}{m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})}, \\ c_{m}= & {} \frac{m_{\delta _0}(\theta _{m})-2l_{\delta _0}(\theta _{m})}{4\delta _0^2} +\frac{1}{4\delta _0^2}\sum _{i=1}^{m}\frac{\big (l_{\delta _0}(\theta _{i}) -l_{\delta _0}(\theta _{i-1})\big )^2}{m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})}, \\ d_{m}= & {} \frac{m_{\delta _0}(\theta _{m})}{4\delta _0} -\frac{1}{4\delta _0}\sum _{i=1}^{m}\frac{\big (l_{\delta _0}(\theta _{i})-l_{\delta _0}(\theta _{i-1})\big )^2}{m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})}. \end{aligned}$$

We now show that the matrix \(\Gamma _m\) is invertible. A direct computation shows that

$$\begin{aligned} 4\delta _0^2(b_mc_m- d_{m}^2) =m_{\delta _0}(\theta _{m})\sum _{i=1}^{m}\frac{\big (l_{\delta _0}(\theta _{i})-l_{\delta _0}(\theta _{i-1})\big )^2}{m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})}-l^2_{\delta _0}(\theta _{m})\,. \end{aligned}$$
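This identity is pure algebra in the three quantities involved; the following symbolic check (an editor's addition, with M, L and S standing for \(m_{\delta _0}(\theta _{m})\), \(l_{\delta _0}(\theta _{m})\) and the common sum of squared increments) confirms it:

```python
import sympy as sp

# Symbolic check (editor's addition) of the determinant identity for Gamma_m,
# with M = m_{delta_0}(theta_m), L = l_{delta_0}(theta_m), S = sum of squared increments.
M, L, S, delta0 = sp.symbols('M L S delta0', positive=True)
b_m = (M + 2*L) / 4 + S / 4
c_m = (M - 2*L) / (4*delta0**2) + S / (4*delta0**2)
d_m = M / (4*delta0) - S / (4*delta0)
print(sp.simplify(4*delta0**2 * (b_m*c_m - d_m**2) - (M*S - L**2)))  # prints 0
```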

It is easy to check that the functions \(l_{\delta _0}\) and \(m_{\delta _0}\) are increasing on \((0,\infty )\), so that for any \(i\ge 1\), \(l_{\delta _0}(\theta _{i})-l_{\delta _0}(\theta _{i-1})>0\) and \(m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})>0\). Further, by the Cauchy-Schwarz inequality,

$$\begin{aligned} m_{\delta _0}(\theta _{m})\sum _{i=1}^{m}\frac{\big (l_{\delta _0}(\theta _{i})-l_{\delta _0}(\theta _{i-1})\big )^2}{m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})}= & {} \sum _{i=1}^{m} \big (m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})\big ) \sum _{i=1}^{m}\frac{\big (l_{\delta _0}(\theta _{i})-l_{\delta _0}(\theta _{i-1})\big )^2}{m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})}\\\ge & {} \bigg (\sum _{i=1}^{m} \big (l_{\delta _0}(\theta _{i})-l_{\delta _0}(\theta _{i-1})\big )\bigg )^2=l^2_{\delta _0}(\theta _{m}), \end{aligned}$$

and the equality holds if and only if

$$\begin{aligned} \frac{l_{\delta _0}(\theta _{1})}{m_{\delta _0}(\theta _{1})} =\frac{l_{\delta _0}(\theta _{2})-l_{\delta _0}(\theta _{1})}{m_{\delta _0}(\theta _{2})-m_{\delta _0}(\theta _{1})} =\cdots =\frac{l_{\delta _0}(\theta _{m})-l_{\delta _0}(\theta _{m-1})}{m_{\delta _0}(\theta _{m})-m_{\delta _0}(\theta _{m-1})}\,. \end{aligned}$$

Summing the increments shows that the latter requirement is equivalent to

$$\begin{aligned} \frac{l_{\delta _0}(\theta _{1})}{m_{\delta _0}(\theta _{1})} =\frac{l_{\delta _0}(\theta _{2})}{m_{\delta _0}(\theta _{2})} =\cdots =\frac{l_{\delta _0}(\theta _{m})}{m_{\delta _0}(\theta _{m})}\,, \end{aligned}$$
(7.3)

so invertibility of \(\Gamma _m\) will follow once we show that (7.3) cannot hold. If we put

$$\begin{aligned} Q(x):=m_{\delta _0}(x)-\frac{l_{\delta _0}(x)m_{\delta _0}^{\prime }(x)}{l_{\delta _0}^{\prime }(x)} =m_{\delta _0}(x)-\frac{x^3}{(x+\delta _0)(x+2\delta _0)}\,, \end{aligned}$$

then \(Q(0)=0\) and

$$\begin{aligned} Q^{\prime }(x)=-\frac{2\delta _0x^2}{(x+\delta _0)(x+2\delta _0)^2}<0, \ \ x>0, \end{aligned}$$

which implies that \(Q(x)<0\) for all \(x>0\). Since

$$\begin{aligned} \left( \frac{l_{\delta _0}(x)}{m_{\delta _0}(x)}\right) ^{\prime } = \frac{l_{\delta _0}^{\prime }(x)}{m_{\delta _0}^2(x)}\, Q(x) \end{aligned}$$

and \(l_{\delta _0}^{\prime }(x)>0\), we conclude that the function \(l_{\delta _0}(x)/m_{\delta _0}(x)\) is strictly decreasing on the positive half line, and so (7.3) cannot hold. Hence the matrix \(\Gamma _m\) is invertible.
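The calculus facts used in this step (that \(l_{\delta _0}\) and \(m_{\delta _0}\) are increasing, the closed form and derivative of \(Q\), and the relation \((l_{\delta _0}/m_{\delta _0})^{\prime }=l_{\delta _0}^{\prime }Q/m_{\delta _0}^2\)) can be verified symbolically; here is an editor's sketch, with delta standing for \(\delta _0\):

```python
import sympy as sp

# Symbolic verification (editor's addition) of the calculus facts used above.
x, delta = sp.symbols('x delta', positive=True)
l = x**2 / (x + delta)
m = x - 2*delta*sp.log(1 + x/delta) + delta*x/(x + delta)
Q = m - l*sp.diff(m, x)/sp.diff(l, x)

# l and m are increasing: their derivatives are positive for x > 0
print(sp.simplify(sp.diff(m, x) - x**2/(x + delta)**2))             # 0: m'(x) = x^2/(x+delta)^2
print(sp.simplify(sp.diff(l, x) - x*(x + 2*delta)/(x + delta)**2))  # 0: l'(x) > 0

# Q has the closed form and derivative claimed in the proof
print(sp.simplify(Q - (m - x**3/((x + delta)*(x + 2*delta)))))                   # 0
print(sp.simplify(sp.diff(Q, x) + 2*delta*x**2/((x + delta)*(x + 2*delta)**2)))  # 0

# (l/m)' = (l'/m^2) * Q, so l/m is strictly decreasing since Q < 0 for x > 0
print(sp.simplify(sp.diff(l/m, x) - sp.diff(l, x)*Q/m**2))                       # 0
```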

It is elementary to check that, as \(\delta _0\rightarrow 0\),

$$\begin{aligned} b_m\rightarrow \theta _m, \ \ c_m\sim \theta _1^{-1}(\log \delta _0)^2, \ \ d_m\rightarrow \log \delta _0\,. \end{aligned}$$
(7.4)

Substituting this into (3.10) shows convergence of the variance in (3.12).

Similarly, it is elementary to check that, as \(\delta _0\rightarrow 0\),

$$\begin{aligned} a_1\rightarrow \frac{2\lambda }{1-\rho }\gamma _0^{-2} \theta _m^{1-\rho }, \ \ a_2\sim \frac{2\lambda }{1-\rho }\gamma _0^{-1}\theta _1^{-\rho } \log \delta _0\,. \end{aligned}$$
(7.5)

Substituting (7.4) and (7.5) into (3.10) and (3.11) proves convergence of the mean in (3.12). \(\square\)

Proof of Lemma 4.1

By (4.3),

$$\begin{aligned} \partial _1L_n(\gamma _0,\delta _0)=\frac{2k_n}{\gamma _0} -\frac{2\omega _{1,\delta _0}\eta _n^2}{\gamma _0}-\frac{2\sqrt{k_n}}{\gamma _0}\omega _{1,\delta _0}g_{\delta _0,1}\eta _n -\frac{2}{\gamma _0}\sum _{i=2}^{k_n}Z_{i,n} \end{aligned}$$

and

$$\begin{aligned} \partial _2L_n(\gamma _0,\delta _0)= & {} -\frac{\omega _{1,\delta _0}^{\prime }}{\omega _{1,\delta _0}} -\sum _{i=2}^{k_n}\frac{2}{\delta _0+\theta _{i,n}}+\omega _{1,\delta _0}^{\prime }\eta _n^2\\&-2\sqrt{k_n}\omega _{1,\delta _0}g_{\delta _0,1}^{\prime }\eta _n+2\sum _{i=2}^{k_n}\frac{Z_{i,n}}{\delta _0+\theta _{i,n}}. \end{aligned}$$

Since

$$\begin{aligned} \left( \begin{matrix} -\frac{2}{\gamma _0}\omega _{1,\delta _0}g_{\delta _0,1}\eta _n \\ -2\omega _{1,\delta _0}g_{\delta _0,1}^{\prime }\eta _n \end{matrix}\right) \Rightarrow N\big (0,4\Gamma _1\big )\,, \end{aligned}$$

the claim of the lemma will follow once we show that

$$\begin{aligned} \left( \begin{matrix} -k_n^{-1/2}\gamma _0^{-1}\sum _{i=2}^{k_n} \bigl ( Z_{i,n}-1\bigr ) \\ k_n^{-1/2}\sum _{i=2}^{k_n}\frac{Z_{i,n}-1}{\delta _0+\theta _{i,n}} \end{matrix}\right) \Rightarrow N\big (0,\Gamma _0\big )\,, \end{aligned}$$

where \(\Gamma _0\) is the second matrix in the right hand side of (4.4). By (4.1) we only need to prove that

$$\begin{aligned} \left( \begin{matrix} -k_n^{-1/2}\gamma _0^{-1}\sum _{i=2}^{k_n} \frac{k_n(\delta _0+\theta _{i,n})}{[\delta _0k_n]+[\theta _{i,n}k_n]}\bigl ( E^*_{[\delta _0k_n]+[\theta _{i,n} k_n]}-1\bigr ) \\ k_n^{-1/2}\sum _{i=2}^{k_n} \frac{k_n}{[\delta _0k_n]+[\theta _{i,n}k_n]}\bigl ( E^*_{[\delta _0k_n]+[\theta _{i,n} k_n]}-1\bigr ) \end{matrix}\right) \Rightarrow N\big (0,\Gamma _0\big )\,. \end{aligned}$$
(7.6)

Since the covariance matrix of the random vector in the left hand side of (7.6) converges to \(\Gamma _0\), only the Lyapunov condition needs to be checked for an application of the central limit theorem. This check can be performed component-wise and is elementary when taking, for instance, the fourth powers of the terms. \(\square\)

Proof of Lemma 4.2

Once again, computing the second derivatives, we obtain, for example,

$$\begin{aligned} \frac{\partial _1^2 L_n ({\tilde{\gamma }},{\tilde{\delta }})}{k_n}= & {} -\frac{2}{{{\tilde{\gamma }}}^2} + \frac{6 \omega _{1,{\tilde{\delta }}}}{{\tilde{\gamma }}^4k_n}\Big (\gamma _0\eta _n- \sqrt{k_n}({\tilde{\gamma }} g_{{\tilde{\delta }},1}-\gamma _0 g_{\delta _0,1})\Big )^2 \\&+\frac{8 \omega _{1,{\tilde{\delta }}}g_{{\tilde{\delta }},1}}{{\tilde{\gamma }}^3\sqrt{k_n}}\Big (\gamma _0\eta _n- \sqrt{k_n}({\tilde{\gamma }} g_{{\tilde{\delta }},1}-\gamma _0 g_{\delta _0,1})\Big )+ \frac{2}{{\tilde{\gamma }}^2}\omega _{1,{\tilde{\delta }}}g_{{\tilde{\delta }},1}^2 \\&+\frac{4\gamma _0}{{\tilde{\gamma }}^3k_n}\sum _{i=2}^{k_n}\frac{({\tilde{\delta }} +\theta _{i,n})Z_{i,n}}{\delta _0+\theta _{i,n}}\,. \end{aligned}$$

Clearly, the second and the third terms in the right hand side are \(o_p(1)\) as \(n\rightarrow \infty\). Furthermore,

$$\begin{aligned} -\frac{2}{{{\tilde{\gamma }}}^2} \rightarrow -\frac{2}{\gamma _0^2}, \ \ \ \frac{2}{{\tilde{\gamma }}^2}\omega _{1,{\tilde{\delta }}}g_{{\tilde{\delta }},1}^2 \rightarrow \frac{2}{{\gamma _0}^2}\omega _{1,{\delta _0}}g_{{\delta _0},1}^2 \ \ \text {in probability,} \end{aligned}$$

and by computing the mean and the variance we see that

$$\begin{aligned} \frac{4\gamma _0}{{\tilde{\gamma }}^3k_n}\sum _{i=2}^{k_n}\frac{({\tilde{\delta }} +\theta _{i,n})Z_{i,n}}{\delta _0+\theta _{i,n}} \rightarrow \frac{4}{\gamma _0^2} \end{aligned}$$

in probability. Therefore,

$$\begin{aligned} \frac{\partial _1^2 L_n ({\tilde{\gamma }},{\tilde{\delta }})}{k_n} \rightarrow \frac{2}{\gamma _0^2} + \frac{2}{{\gamma _0}^2}\omega _{1,{\delta _0}}g_{{\delta _0},1}^2 \end{aligned}$$

in probability, and the limit is the appropriate entry in the matrix \(2\Gamma _\infty\). The other terms of the Hessian matrix can be handled in a similar manner. \(\square\)

Proof of Lemma 4.3

We proceed as in the proof of Lemma 3.3. Denote now

$$\begin{aligned} L(\gamma ,\delta )= & {} 2\log \gamma +\frac{{{\tilde{\omega }}}_{1,\delta }}{\gamma ^2} (\gamma {\tilde{g}}_{\delta ,1}-\gamma _0{\tilde{g}}_{\delta _0,1})^2\\&-2\int _{\varepsilon }^{1+\varepsilon }\log (\delta +x)\, dx+ \frac{2\gamma _0}{\gamma }\int _{\varepsilon }^{1+\varepsilon } \frac{\delta +x}{\delta _0+x}\, dx\,, \end{aligned}$$

where \({{\tilde{\omega }}}_{1,\delta }\) is defined as \(\omega _{1,\delta }\) and \({\tilde{g}}_{\delta ,1}\) is defined as \(g_{\delta ,1}\), both with \(\theta _1=\varepsilon\). Since we can write

$$\begin{aligned} L_n(\gamma ,\delta )/k_n= & {} 2\log \gamma -\frac{1}{k_n}\log \omega _{1,\delta }-\frac{2}{k_n} \sum _{i=2}^{k_n}\log (\delta +\theta _{i,n}) +\frac{\gamma _0^2\omega _{1,\delta }}{\gamma ^2k_n}\eta _n^2\\&- \frac{2\gamma _0\omega _{1,\delta }}{\gamma ^2\sqrt{k_n}} (\gamma g_{\delta ,1}-\gamma _0g_{\delta _0,1})\eta _{n} +\frac{\omega _{1,\delta }}{\gamma ^2}(\gamma g_{\delta ,1} -\gamma _0g_{\delta _0,1})^2\\&+\frac{2\gamma _0}{\gamma k_n}\sum _{i=2}^{k_n} \frac{(\delta +\theta _{i,n})Z_{i,n}}{\delta _0+\theta _{i,n}}, \end{aligned}$$

it follows that

$$\begin{aligned}&\sup _{(\gamma ,\delta )\in \Theta } \bigg |\frac{L_n(\gamma ,\delta )}{k_n}-L(\gamma ,\delta )\bigg |\\&\quad \le \sup _{(\gamma ,\delta )\in \Theta }\left| \frac{\omega _{1,\delta }}{\gamma ^2}(\gamma g_{\delta ,1} -\gamma _0g_{\delta _0,1})^2 -\frac{{{\tilde{\omega }}}_{1,\delta }}{\gamma ^2} (\gamma {\tilde{g}}_{\delta ,1}-\gamma _0{\tilde{g}}_{\delta _0,1})^2\right| \\&\quad \quad +\sup _{(\gamma ,\delta )\in \Theta }\left| -\frac{1}{k_n} \log \omega _{1,\delta } +\frac{\gamma _0^2\omega _{1,\delta }}{\gamma ^2k_n}\eta _n^2- \frac{2\gamma _0\omega _{1,\delta }}{\gamma ^2\sqrt{k_n}} (\gamma g_{\delta ,1}-\gamma _0g_{\delta _0,1})\eta _n\right| \\&\quad \quad + \sup _{(\gamma ,\delta )\in \Theta }\left| \frac{2}{k_n}\sum _{i=2}^{k_n} \log (\delta +\theta _{i,n})-2\int _{\varepsilon }^{1+\varepsilon }\log (\delta +x)\, dx\right| \\&\quad \quad +\sup _{(\gamma ,\delta )\in \Theta }\left| \frac{2\gamma _0}{\gamma k_n}\sum _{i=2}^{k_n} \frac{(\delta +\theta _{i,n})Z_{i,n}}{\delta _0+\theta _{i,n}}- \frac{2\gamma _0}{\gamma }\int _{\varepsilon }^{1+\varepsilon } \frac{\delta +x}{\delta _0+x}dx\right| \,. \end{aligned}$$

It is clear that the first three terms in the right hand side vanish as \(n\rightarrow \infty\). The same is true for the last term in the right hand side because we can bound the latter by

$$\begin{aligned}&\sup _{(\gamma ,\delta )\in \Theta }\left| \frac{2\gamma _0\delta }{\gamma } \right| \cdot \left| \frac{1}{k_n}\sum _{i=2}^{k_n}\frac{Z_{i,n}}{\delta _0+\theta _{i,n}} -\int _{\varepsilon }^{1+\varepsilon }\frac{1}{\delta _0+x}dx\right| \\&\quad + \sup _{(\gamma ,\delta )\in \Theta }\left| \frac{2\gamma _0}{\gamma } \right| \cdot \left| \frac{1}{k_n}\sum _{i=2}^{k_n}\frac{\theta _{i,n}Z_{i,n}}{\delta _0+\theta _{i,n}} -\int _{\varepsilon }^{1+\varepsilon }\frac{x}{\delta _0+x}dx\right| \,. \end{aligned}$$

It is clear that both suprema are finite, while by computing once again the means and the variances we see that the two differences converge to zero in probability.

Clearly, the point \((\gamma _0,\delta _0)\) is a minimizer of the function \({{\tilde{\omega }}}_{1,\delta }\gamma ^{-2}(\gamma {\tilde{g}}_{\delta ,1}-\gamma _0{\tilde{g}}_{\delta _0,1})^2\). Let us denote the remaining part of the function \(L(\gamma ,\delta )\) by \(L_1(\gamma ,\delta )\). To check that the point \((\gamma _0,\delta _0)\) is the unique minimizer of the latter function, note that for a fixed value of \(\delta\) the unique minimizer of \(L_1(\cdot ,\delta )\) is the point

$$\begin{aligned} \gamma (\delta ) = \gamma _0\int _{\varepsilon }^{1+\varepsilon } \frac{\delta +x}{\delta _0+x}\, dx. \end{aligned}$$

Since, up to \(\delta\)-independent terms,

$$\begin{aligned} L_1(\gamma (\delta ),\delta )= \log \left( \int _{\varepsilon }^{1+\varepsilon }\frac{\delta +x}{\delta _0+x}\, dx\right) - \int _{\varepsilon }^{1+\varepsilon } \log \left( \frac{\delta +x}{\delta _0+x}\right) dx\,, \end{aligned}$$

which vanishes for \(\delta =\delta _0\) and is strictly positive by Jensen’s inequality for \(\delta \not =\delta _0\), we see that \(\delta =\delta _0\) and \(\gamma =\gamma (\delta _0)=\gamma _0\) is the unique minimizer of \(L_1\) and, hence, also of L.
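A small numerical illustration of this Jensen-gap argument (an editor's addition; the values of \(\varepsilon\) and \(\delta _0\) below are arbitrary choices, not taken from the paper): the gap vanishes at \(\delta =\delta _0\) and is strictly positive elsewhere.

```python
import numpy as np
from scipy.integrate import quad

# Editor's illustration of the Jensen gap L_1(gamma(delta), delta), up to
# delta-independent terms, with assumed values eps = 0.1 and delta0 = 0.5.
eps, delta0 = 0.1, 0.5

def jensen_gap(delta):
    ratio = lambda x: (delta + x) / (delta0 + x)
    return np.log(quad(ratio, eps, 1 + eps)[0]) - quad(lambda x: np.log(ratio(x)), eps, 1 + eps)[0]

for d in (0.1, 0.3, 0.5, 1.0, 2.0):
    print(f"delta = {d:3.1f}:  gap = {jensen_gap(d):.6f}")  # zero only at delta = delta0
```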

As before, the uniform convergence of \(L_n/k_n\) to L implies that any minimizer of the former function converges in probability to \((\gamma _0,\delta _0)\). Lemma 4.2 and the fact that \(\Gamma _\infty\) is invertible imply that, with probability converging to 1, the minimizer of \(L_n\) is unique. \(\square\)


Cite this article

Xu, H., Davis, R. & Samorodnitsky, G. Handling missing extremes in tail estimation. Extremes 25, 199–227 (2022). https://doi.org/10.1007/s10687-021-00429-z
