Abstract
In some data sets, a portion of the extreme observations may be missing, either because they are simply unavailable or because they are imprecisely measured. For example, in the study of human lifetimes, a topic of recent interest, birth certificates of centenarians may not exist, and many such individuals may be absent from the data sets currently available. In essence, one does not have a clear record of the largest lifetimes of human populations. If extreme observations are missing, then risk can be severely underestimated, so that rare events occur more often than originally thought. In concrete terms, this may mean a 500-year flood is in fact a 100-year (or even a 20-year) flood. In this paper, we present methods for jointly estimating the number of missing extremes and the tail index associated with the tail heaviness of the data; ignoring either one can severely impact the estimation of risk. Our estimates are based on the HEWE (Hill estimate without extremes) of the tail index, which adjusts for missing extremes. Based on a functional convergence of this process to a limit process, we consider an asymptotic likelihood-based procedure for estimating both the number of missing extremes and the tail index, and we derive the asymptotic distribution of the resulting estimates. By artificially removing segments of extremes in the data, this methodology can also be used to assess the reliability of the underlying assumptions imposed on the data.
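The effect that motivates the paper is easy to reproduce with the classical Hill estimator of Hill (1975). The sketch below is illustrative only (the function name and the trimming scheme are ours, and the HEWE itself, which corrects for the missing extremes, is not implemented here): removing the largest observations from a heavy-tailed sample inflates the estimated tail index, i.e. makes the data look lighter-tailed than they are, so tail risk is underestimated.

```python
import numpy as np

def hill_tail_index(x, k):
    """Classical Hill estimate of the tail index alpha, based on the k
    largest order statistics of a positive, heavy-tailed sample x."""
    xs = np.sort(np.asarray(x))
    gamma = np.mean(np.log(xs[-k:]) - np.log(xs[-k - 1]))  # estimates 1/alpha
    return 1.0 / gamma

rng = np.random.default_rng(0)
alpha = 2.0
full = rng.pareto(alpha, size=100_000) + 1.0   # exact Pareto(2) sample

# Full sample: the Hill estimate recovers alpha.
print(hill_tail_index(full, k=1000))           # ~ 2.0

# Drop the 100 largest observations: the sample looks lighter-tailed
# and the tail index is overestimated, so tail risk is underestimated.
trimmed = np.sort(full)[:-100]
print(hill_tail_index(trimmed, k=1000))        # noticeably above 2.0
```

Note that `numpy`'s `pareto` draws from the Lomax distribution, so the shift by 1 is needed to obtain a classical Pareto tail \(\bar{F}(x)=x^{-\alpha }\).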
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig7_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig8_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10687-021-00429-z/MediaObjects/10687_2021_429_Fig9_HTML.png)
Data availability
The Danish fire data and the Natural and Climate Disasters in the U.S. data are publicly available as indicated in the text. The Google+ data are not publicly available.
Notes
The general form of the GPD, which combines the light- and heavy-tailed cases, is similar; see de Haan and Ferreira (2006).
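For reference (a standard statement from de Haan and Ferreira (2006), not quoted from this paper), the unified form of the generalized Pareto family is

$$H_{\gamma }(x) = 1-(1+\gamma x)^{-1/\gamma }, \qquad 1+\gamma x>0,$$

with \(\gamma >0\) corresponding to the heavy-tailed case, \(\gamma <0\) to distributions with a finite right endpoint, and \(\gamma =0\) interpreted as the limit \(H_0(x)=1-e^{-x}\).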
References
Aban, I., Meerschaert, M., Panorska, A.: Parameter estimation for the truncated Pareto distribution. J. Am. Stat. Assoc. 101, 270–277 (2006)
Beirlant, J., Alves, I., Gomes, I.: Tail fitting for truncated and non-truncated Pareto-type distributions. Extremes 19, 429–462 (2016)
Beirlant, J., Fraga Alves, I., Reynkens, T.: Fitting tails affected by truncation. Electron. J. Stat. 11, 2026–2065 (2017)
Benchaira, S., Meraghmi, D., Necir, A.: Tail product-limit process for truncated data with application to extreme value index estimation. Extremes 19, 219–251 (2016)
Bhattacharya, S., Kallitsis, M., Stoev, S.: Data-adaptive trimming of the Hill estimator and detection of outliers in the extremes of heavy-tailed data. Electron. J. Stat. 13, 1872–1925 (2019)
de Haan, L., Ferreira, A.: Extreme Value Theory: An Introduction. Springer, New York (2006)
Drees, H.: On smooth statistical tail functionals. Scand. J. Stat. 25, 187–210 (1998)
Drees, H., de Haan, L., Resnick, S.: How to make a Hill plot. Ann. Stat. 28, 254–274 (2000)
Einmahl, J.H., Fils-Villetard, A., Guillou, A.: Statistics of extremes under random censoring. Bernoulli 14, 207–227 (2008)
Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin (1997)
Hill, B.: A simple general approach to inference about the tail of a distribution. Ann. Stat. 3, 1163–1174 (1975)
Newman, M.: Networks: An Introduction. Oxford University Press, Oxford (2010)
Reiss, R.-D.: Asymptotic Distribution of Order Statistics. Springer, New York (1989)
Resnick, S.: Discussion of the Danish data on large fire insurance losses. Astin Bull. 27, 139–151 (1997)
Resnick, S.: Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York (2007)
Smith, A., Katz, R.: US billion-dollar weather and climate disasters: data sources, trends, accuracy and biases. Nat. Hazards 67, 387–410 (2013)
Stupfler, G.: Estimating the conditional extreme-value index under random right-censoring. J. Multivar. Anal. 144, 1–24 (2016)
Worms, J., Worms, R.: New estimators of the extreme value index under random right censoring, for heavy-tailed distributions. Extremes 17, 337–358 (2014)
Zou, J., Davis, R., Samorodnitsky, G.: Extreme value analysis without the largest values: what can be done? Probab. Eng. Inf. Sci. 1–21 (2019). https://doi.org/10.1017/S0269964818000542
Acknowledgements
We would like to thank the two anonymous referees for their useful comments, which prompted changes that were truly necessary.
Richard Davis: This research was supported in part by NSF Grant DMS-2015379. Gennady Samorodnitsky: This research was partially supported by the ARO Grant W911NF-18-10318 at Cornell University.
Appendix
1.1 Second-order regular variation
Second-order regular variation can be thought of as a way to quantify the rate at which the difference between the left hand side and the right hand side of (1.1) vanishes. It assumes that there is \(\rho \le 0\) and a positive or negative function A that is regularly varying with exponent \(\rho\) and \(\lim _{t\rightarrow \infty } A(t) = 0\), such that for \(x > 0\),

$$\lim _{t\rightarrow \infty }\frac{U(tx)/U(t)-x^{1/\alpha }}{A(t)}=x^{1/\alpha }\,\frac{x^{\rho }-1}{\rho }, \qquad (7.1)$$

(with \((x^{\rho }-1)/\rho\) interpreted as \(\log x\) when \(\rho =0\)), where \(U(t) = F^{\leftarrow }(1- 1/t)\) and \(F^{\leftarrow }\) is the generalized inverse of F; see e.g. de Haan and Ferreira (2006).
The results of this paper assume that the sequence \((k_n)\) used to define our estimators satisfies

$$\sqrt{k_n}\,A(n/k_n)\rightarrow \lambda \quad \text{as } n\rightarrow \infty \qquad (7.2)$$

for some \(\lambda \in {{\mathbb {R}}}\). Since \(k_n\rightarrow \infty\), condition (7.2) implies that \(n / k_n \rightarrow \infty\).
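As an illustrative computation (ours, not taken from the paper): if \(A(t)=ct^{\rho }\) exactly, with \(c\ne 0\) and \(\rho <0\), then the choice \(k_n=\lfloor n^{\beta }\rfloor\) with \(\beta =-2\rho /(1-2\rho )\in (0,1)\) gives

$$\sqrt{k_n}\,A(n/k_n)\approx c\,n^{\beta /2}\,n^{\rho (1-\beta )}=c\,n^{\beta /2+\rho (1-\beta )}=c,$$

since the exponent \(\beta /2+\rho (1-\beta )=\beta (1/2-\rho )+\rho\) vanishes for this \(\beta\); any slower-growing choice of \(k_n\) yields \(\lambda =0\).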
Distributions that satisfy the second-order condition include the Student’s \(t_\nu\), stable, and Fréchet distributions; see, e.g. Drees (1998) and Drees et al. (2000). In fact, any distribution with \(\bar{F}(x) = c_1 x^{-\alpha } + c_2 x^{-\alpha + \alpha \rho } (1 + o(1))\) as \(x \rightarrow \infty\), where \(c_1 > 0\), \(c_2 \ne 0\), \(\alpha > 0\) and \(\rho < 0\), satisfies the second-order condition with the indicated values of \(\alpha\) and \(\rho\) (de Haan and Ferreira 2006).
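This decay rate can be checked numerically. The sketch below assumes the particular tail \(\bar{F}(x)=x^{-2}+x^{-3}\) (so \(c_1=c_2=1\), \(\alpha =2\), \(\rho =-1/2\), all chosen by us for illustration) and verifies that the discrepancy \(U(tx)/U(t)-x^{1/\alpha }\) shrinks at the rate \(t^{-1/2}\), i.e. scales by a factor of 4 when t grows by a factor of 16:

```python
import numpy as np

def tail(x):
    # Pareto-type tail with c1 = c2 = 1, alpha = 2, rho = -1/2:
    # F_bar(x) = x**-2 + x**-3
    return x**-2 + x**-3

def U(t, lo=1.0, hi=1e9, iters=200):
    """Quantile function U(t) = F^{<-}(1 - 1/t), by bisection
    (tail is strictly decreasing on (0, infinity))."""
    target = 1.0 / t
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tail(mid) > target:   # mid is still too small
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma, x, t = 0.5, 2.0, 1e6
d1 = U(t * x) / U(t) - x**gamma            # discrepancy at scale t
d2 = U(16 * t * x) / U(16 * t) - x**gamma  # discrepancy at scale 16t
print(d1 / d2)  # ~ 16**0.5 = 4, consistent with A(t) ~ const * t**(-1/2)
```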
1.2 Proofs
In this section we present the proofs of the results in the earlier parts of the paper.
Proof of Lemma 3.1
Since
and
the claim of the lemma follows from (3.4). \(\square\)
Proof of Lemma 3.2
We proceed as in the proof of Lemma 3.1, except now one needs to take second derivatives. For example, elementary calculations give us
Using (3.4) and the fact that \(({\tilde{\gamma }},{\tilde{\delta }}){\mathop {\rightarrow }\limits ^{P}}(\gamma _0,\delta _0)\) we see that
The other terms of the Hessian matrix can be handled in a similar manner. \(\square\)
Proof of Lemma 3.3
Denote
Since we can write
we have
by (3.4), since we know that, by assumption, \(\gamma , \omega _{i,\delta }\) and \(h_{\delta ,i}\) are bounded away from 0 and infinity on \(\Theta\).
Clearly, the point \((\gamma _0,\delta _0)\) is a minimizer of the function \(\gamma ^2L\). Furthermore, it is elementary to check that the Hessian matrix of \(\gamma ^2L\) at that point is equal to \(2\gamma _0^2\Gamma _m\). We will see in the proof of Proposition 3.1 below that the matrix \(\Gamma _m\) is invertible, hence the point \((\gamma _0,\delta _0)\) is the unique minimizer of the function \(\gamma ^2L\), hence also of the function L. The uniform convergence in probability of the function \(L_n/k_n\) to the function L implies that any minimizer of the former function converges in probability to the unique minimizer of the limit function. Hence the statement of the lemma. \(\square\)
Proof of Proposition 3.1
Introduce functions of \(x>0\)
so that
Therefore we can write
We now show that the matrix \(\Gamma _m\) is invertible. A direct computation shows that
It is easy to check that the functions \(l_{\delta _0}\) and \(m_{\delta _0}\) are increasing on \((0,\infty )\), so that for any \(i\ge 1\), \(l_{\delta _0}(\theta _{i})-l_{\delta _0}(\theta _{i-1})>0\) and \(m_{\delta _0}(\theta _{i})-m_{\delta _0}(\theta _{i-1})>0\). Further, by the Cauchy-Schwarz inequality,
and the equality holds if and only if
The latter requirement is equivalent to
so invertibility of \(\Gamma _m\) will follow once we show that (7.3) cannot hold. If we put
then \(Q(0)=0\) and
which implies that
for any \(x>0\). Since \(l_{\delta _0}^{\prime }(x)>0\), we conclude that the function \(l_{\delta _0}(x)/m_{\delta _0}(x)\) is strictly decreasing on the positive half line, and so (7.3) cannot hold. Hence the matrix \(\Gamma _m\) is invertible.
It is elementary to check that, as \(\delta _0\rightarrow 0\),
Substituting this into (3.10) shows convergence of the variance in (3.12).
Similarly, it is elementary to check that, as \(\delta _0\rightarrow 0\),
Substituting (7.4) and (7.5) into (3.10) and (3.11) proves convergence of the mean in (3.12). \(\square\)
Proof of Lemma 4.1
By (4.3),
and
Since
the claim of the lemma will follow once we show that
where \(\Gamma _0\) is the second matrix in the right hand side of (4.4). By (4.1) we only need to prove that
Since the covariance matrix of the random vector in the left hand side of (7.6) converges to \(\Gamma _0\), only the Lyapunov condition needs to be checked for an application of the central limit theorem. The latter can be performed component-wise and is elementary when taking, for instance, the 4th powers of the terms. \(\square\)
Proof of Lemma 4.2
Once again, computing the second derivatives, we obtain, for example,
Clearly, the second and the third terms in the right hand side are \(o_p(1)\) as \(n\rightarrow \infty\). Furthermore,
and by computing the mean and the variance we see that
in probability. Therefore,
in probability, and the limit is the appropriate entry in the matrix \(2\Gamma _\infty\). The other terms of the Hessian matrix can be handled in a similar manner. \(\square\)
Proof of Lemma 4.3
We proceed as in the proof of Lemma 3.3. Denote now
where \({{\tilde{\omega }}}_{1,\delta }\) is defined as \(\omega _{1,\delta }\) and \({\tilde{g}}_{\delta ,1}\) is defined as \(g_{\delta ,1}\), both with \(\theta _1=\varepsilon\). Since we can write
it follows that
It is clear that the first three terms in the right hand side vanish as \(n\rightarrow \infty\). The same is true for the last term in the right hand side because we can bound the latter by
It is clear that both suprema are finite, while by computing once again the means and the variances we see that the two differences converge to zero in probability.
Clearly, the point \((\gamma _0,\delta _0)\) is a minimizer of the function \({{\tilde{\omega }}}_{1,\delta }\gamma ^{-2}(\gamma {\tilde{g}}_{\delta ,1}-\gamma _0{\tilde{g}}_{\delta _0,1})^2\). Let us denote the remaining part of the function \(L(\gamma ,\delta )\) by \(L_1(\gamma ,\delta )\). To check that the point \((\gamma _0,\delta _0)\) is the unique minimizer of the latter function, note that for a fixed value of \(\delta\) the unique minimizer of \(L_1(\cdot ,\delta )\) is the point
Since, up to \(\delta\)-independent terms,
which vanishes for \(\delta =\delta _0\) and, by Jensen's inequality, is strictly positive for \(\delta \not =\delta _0\), we see that \(\delta =\delta _0\) and \(\gamma =\gamma (\delta _0)=\gamma _0\) is the unique minimizer of \(L_1\) and, hence, also of L.
As before, the uniform convergence of \(L_n/k_n\) to L now implies that any minimizer of the former function converges in probability to \((\gamma _0,\delta _0)\). Lemma 4.2 and the fact that \(\Gamma _\infty\) is invertible imply that, with probability converging to 1, the minimizer of \(L_n\) is unique. \(\square\)
Xu, H., Davis, R. & Samorodnitsky, G. Handling missing extremes in tail estimation. Extremes 25, 199–227 (2022). https://doi.org/10.1007/s10687-021-00429-z