Functional strong law of large numbers for Betti numbers in the tail

  • Published in: Extremes

Abstract

The objective of this paper is to investigate the layered structure of topological complexity in the tail of a probability distribution. We establish the functional strong law of large numbers for Betti numbers, a basic quantifier of algebraic topology, of a geometric complex outside an open ball of radius \(R_n\), such that \(R_n\rightarrow \infty\) as the sample size n increases. The nature of the obtained law of large numbers is determined by the decay rate of a probability density and how rapidly \(R_n\) diverges. In particular, if \(R_n\) diverges sufficiently slowly, the limiting function in the law of large numbers is crucially affected by the emergence of arbitrarily large connected components supporting topological cycles in the limit.

Data availability

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

References

  • Adler, R.J., Bobrowski, O., Weinberger, S.: Crackle: The homology of noise. Discret. Comput. Geom. 52, 680–704 (2014)

  • Bachmann, S., Reitzner, M.: Concentration for Poisson \(U\)-statistics: subgraph counts in random geometric graphs. Stochastic Processes and their Applications 128, 3327–3352 (2018)

  • Balkema, G., Embrechts, P.: High Risk Scenarios and Extremes: A Geometric Approach. European Mathematical Society (2007)

  • Björner, A.: Topological methods. In: Handbook of Combinatorics. Elsevier, Amsterdam (1995)

  • Decreusefond, L., Schulte, M., Thäle, C.: Functional Poisson approximation in Kantorovich-Rubinstein distance with applications to \(U\)-statistics and stochastic geometry. Ann. Probab. 44, 2147–2197 (2016)

  • de Haan, L., Ferreira, A.: Extreme Value Theory: An Introduction. Springer, New York (2006)

  • Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extremal Events: for Insurance and Finance. Springer, New York (1997)

  • Ghrist, R.: Elementary Applied Topology. Createspace (2014)

  • Goel, A., Duy, K.T., Tsunoda, K.: Strong law of large numbers for Betti numbers in the thermodynamic regime. J. Stat. Phys. 174 (2019)

  • Hiraoka, Y., Shirai, T., Trinh, K.D.: Limit theorems for persistence diagrams. Ann. Appl. Probab. 28, 2740–2780 (2018)

  • Kahle, M., Meckes, E.: Limit theorems for Betti numbers of random simplicial complexes. Homology, Homotopy and Applications 15, 343–374 (2013)

  • Niyogi, P., Smale, S., Weinberger, S.: A topological view of unsupervised learning from noisy data. SIAM J. Comput. 40, 646–663 (2011)

  • Owada, T.: Functional central limit theorem for subgraph counting processes. Electron. J. Probab. 22 (2017)

  • Owada, T.: Limit theorems for Betti numbers of extreme sample clouds with application to persistence barcodes. Ann. Appl. Probab. 28, 2814–2854 (2018)

  • Owada, T.: Topological crackle of heavy-tailed moving average processes. Stochastic Processes and their Applications 129, 4965–4997 (2019)

  • Owada, T., Adler, R.J.: Limit theorems for point processes under geometric constraints (and topological crackle). Ann. Probab. 45, 2004–2055 (2017)

  • Owada, T., Bobrowski, O.: Convergence of persistence diagrams for topological crackle. Bernoulli 26, 2275–2310 (2020)

  • Owada, T., Wei, Z.: Functional strong law of large numbers for Betti numbers in the tail. arXiv:2103.05799 (2021)

  • Penrose, M.: Random Geometric Graphs, Oxford Studies in Probability 5. Oxford University Press, Oxford (2003)

  • Resnick, S.: Extreme Values, Regular Variation and Point Processes. Springer-Verlag, New York (1987)

  • Resnick, S.: Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York (2007)

  • Schulte, M., Thäle, C.: The scaling limit of Poisson-driven order statistics with applications in geometric probability. Stochastic Processes and their Applications 122, 4096–4120 (2012)

  • Thomas, A.M., Owada, T.: Functional limit theorems for the Euler characteristic process in the critical regime. Adv. Appl. Probab. 53, 57–80 (2021a)

  • Thomas, A.M., Owada, T.: Functional strong law of large numbers for Euler characteristic processes of extreme sample clouds. Extremes 24, 699–724 (2021b)

  • Yogeshwaran, D., Subag, E., Adler, R.J.: Random geometric complexes in the thermodynamic regime. Probab. Theory Relat. Fields 167, 107–142 (2017)

Acknowledgements

The authors are very grateful for useful and detailed comments received from anonymous referees and an anonymous Associate Editor. These comments helped the authors to introduce a number of improvements to the paper.

Funding

This research was supported by the NSF grant DMS-1811428 and the AFOSR grant FA9550-22-0238.

Author information

Corresponding author

Correspondence to Takashi Owada.

Appendix

In the Appendix, we provide a series of lemmas and propositions that will be used for the proof of Theorem 3.1. As in the last section, \(C^*\) denotes a generic and positive constant independent of n.

1.1 Extension of pointwise SLLN to its functional version

The first result below allows us to develop a functional SLLN from its pointwise version.

Lemma 5.1

  1. (i)

    Let \(\big ( X_n(t), \, n \ge 1 \big )\) be a sequence of random elements of D[0, 1] with non-decreasing sample paths. Suppose \(a:[0,1]\rightarrow \mathbb {R}\) is a continuous and non-decreasing function. If

    $$\begin{aligned} X_n(t) \rightarrow a(t), \ \ n\rightarrow \infty , \ \ \text {a.s.}, \end{aligned}$$

    for every \(t\in [0,1]\), then

    $$\begin{aligned} X_n(t) \rightarrow a(t), \ \ n\rightarrow \infty , \ \ \text {a.s.~in } D[0,1], \end{aligned}$$

    where D[0, 1] is endowed with the uniform topology.

  2. (ii)

    Let \(\big (X_n(t,s), \, n\in \mathbb {N}\big )\) be a sequence of random elements, such that for each \(n \ge 1\), \(X_n(t,s)\) has right continuous sample paths with left limits in each of the coordinates. Assume further that for every \(n \ge 1\), \(X_n(t,s)\) is non-decreasing in t and non-increasing in s. Suppose \(a(t,s)\) is a real-valued, continuous function on \([0,1]^2\), which has the same monotonicity as \(X_n(t,s)\) in each of the coordinates. If we have that

    $$\begin{aligned} X_n(t,s) \rightarrow a(t,s), \quad n \rightarrow \infty , \quad \text {a.s.} \end{aligned}$$
    (59)

    for every \(t,s \in [0,1]\), then, as \(n\rightarrow \infty\),

    $$\begin{aligned} X_n(t,t)\rightarrow a(t,t) \quad \text {a.s.} \ \text {in }\ D[0,1], \end{aligned}$$

    where D[0, 1] is equipped with the uniform topology.

Proof

Part (i) is proven in Proposition 4.2 of Thomas and Owada (2021a). For Part (ii), note that \(a(t,s)\) is uniformly continuous on \([0,1]^2\), being continuous on a compact set. Given \(\epsilon >0\), we can choose \(N=N(\epsilon )\in \mathbb {N}\) such that for all \((t_1, s_1)\), \((t_2, s_2) \in [0,1]^2\),

$$\begin{aligned} |t_1-t_2|+ |s_1-s_2|\le \frac{2}{N} \ \ \text { implies } \ \big |a(t_1,s_1) - a(t_2, s_2) \big |< \epsilon . \end{aligned}$$
(60)

Then, we see that

$$\begin{aligned}&\sup _{0 \le t \le 1}\big |X_n(t,t) - a(t,t) \big |\\&\le \sup _{0 \le t, s \le 1} \big |X_n(t,s) -a(t,s) \big |\\&=\max _{1 \le i \le N}\max _{1\le j \le N} \sup _{t \in [(i-1)/N, i/N]} \sup _{s \in [(j-1)/N, j/N]} \Big \{ \big ( X_n(t,s)-a(t,s) \big ) \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \vee \big ( a(t,s)-X_n(t,s) \big ) \Big \}\\&\le \max _{1 \le i \le N}\max _{1\le j \le N} \Big \{ \Big ( X_n\Big (\frac{i}{N}, \frac{j-1}{N}\Big ) - a\Big (\frac{i-1}{N}, \frac{j}{N}\Big ) \Big ) \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \vee \Big ( a\Big (\frac{i}{N}, \frac{j-1}{N}\Big ) - X_n\Big (\frac{i-1}{N}, \frac{j}{N}\Big ) \Big ) \Big \} \\&\le \max _{1 \le i \le N}\max _{1\le j \le N} \Big \{ \Big ( X_n\Big (\frac{i}{N}, \frac{j-1}{N}\Big ) - a\Big (\frac{i}{N}, \frac{j-1}{N}\Big ) \Big ) \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \vee \Big ( a\Big (\frac{i-1}{N}, \frac{j}{N}\Big ) - X_n\Big (\frac{i-1}{N}, \frac{j}{N}\Big ) \Big ) \Big \} + \epsilon . \end{aligned}$$

In the above, the second inequality follows from the monotonicity of \(X_n\) and a, and we have used (60) for the third inequality. By virtue of (59), the last expression converges to \(\epsilon\) almost surely as \(n \rightarrow \infty\). Since \(\epsilon\) is arbitrary, the proof is complete.
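
The grid argument in the proof of Part (ii) can be checked numerically. The following is a minimal sketch under stated assumptions: the limit `a(t, s) = t(1 - s)` and the step functions `make_X(n)` are hypothetical stand-ins chosen only because they have the required monotonicity; they are not objects from the paper. The sketch compares the supremum along the diagonal with the quantity that appears after the second inequality, which uses only finitely many grid values.

```python
import math

def a(t, s):
    # Hypothetical continuous limit: non-decreasing in t, non-increasing in s on [0,1]^2.
    return t * (1.0 - s)

def make_X(n):
    # Hypothetical stand-in for X_n: a step function with the same monotonicity as a,
    # satisfying |X_n - a| <= 1/n at every point.
    def X(t, s):
        return math.floor(n * a(t, s)) / n
    return X

def diagonal_sup(X, fine=2000):
    # Approximates sup_t |X(t,t) - a(t,t)| on a fine grid of [0,1].
    return max(abs(X(k / fine, k / fine) - a(k / fine, k / fine))
               for k in range(fine + 1))

def grid_bound(X, N):
    # Quantity after the second inequality in the proof: a maximum over N^2 cells
    # that only evaluates X and a at the cell corners.
    best = -math.inf
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            hi = (i / N, (j - 1) / N)   # largest t and smallest s in the cell
            lo = ((i - 1) / N, j / N)   # smallest t and largest s in the cell
            best = max(best, X(*hi) - a(*lo), a(*hi) - X(*lo))
    return best

X = make_X(50)
print(diagonal_sup(X), grid_bound(X, 20))
```

For this particular choice, the grid quantity is at most \(1/n + 2/N\): the first term is the pointwise error at the grid points and the second plays the role of \(\epsilon\) in (60).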

1.2 Concentration inequalities for Poisson U-statistics

The next result is the concentration bound derived in Bachmann and Reitzner (2018) for Poisson U-statistics. We rephrase the setup and assumptions of Bachmann and Reitzner (2018) in a way suitable for the current study. Let \(\mathcal P\) denote a Poisson point process in \(\mathbb {R}^d\) with a finite intensity measure having no atoms. Let \(s: (\mathbb {R}^d)^i \rightarrow \{ 0,1 \}\) be a symmetric indicator function of order i with the following properties.

  1. (i)

    There exists \(c_1>0\) such that

    $$\begin{aligned} s(x_1,\dots ,x_i) =1 \text { whenever diam}(x_1,\dots ,x_i) < c_1. \end{aligned}$$
    (61)
  2. (ii)

    There is a constant \(c_2>c_1\) such that

    $$\begin{aligned} s(x_1,\dots ,x_i) = 0 \text { whenever diam}(x_1,\dots ,x_i) > c_2. \end{aligned}$$
    (62)

    Finally, we define a Poisson U-statistic \(F(\mathcal P)\) of order i by

    $$\begin{aligned} F(\mathcal P)=\sum _{{\mathcal {Y}}\subset \mathcal P, \, |{\mathcal {Y}}|=i} s({\mathcal {Y}}). \end{aligned}$$

Proposition 5.2

[Theorem 3.1 in Bachmann and Reitzner (2018)] Under the above conditions, there is a constant \(C^*>0\), depending only on \(i, d, c_1\), and \(c_2\), such that for all \(r>0\),

$$\begin{aligned} \mathbb {P}\big ( F(\mathcal P) \ge \mathbb {E}[F(\mathcal P)] + r \big )&\le \exp \Big \{ -C^* \Big ( \big ( \mathbb {E}[F(\mathcal P)]+r \big )^\frac{1}{2i} - \big ( \mathbb {E}[F(\mathcal P)] \big )^\frac{1}{2i} \Big )^2 \Big \}, \\ \mathbb {P}\big ( F(\mathcal P) \le \mathbb {E}[F(\mathcal P)] - r \big )&\le \exp \bigg \{ -\frac{C^*r^2}{ \text {Var}(F(\mathcal P))} \bigg \}. \end{aligned}$$
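
To get a feel for the two tail bounds in Proposition 5.2, their right-hand sides can be evaluated directly. This is a minimal sketch: the function names are hypothetical, and the value passed as `c_star` is a placeholder, since the proposition only asserts that a suitable constant \(C^*\) exists without giving its value.

```python
import math

def upper_tail_bound(mean, r, order, c_star):
    # Right-hand side of the first inequality: a bound on P(F >= E[F] + r).
    root = 1.0 / (2 * order)
    gap = (mean + r) ** root - mean ** root
    return math.exp(-c_star * gap ** 2)

def lower_tail_bound(variance, r, c_star):
    # Right-hand side of the second inequality: a bound on P(F <= E[F] - r).
    return math.exp(-c_star * r ** 2 / variance)

# Placeholder inputs (assumptions): E[F] = 100, Var(F) = 25, C* = 1.
for r in (5.0, 10.0, 50.0):
    print(r, upper_tail_bound(100.0, r, 2, 1.0), lower_tail_bound(25.0, r, 1.0))
```

Both bounds decrease in r. For a fixed mean and deviation, the upper-tail bound becomes weaker as the order i grows, because of the exponent \(1/(2i)\).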

1.3 Ratio of regularly varying sequences

The result below concerns the asymptotic ratios of regularly varying sequences evaluated along the sequences \(v_m\), \(p_m\), \(q_m\), \(r_m\), and \(s_m\) in (27), (28), and (29).

Lemma 5.3

Let \(u_m\) and \(w_m\) be any of the sequences in (27), (28), and (29). Under the setup of Theorem 3.1, as \(m\rightarrow \infty\),

$$\begin{aligned} \frac{u_m}{w_m} \rightarrow 1, \ \ \ \frac{R_{u_m}}{R_{w_m}} \rightarrow 1, \ \ \ \frac{f(R_{u_m})}{f(R_{w_m})} \rightarrow 1. \end{aligned}$$
(63)

Proof

By the definition of these five sequences, it is evident that

$$\begin{aligned} \frac{v_m}{v_{m+1}} \le \frac{u_m}{w_m} \le \frac{v_{m+1}}{v_m} \end{aligned}$$

for all \(m\in \mathbb {N}\). Note that

$$\begin{aligned} 1\le \frac{v_{m+1}}{v_m} \le \frac{e^{(m+1)^\gamma - m^\gamma }}{1-e^{-m^\gamma }} = \frac{e^{m^{\gamma -1}(\gamma +o(1))}}{1-e^{-m^\gamma }}, \ \ \ m\rightarrow \infty . \end{aligned}$$

As \(0<\gamma <1\), the rightmost term goes to 1 as \(m\rightarrow \infty\).
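
The convergence of \(v_{m+1}/v_m\) to 1 can be illustrated numerically. The definition of \(v_m\) in (27) is not reproduced in this Appendix, so the sketch below assumes the form \(v_m = \lfloor e^{m^\gamma } \rfloor\), which is what the displayed upper bound suggests; treat this as an assumption, not as the paper's definition.

```python
import math

def v(m, gamma):
    # Assumed form of the subsequence: v_m = floor(exp(m^gamma)), with 0 < gamma < 1.
    return math.floor(math.exp(m ** gamma))

def ratio_and_bound(m, gamma):
    # Returns v_{m+1}/v_m together with the upper bound displayed in the proof.
    ratio = v(m + 1, gamma) / v(m, gamma)
    bound = math.exp((m + 1) ** gamma - m ** gamma) / (1.0 - math.exp(-(m ** gamma)))
    return ratio, bound

for m in (10, 100, 400):
    print(m, ratio_and_bound(m, 0.5))
```

With \(\gamma = 1/2\), both the ratio and its bound approach 1 at the rate \(e^{\gamma m^{\gamma -1}}\), in line with the last display.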

For the proof of the second statement in (63), let \(\zeta >0\) be the regular variation exponent of \((R_n)\) (in the case of Theorem 3.1 (ii), we can take \(\zeta =1/\alpha\)). One can then rewrite the ratio as

$$\begin{aligned} \frac{R_{u_m}}{R_{w_m}} = \frac{R_{\lfloor w_m \frac{u_m}{w_m} \rfloor }}{R_{w_m}} - \Big ( \frac{u_m}{w_m} \Big )^\zeta + \Big ( \frac{u_m}{w_m} \Big )^\zeta . \end{aligned}$$

Since \(u_m/w_m \rightarrow 1\) as \(m\rightarrow \infty\), we have that \(u_m/w_m \in [1/2, 3/2]\) for sufficiently large m. By the uniform convergence of regularly varying sequences (see, e.g., Proposition 2.4 in Resnick 2007),

$$\begin{aligned} \bigg |\, \frac{R_{\lfloor w_m \frac{u_m}{w_m} \rfloor }}{R_{w_m}} - \Big ( \frac{u_m}{w_m} \Big )^\zeta \bigg |\le \sup _{\frac{1}{2} \le a\le \frac{3}{2}} \bigg |\, \frac{R_{\lfloor a w_m \rfloor }}{R_{w_m}} - a^\zeta \, \bigg |\rightarrow 0, \ \ \ m\rightarrow \infty . \end{aligned}$$

This proves the second statement in (63). Finally, since \(\big ( f(R_n), \, n\ge 1 \big )\) is also a regularly varying sequence, the third statement in (63) can be shown in the same way as above.

1.4 Asymptotic moment results for Theorem 3.1 (ii)

Lemmas 5.4 and 5.5 below give the asymptotic first and second moments for (37), (38), and (42). The main result itself is given in Lemma 5.5, while Lemma 5.4 provides complementary details for the proof of Lemma 5.5.

Lemma 5.4

  1. (i)

    Under the assumptions of Theorem 3.1 (ii), for every \(i\ge k+2\) and \(t\in [0,1]\),

    $$\begin{aligned} \begin{aligned}&v_{m+1}^i R_{q_m}^{-d} \, \mathbb {P}\Big ( \check{C}\big ( \{ X_1,\dots ,X_i \}, t\big ) \ \text{is connected} , \,\\&\qquad \qquad \qquad \qquad \mathcal {M}(X_1,\dots ,X_i)\ge R_{q_m}\Big ) \rightarrow \lambda ^i \zeta _i(t), \ \ m\rightarrow \infty , \end{aligned} \end{aligned}$$
    (64)

    where \(\zeta _i(t)\) is given in (71).

  2. (ii)

    Moreover, let \(\delta >0\) be a constant so small that \(\lambda (1+\delta ) e\omega _d < 1\). Then, for each \(i\ge k+2\) and \(t\in [0,1]\), there exists \(N\in \mathbb {N}\) such that for all \(m \ge N\),

    $$\begin{aligned} \begin{aligned}&v_{m+1}^i R_{q_m}^{-d} \, \mathbb {P}\Big ( \check{C}\big ( \{ X_1,\dots ,X_i \}, t\big ) \ \text{is connected} , \,\\&\qquad \qquad \qquad \qquad \mathcal {M}(X_1,\dots ,X_i)\ge R_{q_m}\Big ) \le C^*\lambda ^i(1+\delta )^ii^{i-2}\omega _d^{i-1}, \end{aligned} \end{aligned}$$
    (65)

    where \(C^*>0\) is a constant independent of i and m.

Lemma 5.5

  1. (i)

    Under the assumptions of Theorem 3.1 (ii), for every \(t, s\in [0,1]\), \(i\ge k+2\), and \(j\ge 1\), we have as \(m\rightarrow \infty\),

    $$\begin{aligned} R_{q_m}^{-d}\, \mathbb {E}\big [T_m^{(i,j,\uparrow )}(t,s)\big ]\rightarrow \frac{\lambda ^i}{i !}\, \mu _k^{(i,j,+)}(t,s; \lambda ), \end{aligned}$$
    (66)
    $$\begin{aligned} R_{p_m}^{-d}\, \mathbb {E}\big [T_m^{(i,j,\downarrow )}(t,s)\big ]\rightarrow \frac{\lambda ^i}{i !}\, \mu _k^{(i,j,+)}(t,s; \lambda ). \end{aligned}$$
    (67)

    Moreover,

    $$\begin{aligned} \sup _{m\ge 1} R_{q_m}^{-d}\, \text {Var} \big ( T_m^{(i,j,\uparrow )}(t,s)\big )<\infty , \end{aligned}$$
    (68)
    $$\begin{aligned} \sup _{m\ge 1} R_{p_m}^{-d}\, \text {Var} \big (T_m^{(i,j,\downarrow )}(t,s)\big )<\infty . \end{aligned}$$
    (69)
  2. (ii)

    Under the conditions of Theorem 3.1 (ii), for every \(t\in [0,1]\) and \(M\in \mathbb {N}\), we have as \(m\rightarrow \infty\),

    $$\begin{aligned} R_{q_m}^{-d}\, \mathbb {E}\big [ V_{m,M}(t) \big ] \rightarrow \sum _{i=M+1}^\infty i^{k+1} \frac{\lambda ^i}{i!}\, \zeta _i (t) <\infty , \end{aligned}$$
    (70)

    where

    $$\begin{aligned} \zeta _i(t) = \frac{s_{d-1}}{\alpha i-d}\, \int _{(\mathbb {R}^d)^{i-1}} \mathbf{1}\Big \{ \check{C}\big ( \{ 0,\mathbf{y}\}, t \big ) \text { is connected} \Big \}d\mathbf{y}. \end{aligned}$$
    (71)

    Furthermore,

    $$\begin{aligned} \sup _{m\ge 1} R_{q_m}^{-d}\, \text {Var} \big ( V_{m,M}(t) \big ) < \infty . \end{aligned}$$
    (72)

Proof of Lemma 5.5

We start with proving (66).

Proof of (66): By conditioning on \({\mathcal {Y}}\) we have

$$\begin{aligned} \begin{aligned} \mathbb {E}\big [ T_m^{(i,j,\uparrow )}(t,s)\big ]&= \left( {\begin{array}{c}v_{m+1}\\ i\end{array}}\right) \mathbb {E}\Big [ h_t^{(i,j,+)}({\mathcal {Y}})\, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}})\ge R_{q_m} \big \}\\&\quad\times \mathbb {P}\big ( \mathcal {B}({\mathcal {Y}}; \frac{s}{2})\cap \mathcal {B}({\mathcal {X}}_{v_m}\setminus {\mathcal {Y}}; \frac{s}{2}) =\emptyset \, \big |\, {\mathcal {Y}}\big )\Big ]\\&={v_{m+1} \atopwithdelims ()i}\int _{(\mathbb {R}^d)^i}h^{(i,j,+)}_t(x_1,\dots ,x_i)\, \mathbf {1}\big \{\mathcal {M}(x_1,\dots ,x_i)\ge R_{q_m}\big \}\\&\quad\times \big (1-I_s(x_1,\dots ,x_i)\big )^{v_{m}-i}\prod _{\ell =1}^if(x_\ell )d\mathbf{x}, \end{aligned} \end{aligned}$$
(73)

where

$$\begin{aligned} I_s(x_1,\dots ,x_i) :=\int _{\mathcal {B}\big ( \{x_1,\dots ,x_i\};s\big )}f(z)dz. \end{aligned}$$

Here, we treat only the case in which the point set \({\mathcal {Y}}\) is contained in \({\mathcal {X}}_{v_m}\); the other cases (i.e., \({\mathcal {Y}}\cap ({\mathcal {X}}_{v_{m+1}}\setminus {\mathcal {X}}_{v_m})\ne \emptyset\)) can be handled in the same way. Changing the variables \(x_1= x\), \(x_\ell = x+y_{\ell -1}\), \(\ell \in \{2,\dots ,i\}\) and using the shift invariance condition (23), we have that

$$\begin{aligned} \begin{aligned}\mathbb {E}\big [T_m^{(i,j,\uparrow )}(t,s)\big ]&={v_{m+1} \atopwithdelims ()i}\int _{\mathbb {R}^d}\int _{(\mathbb {R}^d)^{i-1}}h^{(i,j,+)}_t(0,\mathbf{y}) \\&\quad\times \big (1-I_s(x,x+y_1,\cdots ,x+y_{i-1})\big )^{v_{m}-i}f(x) \\&\quad\times \, \mathbf {1}\big \{\Vert x\Vert \ge R_{q_m}\big \} \prod _{\ell =1}^{i-1}f(x+y_\ell )\mathbf {1}\big \{\Vert x+y_\ell \Vert \ge R_{q_m}\big \}d\mathbf{y}dx, \end{aligned} \end{aligned}$$

where \(\mathbf{y}=(y_1,\dots ,y_{i-1}) \in (\mathbb {R}^d)^{i-1}\). The polar coordinate transform \(x\leftrightarrow (r,\theta )\) with \(r\ge 0\), \(\theta \in S^{d-1}\), gives that

$$\begin{aligned} \begin{aligned} \mathbb {E}\big [T_m^{(i,j,\uparrow )}(t,s)\big ]&= \left( {\begin{array}{c}v_{m+1}\\ i\end{array}}\right) \int _{S^{d-1}}J(\theta )\int _{R_{q_m}}^{\infty } r^{d-1}\int _{(\mathbb {R}^d)^{i-1}} h^{(i,j,+)}_t(0,\mathbf{y})\\&\quad\times f(r) \prod _{\ell =1}^{i-1} f\big ( \Vert r\theta +y_\ell \Vert \big )\, \mathbf{1}\big \{ \Vert r\theta + y_\ell \Vert \ge R_{q_m} \big \} \\&\quad\times \big (1-I_s(r\theta , r\theta +y_1, \dots , r\theta + y_{i-1}) \big )^{v_{m}-i}d\mathbf{y}\, dr\, d\theta . \end{aligned} \end{aligned}$$
(74)

Furthermore, an additional change of variable \(r=R_{q_m}\rho\) yields that

$$\begin{aligned} \begin{aligned}\mathbb {E}\big [ T_m^{(i,j,\uparrow )}(t,s)\big ] &={v_{m+1} \atopwithdelims ()i}R_{q_m}^df(R_{q_m})^i\int _{S^{d-1}}J(\theta )\int _1^{\infty }\rho ^{d-1}\int _{(\mathbb {R}^d)^{i-1}}h^{(i,j,+)}_t(0,\mathbf{y})\\&\quad\times \frac{f(R_{q_m}\rho )}{f(R_{q_m})} \prod _{\ell =1}^{i-1}\frac{f\big (R_{q_m}\Vert \rho \theta +y_\ell /R_{q_m} \Vert \big )}{f(R_{q_m})}\, \mathbf {1}\big \{\Vert \rho \theta +y_\ell /R_{q_m}\Vert \ge 1\big \}\\&\quad\times \big (1-I_s(R_{q_m}\rho \theta , R_{q_m}\rho \theta +y_1, \dots , R_{q_m}\rho \theta + y_{i-1}) \big )^{v_{m}-i}d\mathbf{y}\, d\rho \, d\theta . \end{aligned} \end{aligned}$$
(75)

By Lemma 5.3, we have

$$\begin{aligned} \left( {\begin{array}{c}v_{m+1}\\ i\end{array}}\right) f(R_{q_m})^i \sim \frac{\big ( v_{m+1}f(R_{q_m}) \big )^i}{i!} \sim \frac{\big ( q_{m}f(R_{q_m}) \big )^i}{i!} \rightarrow \frac{\lambda ^i}{i!}, \ \ \ m\rightarrow \infty . \end{aligned}$$

By the regular variation assumption in (12),

$$\begin{aligned} \frac{f(R_{q_m}\rho )}{f(R_{q_m})}\prod _{\ell =1}^{i-1}\frac{f\big (R_{q_m}\Vert \rho \theta +\frac{y_\ell }{R_{q_m}} \Vert \big )}{f(R_{q_m})}\, \mathbf {1}\big \{\Vert \rho \theta +\frac{y_\ell }{R_{q_m}}\Vert \ge 1\big \}\rightarrow \rho ^{-\alpha i}, \ \ \text {as } m\rightarrow \infty \end{aligned}$$

for all \(\rho \ge 1\), \(\theta \in S^{d-1}\), and \(y_1,\dots , y_{i-1}\in \mathbb {R}^d\). As for the remaining term of the integrand in (75), we get that

$$\begin{aligned} \begin{aligned}&\lim _{m\rightarrow \infty } \big (1-I_s(R_{q_m}\rho \theta , R_{q_m}\rho \theta +y_1, \dots , R_{q_m}\rho \theta + y_{i-1}) \big )^{v_{m}-i}\\&=\lim _{m\rightarrow \infty } \Big ( 1-\int _{\mathcal {B}\big ( \{ 0,y_1,\dots ,y_{i-1} \}; s \big )} f\big ( R_{q_m} \Vert \rho \theta +z/R_{q_m} \Vert \big ) dz\Big )^{v_m-i}\\&=\lim _{m\rightarrow \infty } \exp \Big \{ -v_m f(R_{q_m}) \int _{\mathcal {B}\big ( \{ 0,y_1,\dots ,y_{i-1} \}; s \big )} \frac{f\big ( R_{q_m} \Vert \rho \theta +z/R_{q_m} \Vert \big )}{f(R_{q_m})} dz \Big \}. \end{aligned} \end{aligned}$$
(76)

Now, (12) and Lemma 5.3 ensure that the last term in (76) converges to

$$\begin{aligned} \exp \Big \{ -\lambda \rho ^{-\alpha } s^d \text {vol} \Big ( \mathcal {B}\big ( \{ 0,y_1,\dots ,y_{i-1} \}; 1 \big ) \Big ) \Big \}. \end{aligned}$$

Appealing to all of these convergence results and assuming temporarily that the dominated convergence theorem is applicable, we can obtain (66).

It now remains to find an integrable upper bound for the terms under the integral sign in (75). First it is evident that

$$\begin{aligned} \big (1-I_s(R_{q_m}\rho \theta , R_{q_m}\rho \theta +y_1, \dots , R_{q_m}\rho \theta + y_{i-1}) \big )^{v_{m}-i} \le 1. \end{aligned}$$

Using Potter’s bounds (see Proposition 2.6 in Resnick 2007), for every \(\xi \in (0, \alpha -d)\), we have, for sufficiently large m,

$$\begin{aligned} \frac{f(R_{q_m}\rho )}{f(R_{q_m})}\le (1+\xi )\rho ^{-\alpha +\xi } \end{aligned}$$
(77)

and

$$\begin{aligned} \prod _{\ell =1}^{i-1}\frac{f\big (R_{q_m}\Vert \rho \theta +y_\ell /R_{q_m} \Vert \big )}{f(R_{q_m})}\, \mathbf {1}\big \{\Vert \rho \theta +y_\ell /R_{q_m}\Vert \ge 1\big \}\le (1+\xi )^{i-1} \end{aligned}$$
(78)

for all \(\rho \ge 1\), \(\theta \in S^{d-1}\) and \(\mathbf{y}=(y_1,\dots ,y_{i-1})\in (\mathbb {R}^d)^{i-1}\) such that \(h_t^{(i,j,+)}(0,\mathbf{y})=1\). Combining all the bounds derived above, together with \(\int _1^{\infty }\rho ^{d-1-\alpha +\xi }d\rho <\infty\), we can obtain an integrable upper bound, as desired. The proof of (67) is similar, so we skip it here.

Proof of (68): We first write

$$\begin{aligned}&\mathbb {E}\big [ T_m^{(i,j,\uparrow )}(t,s)^2\big ] \\&=\sum _{\ell =0}^i\mathbb {E}\bigg [ \sum \limits_{\underset{|{\mathcal {Y}}|=i} {{\mathcal {Y}}\subset {\mathcal {X}}_{v_{m+1}}} } \sum \limits_{\underset{|{\mathcal {Y}}' |=i, \, |{\mathcal {Y}}\cap {\mathcal {Y}}' |=\ell} {{\mathcal {Y}}' \subset {\mathcal {X}}_{v_{m+1}}} } h^{(i,j,+)}_t({\mathcal {Y}})h^{(i,j,+)}_t({\mathcal {Y}}')\, \mathbf {1}\big \{\mathcal {M}({\mathcal {Y}}\cup {\mathcal {Y}}')\ge R_{q_m}\big \} \\&\quad\times \mathbf{1}\big \{\mathcal {B}({\mathcal {Y}};\frac{s}{2})\cap \mathcal {B}({\mathcal {X}}_{v_m}\setminus {\mathcal {Y}};\frac{s}{2})=\emptyset \big \} \, \mathbf{1}\big \{\mathcal {B}({\mathcal {Y}}';\frac{s}{2})\cap \mathcal {B}({\mathcal {X}}_{v_m}\setminus {\mathcal {Y}}';\frac{s}{2})=\emptyset \big \} \bigg ] \\ &=:\sum _{\ell =0}^i\mathbb {E}[I_\ell ]. \end{aligned}$$

From this, \(\text {Var}\big ( T_m^{(i,j,\uparrow )}(t,s)\big )\) can be partitioned as \(\text {Var}\big ( T_m^{(i,j,\uparrow )}(t,s)\big ) = A_m +B_m\), where

$$\begin{aligned} A_m = \sum _{\ell =1}^i\mathbb {E}[I_\ell ], \ \ \ B_m = \mathbb {E}[I_0]-\Big \{ \mathbb {E}\big [T_m^{(i,j,\uparrow )}(t,s)\big ] \Big \}^2. \end{aligned}$$
(79)

For \(\ell \in \{1,2,\dots ,i\}\),

$$\begin{aligned} \begin{aligned} \mathbb {E}[I_\ell ] \le& \ \mathbb {E}\bigg [ \sum \limits_{\underset{|{\mathcal {Y}}|=i} {{\mathcal {Y}}\subset {\mathcal {X}}_{v_{m+1}}}} \sum \limits_{\underset{|{\mathcal {Y}}' |=i, \, |{\mathcal {Y}}\cap {\mathcal {Y}}' |=\ell}{{\mathcal {Y}}' \subset {\mathcal {X}}_{v_{m+1}}} } h^{(i,j,+)}_t({\mathcal {Y}})h^{(i,j,+)}_t({\mathcal {Y}}')\, \mathbf {1}\big \{\mathcal {M}({\mathcal {Y}}\cup {\mathcal {Y}}')\ge R_{q_m}\big \} \bigg ]\\ =& \ {v_{m+1} \atopwithdelims ()i}{i \atopwithdelims ()\ell }{v_{m+1}-i \atopwithdelims ()i-\ell } \\&\times \mathbb {E}\Big [ h^{(i,j,+)}_t({\mathcal {Y}})h^{(i,j,+)}_t({\mathcal {Y}}')\, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}}\cup {\mathcal {Y}}')\ge R_{q_m}\big \} \, \mathbf{1}\big \{|{\mathcal {Y}}\cap {\mathcal {Y}}' |=\ell \big \} \Big ]\\ \le& \ C^{*}v_{m+1}^{2i-\ell }\mathbb {P}\big ( \check{C}({\mathcal {Y}}\cup {\mathcal {Y}}', t) \text { is connected}, \ \mathcal {M}({\mathcal {Y}}\cup {\mathcal {Y}}')\ge R_{q_m}, \, |{\mathcal {Y}}\cap {\mathcal {Y}}' |= \ell \big ) \, \\ \le& \ C^{*}R_{q_m}^d, \end{aligned} \end{aligned}$$

where the last inequality comes from Lemma 5.4 (ii), applied to the \(2i-\ell\) distinct points of \({\mathcal {Y}}\cup {\mathcal {Y}}'\). This implies that \(\sup _{m \ge 1}R_{q_m}^{-d}\, A_m<\infty\). In order to treat \(B_m\) in (79), we derive an upper bound for \(\mathbb {E}[I_0]\) by

$$\begin{aligned} \begin{aligned} \mathbb {E}[I_0]\le& \ \mathbb {E}\bigg [ \sum\limits_{\underset{|{\mathcal {Y}}|=i}{{\mathcal {Y}}\subset {\mathcal {X}}_{v_{m+1}}}} \sum \limits_{\underset{|{\mathcal {Y}}' |=i} {{\mathcal {Y}}' \subset {\mathcal {X}}_{v_{m+1}}}} h^{(i,j,+)}_t({\mathcal {Y}})h^{(i,j,+)}_t({\mathcal {Y}}')\, \mathbf {1}\big \{\mathcal {M}({\mathcal {Y}}\cup {\mathcal {Y}}')\ge R_{q_m}\big \} \\&\times \mathbf{1}\Big \{\mathcal {B}({\mathcal {Y}}\cup {\mathcal {Y}}';\frac{s}{2})\cap \mathcal {B}\big ({\mathcal {X}}_{v_m}\setminus ({\mathcal {Y}}\cup {\mathcal {Y}}'); \frac{s}{2}\big )=\emptyset \Big \} \, \mathbf{1}\big \{ {\mathcal {Y}}\cap {\mathcal {Y}}'=\emptyset \big \}\bigg ] \\\le & \ {v_{m+1} \atopwithdelims ()i}^2\mathbb {E}\Big [h^{(i,j,+)}_t({\mathcal {Y}})h^{(i,j,+)}_t({\mathcal {Y}}')\, \mathbf {1}\big \{\mathcal {M}({\mathcal {Y}}\cup {\mathcal {Y}}')\ge R_{q_m}\big \} \\&\times \Big (1-\int _{\mathcal {B}({\mathcal {Y}}\cup {\mathcal {Y}}';s)}f(z)dz\Big )^{v_m-2i} \, \mathbf{1}\big \{ {\mathcal {Y}}\cap {\mathcal {Y}}'=\emptyset \big \}\Big ], \end{aligned} \end{aligned}$$
(80)

where the second inequality is obtained from the obvious relation \(\left( {\begin{array}{c}v_{m+1}-i\\ i\end{array}}\right) \le \left( {\begin{array}{c}v_{m+1}\\ i\end{array}}\right)\) as well as conditioning on \({\mathcal {Y}}\cup {\mathcal {Y}}'\) as in (73). Although we consider here the case in which all the points in \({\mathcal {Y}}\cup {\mathcal {Y}}'\) belong to \({\mathcal {X}}_{v_m}\), the other cases (i.e., \(({\mathcal {Y}}\cup {\mathcal {Y}}')\cap ({\mathcal {X}}_{v_{m+1}}\setminus {\mathcal {X}}_{v_m})\ne \emptyset\)) can be treated in the same manner. By a calculation similar to the above, we have

$$\begin{aligned} \begin{aligned}&\Big \{ \mathbb {E}\big [T_m^{(i,j,\uparrow )}(t,s)\big ] \Big \}^2 = {v_{m+1} \atopwithdelims ()i}^2\mathbb {E}\Big [ h^{(i,j,+)}_t({\mathcal {Y}})h^{(i,j,+)}_t({\mathcal {Y}}')\, \mathbf {1}\big \{\mathcal {M}({\mathcal {Y}}\cup {\mathcal {Y}}')\ge R_{q_m}\big \}\\& \times \Big ( 1-\int _{\mathcal {B}({\mathcal {Y}};s)}f(z)dz\Big )^{v_m-i}\Big (1-\int _{\mathcal {B}({\mathcal {Y}}';s)}f(z)dz\Big )^{v_m-i} \, \mathbf{1}\big \{ {\mathcal {Y}}\cap {\mathcal {Y}}'=\emptyset \big \}\Big ]. \end{aligned} \end{aligned}$$
(81)

By (80) and (81), we get that \(B_m \le v_{m+1}^{2i} \mathbb {E}[\Xi _m \mathbf{1}\big \{ {\mathcal {Y}}\cap {\mathcal {Y}}'=\emptyset \big \}]\), where

$$\begin{aligned} \Xi _m:=& \ h^{(i,j,+)}_t({\mathcal {Y}})h^{(i,j,+)}_t({\mathcal {Y}}')\, \mathbf {1}\big \{\mathcal {M}({\mathcal {Y}}\cup {\mathcal {Y}}')\ge R_{q_m}\big \}\\&\times \Big \{\Big (1-\int _{\mathcal {B}({\mathcal {Y}}\cup {\mathcal {Y}}';s)}f(z)dz \Big )^{v_m-2i}\\&-\Big (1-\int _{\mathcal {B}({\mathcal {Y}};s)}f(z)dz\Big )^{v_m-i}\Big (1-\int _{\mathcal {B}({\mathcal {Y}}';s)}f(z)dz\Big )^{v_m-i}\Big \}. \end{aligned}$$

Furthermore, \(\Xi _m\) can be decomposed as \(\Xi _m=C_m+D_m\), where

$$\begin{aligned} C_m=\Xi _m\mathbf {1}\big \{\mathcal {B}({\mathcal {Y}};s)\cap \mathcal {B}({\mathcal {Y}}';s)=\emptyset \big \}, \end{aligned}$$
$$\begin{aligned} D_m=\Xi _m\mathbf {1}\big \{\mathcal {B}({\mathcal {Y}};s)\cap \mathcal {B}({\mathcal {Y}}';s)\ne \emptyset \big \}. \end{aligned}$$

Since \(\text {vol}\big (\mathcal {B}({\mathcal {Y}}\cup {\mathcal {Y}}';s)\big )=\text {vol}\big (\mathcal {B}({\mathcal {Y}};s)\big )+\text {vol}\big (\mathcal {B}({\mathcal {Y}}';s)\big )\) whenever \(\mathcal {B}({\mathcal {Y}};s)\cap \mathcal {B}({\mathcal {Y}}';s)=\emptyset\), it is straightforward to check that

$$\begin{aligned} v_{m+1}^{2i}\mathbb {E}[C_m \mathbf{1}\big \{ {\mathcal {Y}}\cap {\mathcal {Y}}'=\emptyset \big \}] = o(R_{q_m}^d), \ \ \ m\rightarrow \infty , \end{aligned}$$

by the same arguments as in (73), (75), and (76). Moreover, by Lemma 5.4 (ii),

$$\begin{aligned} \begin{aligned}&v_{m+1}^{2i}\mathbb {E}[D_m \, \mathbf{1}\big \{ {\mathcal {Y}}\cap {\mathcal {Y}}'=\emptyset \big \}]\\ \le& \ v_{m+1}^{2i}\mathbb {P}\big ( \check{C}({\mathcal {Y}}\cup {\mathcal {Y}}', 2s) \text { is connected}, \, \mathcal {M}({\mathcal {Y}}\cup {\mathcal {Y}}')\ge R_{q_m}, \, {\mathcal {Y}}\cap {\mathcal {Y}}'=\emptyset \big )\\ \le& \ C^{*}R_{q_m}^d. \end{aligned} \end{aligned}$$

It thus follows that \(\sup _{m\ge 1} R_{q_m}^{-d}\, B_m <\infty\), and hence (68) has been established. Since the proof of (69) is very similar to that of (68), we will omit it.

Proof of (70): We apply Fubini’s theorem to obtain that

$$\begin{aligned}R_{q_m}^{-d}\, \mathbb {E}\big [ V_{m,M}(t) \big ] &= \sum _{i=M+1}^\infty i^{k+1} \left( {\begin{array}{c}v_{m+1}\\ i\end{array}}\right) R_{q_m}^{-d}\, \\&\quad \times \mathbb {P}\Big ( \check{C}\big ( \{ X_1,\dots ,X_i \}, t \big ) \text { is connected}, \, \mathcal {M}(X_1,\dots ,X_i) \ge R_{q_m} \Big ). \end{aligned}$$

Taking \(\delta >0\) so small that \(\lambda (1+\delta ) e\omega _d <1\), Lemma 5.4 (ii) demonstrates that

$$\begin{aligned} R_{q_m}^{-d}\, \mathbb {E}\big [ V_{m,M}(t) \big ] \le C^* \sum _{i=M+1}^\infty i^{k+1} \cdot \frac{\lambda ^i(1+\delta )^ii^{i-2}\omega _d^{i-1}}{i!}, \end{aligned}$$
(82)

where \(C^*\) is a positive constant independent of i and m. By Stirling’s bound \(i! \ge (i/e)^i\), valid for all \(i\ge 1\), (82) is further bounded by \(C^*\sum _{i=M+1}^\infty i^{k-1} \big ( \lambda (1+\delta ) e\omega _d \big )^i\), which is finite due to the constraint \(\lambda (1+\delta )e\omega _d<1\). By Lemma 5.4 (i) and the dominated convergence theorem, one can obtain (70) as required.
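
The domination step based on Stirling's bound can be verified term by term on a logarithmic scale. In this minimal sketch, `x` stands for the product \(\lambda (1+\delta )\omega _d\) and the values `k = 1`, `x = 0.3` are placeholders; the constant factor \(\omega _d^{-1}\) that separates \(\omega _d^{i-1}\) from \(\omega _d^{i}\) is ignored, as it is absorbed into \(C^*\).

```python
import math

def log_term(i, k, x):
    # log of i^{k+1} * x^i * i^{i-2} / i!, the i-th summand in (82) up to a constant.
    return ((k + 1) * math.log(i) + i * math.log(x)
            + (i - 2) * math.log(i) - math.lgamma(i + 1))

def log_majorant(i, k, x):
    # log of i^{k-1} * (x*e)^i, obtained from the bound i! >= (i/e)^i.
    return (k - 1) * math.log(i) + i * (math.log(x) + 1.0)

def majorant_tail(M, k, x, terms=5000):
    # Partial sum of the majorant over i = M+1, ..., M+terms.
    return sum(math.exp(log_majorant(i, k, x)) for i in range(M + 1, M + terms + 1))

k, x = 1, 0.3          # placeholder values with x * e < 1
print(all(log_term(i, k, x) <= log_majorant(i, k, x) for i in range(1, 301)))
print(majorant_tail(0, k, x))
```

For `k = 1` the majorant is a plain geometric series with ratio `x * e`, so its sum can be compared with the closed form \(q/(1-q)\), \(q = xe\).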

Finally, (72) can be established by combining Lemma 5.4 with an argument similar to that used for (68), so we skip the details. \(\square\)

Proof of Lemma 5.4

We first prove (65). By the change of variables \(x_1=x\), \(x_\ell =x+y_{\ell -1}\) for \(\ell \in \{2,\dots ,i\}\), which is followed by an additional change of variables \(x\leftrightarrow (R_{q_m}\rho ,\theta )\) as in (74) and (75), we have that

$$\begin{aligned} \begin{aligned}&v_{m+1}^i R_{q_m}^{-d}\, \mathbb {P}\Big ( \check{C}\big ( \{ X_1,\dots ,X_i \}, t\big ) \text { is connected}, \, \mathcal {M}(X_1,\dots ,X_i)\ge R_{q_m}\Big )\\&=v_{m+1}^i f(R_{q_m})^i\int _{S^{d-1}}J(\theta )\int _1^\infty \rho ^{d-1}\int _{(\mathbb {R}^d)^{i-1}}\mathbf {1}\big \{\check{C}\big (\{0,\mathbf{y}\},t\big ) \text { is connected}\big \}\\&\quad \times \frac{f(R_{q_m}\rho )}{f(R_{q_m})}\prod _{\ell =1}^{i-1}\frac{f\big (R_{q_m}\Vert \rho \theta +y_\ell /R_{q_m} \Vert \big )}{f(R_{q_m})}\, \mathbf {1}\big \{\Vert \rho \theta +y_\ell /R_{q_m}\Vert \ge 1\big \}d\mathbf{y}\, d\rho \, d\theta . \end{aligned} \end{aligned}$$
(83)

Employing Potter’s bound as in (77) and (78) with \(\xi =\min \{(\alpha -d)/2,\delta /2\}\), there exists \(N_1\in \mathbb {N}\), such that for all \(m\ge N_1\),

$$\begin{aligned} \frac{f(R_{q_m}\rho )}{f(R_{q_m})}\prod _{\ell =1}^{i-1}\frac{f\big (R_{q_m}\Vert \rho \theta +y_\ell /R_{q_m} \Vert \big )}{f(R_{q_m})}\, \mathbf {1}\big \{\Vert \rho \theta +y_\ell /R_{q_m}\Vert \ge 1\big \}\le (1+\xi )^i\rho ^{-\alpha +\xi }. \end{aligned}$$
(84)
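The radial part of the integral in (83) can be evaluated explicitly once (84) is applied. The following sketch assumes \(\alpha >d\), which is implicit in the choice of \(\xi\), and reads \(s_{d-1}=\int _{S^{d-1}}J(\theta )\, d\theta\) as the surface area of the unit sphere; neither is restated in this section. Since \(\xi \le (\alpha -d)/2\),

$$\begin{aligned} \int _1^\infty \rho ^{d-1}\rho ^{-\alpha +\xi }\, d\rho = \frac{1}{\alpha -d-\xi } \le \frac{2}{\alpha -d}. \end{aligned}$$

This is the source of the factor \(1/(\alpha -d-\xi )\), and of the constant \(2/(\alpha -d)\), in the chain of inequalities that follows.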

By the definition of \(\xi\) above and Lemma 5.3, one can find \(N_2\in \mathbb {N}\), such that for all \(m\ge N_2\), \((1+\xi )v_{m+1}f(R_{q_m})\le \lambda (1+\delta )\). Thus, for all \(m \ge N:=N_1\vee N_2\),

$$\begin{aligned}&v_{m+1}^iR_{q_m}^{-d}\, \mathbb {P}\Big ( \check{C}\big ( \{ X_1,\dots ,X_i \}, t\big ) \text { is connected}, \, \mathcal {M}(X_1,\dots ,X_i)\ge R_{q_m}\Big ) \\&\le \frac{s_{d-1}\big ((1+\xi )v_{m+1}f(R_{q_m})\big )^i}{\alpha -d-\xi }\int _{(\mathbb {R}^d)^{i-1}}\mathbf {1}\big \{\check{C}\big (\{0,\mathbf{y}\}, t\big ) \text { is connected}\big \}d\mathbf{y}\\&\le \frac{2s_{d-1}\lambda ^i (1+\delta )^i}{ \alpha -d}\int _{(\mathbb {R}^d)^{i-1}}\mathbf {1}\big \{\check{C}\big (\{0,\mathbf{y}\}, 1\big ) \text { is connected}\big \}d\mathbf{y}\\&\le \frac{2s_{d-1}\lambda ^i(1+\delta )^ii^{i-2}\omega _d^{i-1}}{\alpha -d}. \end{aligned}$$

The last inequality above follows from the elementary fact (Cayley’s formula) that there exist \(i^{i-2}\) spanning trees on a set of i labelled points.
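The counting step can be spelled out as follows. This is a sketch under two assumptions that are not restated in this section: that \(\omega _d\) denotes the volume of the unit ball in \(\mathbb {R}^d\), and that two points span an edge of \(\check{C}(\cdot ,1)\) exactly when their distance is at most 1. Write \(y_0=0\). If \(\check{C}\big (\{0,\mathbf{y}\},1\big )\) is connected, then its 1-skeleton contains a spanning tree, so

$$\begin{aligned} \mathbf {1}\big \{\check{C}\big (\{0,\mathbf{y}\},1\big ) \text { is connected}\big \} \le \sum _{T}\prod _{\{a,b\}\in T}\mathbf {1}\big \{ \Vert y_a-y_b\Vert \le 1\big \}, \end{aligned}$$

where the sum runs over the \(i^{i-2}\) spanning trees T on the vertex set \(\{0,1,\dots ,i-1\}\). For a fixed tree T, integrate the variables one at a time, each time choosing a leaf other than the vertex 0 (a tree with at least two vertices has at least two leaves). Each such integration contributes exactly \(\omega _d\), so

$$\begin{aligned} \int _{(\mathbb {R}^d)^{i-1}}\prod _{\{a,b\}\in T}\mathbf {1}\big \{ \Vert y_a-y_b\Vert \le 1\big \}\, d\mathbf{y} = \omega _d^{i-1}, \end{aligned}$$

and summing over T gives the bound \(i^{i-2}\omega _d^{i-1}\).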

For the proof of (64), returning to (83) we see that \(v_{m+1}^i f(R_{q_m})^i\rightarrow \lambda ^i\) as \(m\rightarrow \infty\) by Lemma 5.3. By the dominated convergence theorem, together with the regular variation assumption (12) and the integrable bound obtained in (84), we obtain (64). \(\square\)

1.5 Approximating bounds for Betti numbers

We provide upper and lower bounds for \(\beta _{k,n}(t)\) in (11). We will make use of these bounds when estimating the difference between \(\beta _{k,n}(t)\) and \(G_{k,n}(t)\) (see (44)) in the proof of Theorem 3.1 (i).

Lemma 5.6

For all \(t \in [0,1]\),

$$\begin{aligned} J_{k,n}^{(k+2,1)}(t) \le \beta _{k,n}(t) \le J_{k,n}^{(k+2,1)} (t) + \left( {\begin{array}{c}k+3\\ k+1\end{array}}\right) L_{k,n}(t), \end{aligned}$$
(85)

where

$$L_{k,n}(t) := \sum \limits_{\underset{|{\mathcal {Y}}|=k+3}{{\mathcal {Y}}\subset {\mathcal {X}}_n,} } \mathbf{1}\big \{ \check{C}({\mathcal {Y}}, t) \text { is connected}, \, \mathcal {M}({\mathcal {Y}})\ge R_n \big \}.$$

Proof

The left-hand inequality in (85) follows immediately from (11). Owing to (11) again, the remaining inequality is equivalent to

$$\begin{aligned} \sum _{i=k+3}^n \sum _{j\ge 1} jJ_{k,n}^{(i,j)} (t) \le \left( {\begin{array}{c}k+3\\ k+1\end{array}}\right) L_{k,n}(t). \end{aligned}$$

By the definition of \(J_{k,n}^{(i,j)}(t)\), the left hand side above is equal to

$$\begin{aligned} \begin{aligned} \sum _{i=k+3}^n \sum \limits_{\underset{|{\mathcal {Y}}|=i}{ {\mathcal {Y}}\subset {\mathcal {X}}_n,}}&\beta _k \big ( \check{C}({\mathcal {Y}},t) \big ) \, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}})\ge R_n \big \}\, \\&\times \mathbf{1}\big \{ \check{C}({\mathcal {Y}}, t) \text { is a connected component of } \check{C}({\mathcal {X}}_n, t) \big \}. \end{aligned} \end{aligned}$$
(86)

Note that \(\beta _k \big ( \check{C}({\mathcal {Y}},t) \big )\) is bounded by the number of k-simplices of \(\check{C}({\mathcal {Y}},t)\). Suppose that for some \(i\ge k+3\) and \({\mathcal {Y}}\subset {\mathcal {X}}_n\) with \(|{\mathcal {Y}}|=i\), \(\check{C}({\mathcal {Y}},t)\) is a connected component of \(\check{C}({\mathcal {X}}_n,t)\) with \(\mathcal {M}({\mathcal {Y}})\ge R_n\). Then, there exists \(\mathcal {Z}\subset {\mathcal {Y}}\) with \(|\mathcal {Z}|=k+3\) such that \(\check{C}(\mathcal {Z},t)\) is a connected subcomplex of \(\check{C}({\mathcal {Y}},t)\). Every time one finds such a connected subcomplex on \(k+3\) points, it can increase the k-simplex counts of \(\check{C}({\mathcal {Y}},t)\) by at most \(\left( {\begin{array}{c}k+3\\ k+1\end{array}}\right)\). In conclusion,

$$\begin{aligned}&\beta _k\big ( \check{C}({\mathcal {Y}},t) \big )\, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}})\ge R_n \big \} \\ \le&\left( {\begin{array}{c}k+3\\ k+1\end{array}}\right) \sum \limits_{\underset{|\mathcal {Z}|=k+3 } {\mathcal {Z}\subset {\mathcal {Y}},}} \mathbf{1}\big \{ \check{C}(\mathcal {Z}, t) \text { is connected}, \, \mathcal {M}(\mathcal {Z})\ge R_n \big \}. \end{aligned}$$

Substituting this bound back into (86),

$$\begin{aligned}&\sum _{i=k+3}^n \sum _{j\ge 1} jJ_{k,n}^{(i,j)} (t) \\& \ \le \left( {\begin{array}{c}k+3\\ k+1\end{array}}\right) \sum _{i=k+3}^n \sum \limits _{\underset{|{\mathcal {Y}}|=i}{{\mathcal {Y}}\subset {\mathcal {X}}_n,}} \mathbf{1}\big \{ \check{C}({\mathcal {Y}}, t) \text { is a connected component of } \check{C}({\mathcal {X}}_n, t) \big \} \\&\ \times \sum \limits_{\underset{|\mathcal {Z}|=k+3} {\mathcal {Z}\subset {\mathcal {Y}},} } \mathbf{1}\big \{ \check{C}(\mathcal {Z}, t) \text { is connected}, \, \mathcal {M}(\mathcal {Z})\ge R_n \big \} \\ &= \ \left( {\begin{array}{c}k+3\\ k+1\end{array}}\right) L_{k,n}(t). \end{aligned}$$

\(\square\)
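For orientation, the constant in (85) counts the k-simplices that \(k+3\) points can span, since a k-simplex has \(k+1\) vertices:

$$\begin{aligned} \left( {\begin{array}{c}k+3\\ k+1\end{array}}\right) = \left( {\begin{array}{c}k+3\\ 2\end{array}}\right) = \frac{(k+3)(k+2)}{2}. \end{aligned}$$

For example, it equals 3 when \(k=0\) (the three vertices) and 6 when \(k=1\) (the six edges on four points).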

1.6 Asymptotic moment results for Theorem 3.1 (i)

Here, we calculate the asymptotic moments of \(S_m^\uparrow (t)\), \(S_m^\downarrow (t)\), and \(W_m(t)\); see (48), (49), and (56).

Lemma 5.7

Under the assumptions of Theorem 3.1 (i), for every \(t\in [0,1]\) we have that as \(m\rightarrow \infty\),

$$\begin{aligned} \big ( v_{m+1}^{k+2} R_{q_m}^d f(R_{q_m})^{k+2} \big )^{-1} \mathbb {E}\big [S_m^\uparrow (t)\big ]&\rightarrow \frac{\mu _k^{(k+2,1,+)}(t; 0)}{(k+2)!}, \\ \big ( v_{m}^{k+2} R_{p_m}^d f(R_{p_m})^{k+2} \big )^{-1} \mathbb {E}\big [S_m^\downarrow (t)\big ]&\rightarrow \frac{\mu _k^{(k+2,1,+)}(t; 0)}{(k+2)!}, \end{aligned}$$

and also,

$$\begin{aligned}&\big ( v_{m+1}^{k+3} R_{q_m}^d f(R_{q_m})^{k+3} \big )^{-1} \mathbb {E}\big [ W_m(t) \big ] \\ \rightarrow&\frac{s_{d-1}}{(k+3)!\big ( \alpha (k+3)-d \big )}\, \int _{(\mathbb {R}^d)^{k+2}}\mathbf{1}\Big \{ \check{C}\big ( \{ 0,\mathbf{y}\}, t \big ) \text { is connected} \Big \} d\mathbf{y}, \end{aligned}$$

where \(\mathbf{y}=(y_1,\dots ,y_{k+2})\in (\mathbb {R}^d)^{k+2}\). Moreover,

$$\begin{aligned}&\sup _{m\ge 1}\big ( v_{m+1}^{k+2} R_{q_m}^d f(R_{q_m})^{k+2} \big )^{-1} \text {Var}\big ( S_m^\uparrow (t) \big )<\infty , \\&\sup _{m\ge 1} \big ( v_{m}^{k+2} R_{p_m}^d f(R_{p_m})^{k+2} \big )^{-1}\text {Var}\big ( S_m^\downarrow (t) \big ) <\infty . \end{aligned}$$

Proof

The proof is mostly the same as that of Lemma 5.5, so we only sketch the proof of the first statement. Appealing to Palm theory for Poisson processes (see, e.g., Theorem 1.6 in Penrose 2003),

$$\begin{aligned} \mathbb {E}\big [S_m^\uparrow (t) \big ] &= \frac{v_{m+1}^{k+2}}{(k+2)!}\, \int _{(\mathbb {R}^d)^{k+2}}h_t^{(k+2,1,+)}(x_1,\dots ,x_{k+2}) \, \\&\quad\times \mathbf{1}\big \{ \mathcal {M}(x_1,\dots ,x_{k+2}) \ge R_{q_m}\big \}\prod _{\ell =1}^{k+2} f(x_\ell ) d\mathbf{x}. \end{aligned}$$

By the same change of variables as in (75) with \(i=k+2\) and \(j=1\),

$$\begin{aligned}\mathbb {E}\big [ S_m^\uparrow (t)\big ]&=\frac{v_{m+1}^{k+2}R_{q_m}^d f(R_{q_m})^{k+2}}{(k+2)!}\, \int _{S^{d-1}}J(\theta )\int _1^{\infty }\rho ^{d-1}\int _{(\mathbb {R}^d)^{k+1}}h^{(k+2,1,+)}_t(0,\mathbf{y})\\&\quad \times \frac{f(R_{q_m}\rho )}{f(R_{q_m})} \prod _{\ell =1}^{k+1}\frac{f\big (R_{q_m}\Vert \rho \theta +y_\ell /R_{q_m} \Vert \big )}{f(R_{q_m})}\, \mathbf {1}\big \{\Vert \rho \theta +y_\ell /R_{q_m}\Vert \ge 1\big \} d\mathbf{y}\, d\rho \, d\theta . \end{aligned}$$

The rest of the argument is identical to the one following (75). \(\square\)
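As a heuristic check on the constant in the limit of \(W_m(t)\), suppose that f is regularly varying with index \(-\alpha\), which is how we read assumption (12), and that \(\alpha >d\). After the same change of variables, the integrand contains \(k+3\) density ratios, each of which converges to \(\rho ^{-\alpha }\) for fixed \(\rho\), \(\theta\) and \(\mathbf{y}\). The radial integral in the limit is therefore

$$\begin{aligned} \int _1^\infty \rho ^{d-1}\rho ^{-\alpha (k+3)}\, d\rho = \frac{1}{\alpha (k+3)-d}, \end{aligned}$$

which is finite because \(\alpha (k+3)>\alpha >d\). Multiplying by \(s_{d-1}\) from the angular integral and by \(1/(k+3)!\) from Palm theory recovers the stated constant. This is only a sketch: the interchange of limit and integral still requires the dominated convergence argument used after (75).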

1.7 Reverting Poisson U-statistics to the binomial counterpart

The next result shows that, under a proper scaling, the asymptotic behavior of \(S_m^\uparrow (t)\) and \(S_m^\downarrow (t)\) remains unchanged even if the Poisson point process is replaced by the corresponding binomial process.

Lemma 5.8

Under the assumptions of Theorem 3.1 (i), for every \(t\in [0,1]\) we have, as \(m\rightarrow \infty\),

$$\begin{aligned}&\big ( v_{m+1}^{k+2} R_{q_m}^df(R_{q_m})^{k+2} \big )^{-1} \\& \qquad \qquad \times \bigg \{ \sum \limits_{\underset{|{\mathcal {Y}}|=k+2} {{\mathcal {Y}}\subset {\mathcal {X}}_{v_{m+1}},}} h_t^{(k+2,1,+)}({\mathcal {Y}})\, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}}) \ge R_{q_m} \big \} -S_m^\uparrow (t) \bigg \}\rightarrow 0, \ \ \text {a.s.}, \\&\big ( v_{m}^{k+2} R_{p_m}^df(R_{p_m})^{k+2} \big )^{-1} \\& \qquad \qquad \times \bigg \{ \sum \limits_{\underset{|{\mathcal {Y}}|=k+2 }{{\mathcal {Y}}\subset {\mathcal {X}}_{v_{m}},} } h_t^{(k+2,1,+)}({\mathcal {Y}})\, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}}) \ge R_{p_m} \big \} -S_m^\downarrow (t) \bigg \}\rightarrow 0, \ \ \text {a.s.}, \end{aligned}$$

and further,

$$\begin{aligned}&\big ( v_{m+1}^{k+3} R_{q_m}^d f(R_{q_m})^{k+3}\big )^{-1} \\& \qquad \times \bigg \{ \sum \limits_{\underset{|{\mathcal {Y}}|=k+3} {{\mathcal {Y}}\subset {\mathcal {X}}_{v_{m+1}},} } \mathbf{1}\big \{ \check{C}({\mathcal {Y}}, t) \text { is connected}, \, \mathcal {M}({\mathcal {Y}})\ge R_{q_m} \big \} - W_m(t)\bigg \} \rightarrow 0, \text {a.s.} \end{aligned}$$

Proof

The proofs of these statements are essentially the same, so we show only the first result. By the Borel-Cantelli lemma and Markov’s inequality, it suffices to demonstrate that

$$\begin{aligned} &\begin{aligned} \sum _{m=1}^\infty&\big ( v_{m+1}^{k+2} R_{q_m}^df(R_{q_m})^{k+2} \big )^{-1} \\& \qquad \times \mathbb {E}\bigg [\, \bigg |\sum \limits_{\underset{|{\mathcal {Y}}|=k+2}{{\mathcal {Y}}\subset {\mathcal {X}}_{v_{m+1}},}}h_t^{(k+2,1,+)}({\mathcal {Y}})\, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}}) \ge R_{q_m} \big \} -S_m^\uparrow (t) \, \bigg |\, \bigg ]< \infty . \end{aligned} \end{aligned}$$
(87)

Recall that \(|\mathcal P_{v_{m+1}}|\) (i.e., the cardinality of \(\mathcal P_{v_{m+1}}\)) is Poisson distributed with parameter \(v_{m+1}\). Conditioning on the value of \(|\mathcal P_{v_{m+1}}|\), we get that

$$\begin{aligned} \begin{aligned}&\mathbb {E}\bigg [\, \bigg |\sum\limits _{\underset{|{\mathcal {Y}}|=k+2 } {{\mathcal {Y}}\subset {\mathcal {X}}_{v_{m+1}},} } h_t^{(k+2,1,+)}({\mathcal {Y}})\, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}}) \ge R_{q_m} \big \} -S_m^\uparrow (t)\, \bigg |\, \bigg ] \\&=\sum _{\ell =0}^\infty \mathbb {E}\bigg [\, \bigg |\sum _{{\mathcal {Y}}\subset {\mathcal {X}}_{v_{m+1}}, \, |{\mathcal {Y}}|=k+2} h_t^{(k+2,1,+)}({\mathcal {Y}})\, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}}) \ge R_{q_m} \big \} \\&\quad- \sum \limits_{\underset{|{\mathcal {Y}}|=k+2} {{\mathcal {Y}}\subset {\mathcal {X}}_\ell ,} } h_t^{(k+2,1,+)}({\mathcal {Y}})\, \mathbf{1}\big \{ \mathcal {M}({\mathcal {Y}}) \ge R_{q_m} \big \} \bigg |\, \bigg ] \mathbb {P}\big ( |\mathcal P_{v_{m+1}}|=\ell \big ) \\&= \sum _{\ell =0}^\infty \Big |\left( {\begin{array}{c}\ell \\ k+2\end{array}}\right) -\left( {\begin{array}{c}v_{m+1}\\ k+2\end{array}}\right) \Big |\, \mathbb {E}\big [ h_t^{(k+2,1,+)}(X_1,\dots ,X_{k+2})\,\\&\quad\times \mathbf{1}\big \{ \mathcal {M}(X_1,\dots ,X_{k+2})\ge R_{q_m} \big \} \big ] \mathbb {P}\big ( |\mathcal P_{v_{m+1}}|=\ell \big ), \end{aligned} \end{aligned}$$
(88)

where \(X_1,\dots , X_{k+2}\) are i.i.d. random variables with density f. Proceeding as in the proof of Lemma 5.5, we can derive that

$$\begin{aligned}\mathbb {E}\big [ h_t^{(k+2,1,+)}(X_1,\dots ,X_{k+2})\, \mathbf{1}\big \{ \mathcal {M}(X_1,\dots ,X_{k+2})\ge R_{q_m} \big \} \big ] \\ \sim C^* R_{q_m}^d f(R_{q_m})^{k+2}\ , \ \ m\rightarrow \infty . \end{aligned}$$

Substituting this back into (88), the left hand side in (87) is now bounded by a constant multiple of

$$\begin{aligned} \begin{aligned}&\sum _{m=1}^\infty \frac{1}{v_{m+1}^{k+2}}\, \sum _{\ell =0}^\infty \Big |\left( {\begin{array}{c}\ell \\ k+2\end{array}}\right) -\left( {\begin{array}{c}v_{m+1}\\ k+2\end{array}}\right) \Big |\, \mathbb {P}\big ( |\mathcal P_{v_{m+1}}|=\ell \big )\\&=\sum _{m=1}^\infty \frac{1}{v_{m+1}^{k+2}}\, \mathbb {E}\bigg [ \Big |\left( {\begin{array}{c}|\mathcal P_{v_{m+1}}|\\ k+2\end{array}}\right) -\left( {\begin{array}{c}v_{m+1}\\ k+2\end{array}}\right) \Big |\bigg ] \\&\le \sum _{m=1}^\infty \frac{1}{v_{m+1}^{k+2}}\, \bigg \{ \mathbb {E}\bigg [ \left( {\begin{array}{c}|\mathcal P_{v_{m+1}}|\\ k+2\end{array}}\right) ^2 \bigg ] - 2\left( {\begin{array}{c}v_{m+1}\\ k+2\end{array}}\right) \mathbb {E}\bigg [ \left( {\begin{array}{c}|\mathcal P_{v_{m+1}}|\\ k+2\end{array}}\right) \bigg ] + \left( {\begin{array}{c}v_{m+1}\\ k+2\end{array}}\right) ^2 \bigg \}^{1/2}, \end{aligned} \end{aligned}$$
(89)

where the last relation is due to the Cauchy-Schwarz inequality. It is elementary to check that there are constants \(c_j\), \(j=1,2,\dots ,2k+4\), with \(c_{2k+4}=1\), such that

$$\begin{aligned}&\mathbb {E}\bigg [ \left( {\begin{array}{c}|\mathcal P_{v_{m+1}}|\\ k+2\end{array}}\right) \bigg ] = \frac{v_{m+1}^{k+2}}{(k+2)!}, \ \ \text {and } \ \mathbb {E}\bigg [ \left( {\begin{array}{c}|\mathcal P_{v_{m+1}}|\\ k+2\end{array}}\right) ^2 \bigg ] =\frac{1}{\big ((k+2)!\big )^2}\, \sum _{j=1}^{2k+4}c_j v_{m+1}^j. \end{aligned}$$
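The first identity is the standard factorial-moment formula for a Poisson variable. Writing \(N=|\mathcal P_{v_{m+1}}|\), \(v=v_{m+1}\) and \(r=k+2\) (shorthand introduced only for this remark),

$$\begin{aligned} \mathbb {E}\bigg [ \left( {\begin{array}{c}N\\ r\end{array}}\right) \bigg ] = \frac{1}{r!}\sum _{\ell =r}^\infty \frac{\ell !}{(\ell -r)!}\, e^{-v}\frac{v^\ell }{\ell !} = \frac{v^r}{r!}\sum _{\ell =r}^\infty e^{-v}\frac{v^{\ell -r}}{(\ell -r)!} = \frac{v^r}{r!}. \end{aligned}$$

For the second identity, \(\big ( N(N-1)\cdots (N-r+1)\big )^2\) is a polynomial in N of degree 2r with leading coefficient 1, hence a linear combination of the falling factorials \(N(N-1)\cdots (N-j+1)\), \(j\le 2r\), with coefficient 1 on \(j=2r\). Taking expectations replaces each falling factorial by \(v^j\), which gives a polynomial in v of degree \(2r=2k+4\) with \(c_{2k+4}=1\).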

Therefore, the last expression in (89) can be written as

$$\begin{aligned} \sum _{m=1}^\infty \frac{1}{v_{m+1}^{k+2}} \cdot \frac{1}{(k+2)!}\Big ( \sum _{j=1}^{2k+3}c_j' v_{m+1}^j \Big )^{1/2} \end{aligned}$$
(90)

for some constants \(c_j'\), \(j=1,\dots ,2k+3\) (note that \(v_{m+1}^{2k+4}\) has disappeared here). Finally, (90) is further bounded by

$$\begin{aligned} C^*\sum _{m=1}^\infty \frac{1}{v_{m+1}^{1/2}} \le C^*\sum _{m=1}^\infty e^{-m^\gamma /2} < \infty . \end{aligned}$$

\(\square\)
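The cancellation of the top-order term in (90) can be checked directly. With the shorthand \(N=|\mathcal P_{v_{m+1}}|\), \(v=v_{m+1}\) and \(r=k+2\) (introduced only for this remark), each of the three terms inside the braces in (89) has the same leading coefficient:

$$\begin{aligned} \mathbb {E}\bigg [ \left( {\begin{array}{c}N\\ r\end{array}}\right) ^2 \bigg ] - 2\left( {\begin{array}{c}v\\ r\end{array}}\right) \mathbb {E}\bigg [ \left( {\begin{array}{c}N\\ r\end{array}}\right) \bigg ] + \left( {\begin{array}{c}v\\ r\end{array}}\right) ^2 = \frac{(1-2+1)\, v^{2r}}{(r!)^2} + O\big ( v^{2r-1}\big ) = O\big ( v^{2r-1}\big ). \end{aligned}$$

Taking the square root and dividing by \(v^{r}\) gives \(v^{-r}\cdot O\big ( v^{r-1/2}\big ) = O\big ( v^{-1/2}\big )\), which is the summand in the final bound. The last inequality of the proof then uses \(v_{m+1}\ge e^{m^\gamma }\); we take this from the definition of \(v_m\) given earlier in the paper, which is not restated in this section.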


Cite this article

Owada, T., Wei, Z. Functional strong law of large numbers for Betti numbers in the tail. Extremes 25, 655–693 (2022). https://doi.org/10.1007/s10687-022-00441-x
