1 Introduction

Let \(({\mathcal M},{\mathscr {F}},\nu )\) be a probability space and consider a measure-preserving dynamical system

$$\begin{aligned} \varphi ^t:{\mathcal M}\rightarrow {\mathcal M}. \end{aligned}$$
(1.1)

A fundamental question is how often a trajectory with random initial data \(x\in {\mathcal M}\) intersects a given target set \({\mathcal D}\in {\mathscr {F}}\) within time t. If \({\mathcal D}\) is fixed, this problem has led to many important developments in ergodic theory, which show that, if \(\varphi ^t\) is sufficiently “chaotic” (e.g., partially hyperbolic), the number of intersections satisfies a central limit theorem and more general invariance principles. One of the first results in this direction was Sinai’s proof of the central limit theorem for geodesic flows [42] and, with Bunimovich, the finite-horizon Lorentz gas [10]. We refer the reader to [2, 15, 21, 47] for further references to the literature on this subject. In the case of non-hyperbolic dynamical systems, such as horocycle flows or toral translations, the classical stable limit laws generally fail and must be replaced by system-dependent limit theorems [68, 16, 17, 22]. If on the other hand one considers a sequence of target sets \({\mathcal D}_\rho \in {\mathscr {F}}\) such that \(\nu ({\mathcal D}_\rho )\rightarrow 0\) as \(\rho \rightarrow 0\), then the number of intersections within time t (now measured in units of the mean return time to \({\mathcal D}_\rho \)) satisfies a Poisson limit law, provided \(\varphi ^t\) is mixing with sufficiently rapid decay of correlations. The first results of this type were proved by Pitskel [36] for Markov chains, and by Hirata [25] in the case of Axiom A diffeomorphisms by employing transfer operator techniques and the Ruelle zeta function. (Hirata’s paper was in fact motivated by Sinai’s work [43, 44] on the Poisson distribution for quantum energy levels of generic integrable Hamiltonians, following a conjecture by Berry and Tabor [3, 32] in the context of quantum chaos.) For more recent studies on the Poisson law for hitting times in “chaotic” dynamical systems, see [1, 11, 20, 23, 24, 29, 39] and references therein.

In the present paper we prove analogous limit theorems for integrable Hamiltonian flows \(\varphi ^t\), which are not Poisson yet universal in the sense that they do not depend on the fine features of the individual system considered. The principal result of this study is explained in Sect. 2 for the case of flows with two degrees of freedom, where the target set is a union of small intervals of varying position, length and orientation on each Liouville torus. In the limit of vanishing target size, the sequence of hitting times converges to a limiting process which is described in Sect. 3. Sections 4 and 5 illustrate the universality of our limit distribution in the case of two classic examples: the motion of a particle in a central force field and the billiard dynamics in an ellipse. In both cases, the limit process for the hitting times, measured in units of the mean return time on each Liouville torus, is independent of the choice of potential or ellipse, and in fact only depends on the number of connected components of the target set on the invariant torus. The results of Sect. 3 are generalized in Sect. 6 to integrable flows with d degrees of freedom, where unions of small intervals are replaced by unions of shrinking dilations of k given target sets. The key ingredient in the proof of the limit theorems for hitting time statistics is the equidistribution of translates of certain submanifolds in the homogeneous space \(G/\Gamma \), where \(G={\text {SL}}(d,{\mathbb R})\ltimes ({\mathbb R}^d)^k\) and \(\Gamma ={\text {SL}}(d,{\mathbb Z})\ltimes ({\mathbb Z}^d)^k\). These results, which are stated and proved in Sect. 7, generalize the equidistribution theorems by Elkies and McMullen [18] in the case of nonlinear horocycles (\(d=2\), \(k=1\)), and are based on Ratner’s celebrated measure classification theorem. The application of these results to the hitting times is carried out in Sect. 8, and builds on our earlier work for the linear flow on a torus [34].

2 Integrable Flows with Two Degrees of Freedom

To keep the presentation as transparent as possible, we first restrict our attention to Hamiltonian flows with two degrees of freedom, whose phase space is the four-dimensional symplectic manifold \({\mathcal X}\). (The higher dimensional case is treated in Sect. 6.) The basic example is of course \({\mathcal X}={\mathbb R}^2\times {\mathbb R}^2\), where the first factor represents the particle’s position and the second its momentum. To keep the setting more general, we will not assume Liouville-integrability on the entire phase space, but only on an open subset \({\mathcal M}\subset {\mathcal X}\), a so-called integrable island. Liouville integrability [5, Sect. 1.4] implies that there is a foliation (the Liouville foliation) of \({\mathcal M}\) by two-dimensional leaves. Regular leaves are smooth Lagrangian submanifolds of \({\mathcal M}\) that fill \({\mathcal M}\) bar a set of measure zero. A compact and connected regular leaf is called a Liouville torus. Every Liouville torus has a neighbourhood that can be parametrised by action-angle variables \((\varvec{\theta },{\varvec{J}})\in {\mathbb T}^2\times {\mathcal U}\), where \({\mathbb T}^2={\mathbb R}^2/{\mathbb Z}^2\) and \({\mathcal U}\) is a bounded open subset of \({\mathbb R}^2\). In these coordinates the Hamiltonian flow is given by

$$\begin{aligned} \varphi ^t: {\mathbb T}^2\times {\mathcal U}\rightarrow {\mathbb T}^2\times {\mathcal U}, \quad (\varvec{\theta },{\varvec{J}}) \mapsto (\varvec{\theta }+ t\, \varvec{f}({\varvec{J}}),{\varvec{J}}) , \end{aligned}$$
(2.1)

with the smooth Hamiltonian vector field \(\varvec{f}=\nabla _{\varvec{J}}H\). In what follows, the Hamiltonian structure is in fact completely irrelevant, and we will assume \({\mathcal U}\) is a bounded open subset of \(\mathbb {R}^m\) (\(m\ge 1\) arbitrary), and \(\varvec{f}:{\mathcal U}\rightarrow {\mathbb R}^2\) a smooth function. We will refer to the corresponding \(\varphi ^t\) in (2.1) simply as an integrable flow. Even in the Hamiltonian setting, it is often not necessary to represent the dynamics in action-angle variables to apply our theory; cf. the examples of the central force field and billiards in ellipses discussed in Sects. 4 and 5.
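
In code, the flow (2.1) is nothing but a translation on the torus at a \({\varvec{J}}\)-dependent speed. A minimal sketch (the frequency map \(\varvec{f}\) below is a hypothetical example, not taken from the text):

```python
import math

def flow(theta, J, t, f):
    """Integrable flow (2.1): translate the angles on T^2 = R^2/Z^2 at the
    constant speed f(J); the action variable J is a constant of motion."""
    fx, fy = f(J)
    return ((theta[0] + t * fx) % 1.0, (theta[1] + t * fy) % 1.0), J

# Hypothetical frequency map on a one-dimensional action domain U = (1, 2):
f = lambda J: (1.0, math.sqrt(J))

theta1, J1 = flow((0.25, 0.5), 1.5, 2.0, f)
```

Note that the action variable is returned unchanged, reflecting that each leaf \({\mathbb T}^2\times \{{\varvec{J}}\}\) is invariant under the flow.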

We will consider random initial data \((\varvec{\theta },{\varvec{J}})\) that is distributed according to a given Borel probability measure \(\Lambda \) on \({\mathbb T}^2\times {\mathcal U}\). One example is

$$\begin{aligned} \Lambda ={\text {Leb}}_{{\mathbb T}^2}\times \lambda , \end{aligned}$$
(2.2)

where \({\text {Leb}}_{{\mathbb T}^2}\) is the uniform probability measure on \({\mathbb T}^2\) and \(\lambda \) is a given absolutely continuous Borel probability measure on \({\mathcal U}\). This choice of \(\Lambda \) is \(\varphi ^t\)-invariant. One of the key features of this work is that our conclusions also hold for more singular and non-invariant measures \(\Lambda \), such as \(\Lambda =\delta _{\varvec{\theta }_0}\times \lambda \), where \(\delta _{\varvec{\theta }_0}\) is a point mass at \(\varvec{\theta }_0\). The most general setting we will consider is to define \(\Lambda \) as the push-forward of a given (absolutely continuous) probability measure \(\lambda \) on \({\mathcal U}\) by the map \({\varvec{J}}\mapsto (\varvec{\theta }({\varvec{J}}),{\varvec{J}})\), where \(\varvec{\theta }:{\mathcal U}\rightarrow {\mathbb T}^2\) is a fixed smooth map; this means that we consider random initial data in \({\mathbb T}^2\times {\mathcal U}\) of the form \((\varvec{\theta }({\varvec{J}}),{\varvec{J}})\), where \({\varvec{J}}\) is a random point in \({\mathcal U}\) distributed according to \(\lambda \). This is the set-up that we use in the formulation of our main result, Theorem 1 below. We will demonstrate in Remark 2.1 that this setting is indeed rather general, and allows a greater selection of measures than is apparent; for instance, invariant measures of the form (2.2) can be realized within this framework.

We also note that the smoothness assumptions on \(\varvec{f}\) and \(\varvec{\theta }\) are less restrictive than they may appear: we can allow discontinuities in the derivatives of these maps, provided there is an open subset \({\mathcal U}'\subset {\mathcal U}\) with \(\lambda ({\mathcal U}\setminus {\mathcal U}')=0\), so that the restrictions of \(\varvec{f}\) and \(\varvec{\theta }\) to \({\mathcal U}'\) are smooth. Furthermore, the smoothness requirements are a result of an application of Sard's theorem in Theorem 11 and may in fact be replaced by finite differentiability conditions.

We consider target sets \({\mathcal D}_\rho ={\mathcal D}_\rho ^{(k)}\) that, in each leaf, appear as disjoint unions of k short intervals transversal to the flow direction. To give a precise definition of \({\mathcal D}_\rho \), fix smooth functions \(\varvec{u}_j:{\mathcal U}\rightarrow {\text {S}}^1\), \(\varvec{\phi }_j:{\mathcal U}\rightarrow {\mathbb T}^2\), and \(\ell _j:{\mathcal U}\rightarrow {\mathbb R}_{>0}\) (\(j=1,\ldots ,k\)) which describe the orientation, midpoint and length of the jth interval in each leaf. Set

$$\begin{aligned} {\mathcal D}_\rho ^{(k)} = \bigcup _{j=1}^k {\mathcal D}(\varvec{u}_j,\varvec{\phi }_j,\rho \ell _j) , \end{aligned}$$
(2.3)

where

$$\begin{aligned} {\mathcal D}(\varvec{u},\varvec{\phi },\ell ) := \left\{ \left( \varvec{\phi }({\varvec{J}})+s \varvec{u}({\varvec{J}})^\perp ,{\varvec{J}}\right) \in {\mathbb T}^2\times {\mathcal U}\,\bigg |\, -\frac{\ell ({\varvec{J}})}{2}<s<\frac{\ell ({\varvec{J}})}{2} \right\} , \end{aligned}$$
(2.4)

with \(\varvec{u}({\varvec{J}})^\perp \) denoting a unit vector perpendicular to \(\varvec{u}({\varvec{J}})\). This yields, in each leaf \({\mathbb T}^2\times \{{\varvec{J}}\}\), a union of k intervals, where the jth interval has length \(\rho \ell _j({\varvec{J}})\), is centered at \(\varvec{\phi }_j({\varvec{J}})\) and perpendicular to \(\varvec{u}_j({\varvec{J}})\). As mentioned, we assume that each interval is transversal to the flow direction, i.e. \(\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})\ne 0\) for all \(j\in \{1,\ldots ,k\}\) and all \({\varvec{J}}\in {\mathcal U}\); in fact we will even assume \(\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})>0\), without any loss of generality.

Now, for any initial condition \((\varvec{\theta },{\varvec{J}})\), the set of hitting times

$$\begin{aligned} {\mathcal T}(\varvec{\theta },{\varvec{J}},{\mathcal D}_\rho ) :=\{ t >0 \mid \varphi ^t(\varvec{\theta },{\varvec{J}}) \in {\mathcal D}_\rho \} \end{aligned}$$
(2.5)

is a discrete (possibly empty) subset of \({\mathbb R}_{>0}\), the elements of which we label by

$$\begin{aligned} 0<t_1(\varvec{\theta },{\varvec{J}},{\mathcal D}_\rho )<t_2(\varvec{\theta },{\varvec{J}},{\mathcal D}_\rho )<\ldots . \end{aligned}$$
(2.6)

We call \(t_i(\varvec{\theta },{\varvec{J}},{\mathcal D}_\rho )\) the ith entry time to \({\mathcal D}_\rho \) if \((\varvec{\theta },{\varvec{J}})\notin {\mathcal D}_\rho \), and the ith return time to \({\mathcal D}_\rho \) if \((\varvec{\theta },{\varvec{J}})\in {\mathcal D}_\rho \). A simple volume argument (Santaló's formula [12]) shows that for any fixed \({\varvec{J}}\in {\mathcal U}\) such that the components of \(\varvec{f}({\varvec{J}})\) are not rationally related, the first return time to \({\mathcal D}_\rho \) on the leaf \({\mathbb T}^2\times \{{\varvec{J}}\}\) satisfies the formula

$$\begin{aligned} \int _{{\mathcal D}_\rho }t_1(\varvec{\theta },{\varvec{J}},{\mathcal D}_\rho )\,d\nu _{\varvec{J}}(\varvec{\theta })=1, \end{aligned}$$
(2.7)

where \(\nu _{\varvec{J}}\) is the invariant measure on \({\mathcal D}_\rho \) obtained by disintegrating Lebesgue measure on \({\mathbb T}^2\times \{{\varvec{J}}\}\) with respect to the section \({\mathcal D}_\rho \) of the flow \(\varphi ^t\). The measure \(\nu _{\varvec{J}}\) is explicitly given by

$$\begin{aligned} \int _{{\mathcal D}_\rho } g \,d\nu _{\varvec{J}}= \sum _{j=1}^k \left( \varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})\right) \int _{-\rho \ell _j({\varvec{J}})/2}^{\rho \ell _j({\varvec{J}})/2} g \left( \varvec{\phi }_j({\varvec{J}})+s \varvec{u}_j({\varvec{J}})^\perp ,\,{\varvec{J}}\right) \,ds, \qquad \forall g \in {\text {C}}({\mathcal D}_\rho ). \end{aligned}$$
(2.8)

Recall that by transversality \(\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})>0\). It follows that the mean return time with respect to \(\nu _{\varvec{J}}\) equals

$$\begin{aligned} \frac{\overline{\sigma }^{(k)}({\varvec{J}})}{\rho },\qquad \text {where } \overline{\sigma }^{(k)} ({\varvec{J}}):= \frac{1}{\sum _{j=1}^k \ell _j({\varvec{J}})\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})}. \end{aligned}$$
(2.9)

If we also average over \({\varvec{J}}\) with respect to the measure \(\lambda \), the mean return time becomes

$$\begin{aligned} \frac{\overline{\sigma }^{(k)}_\lambda }{\rho },\qquad \text {where } \overline{\sigma }_\lambda ^{(k)} := \int _{\mathcal U}\overline{\sigma }^{(k)} ({\varvec{J}})\, \lambda (d{\varvec{J}}) . \end{aligned}$$
(2.10)

We have assumed here that the pushforward of \(\lambda \) by \(\varvec{f}\) has no atoms at points with rationally related coordinates. This holds in particular if \(\lambda \) is \(\varvec{f}\)-regular as defined below.
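
On a fixed leaf, the hitting times (2.5) can be computed exactly: lifting to \({\mathbb R}^2\), the condition \(\varvec{\theta }+t\varvec{f}({\varvec{J}})\in {\mathcal D}(\varvec{u},\varvec{\phi },\ell )\) reads \(\varvec{\theta }+t\varvec{f}=\varvec{\phi }+s\varvec{u}^\perp +\varvec{m}\) for some \(\varvec{m}\in {\mathbb Z}^2\), and taking the inner product with \(\varvec{u}\) solves for t. The following brute-force sketch (our own illustration, with all numerical data hypothetical) enumerates the integer translates \(\varvec{m}\) in a sufficiently large box:

```python
def hitting_times(theta, phi, u, f, length, t_max):
    """All t in (0, t_max] at which theta + t f (mod Z^2) crosses the open
    interval of the given length centered at phi and perpendicular to the
    unit vector u (cf. (2.4)-(2.5)).  Solving
        theta + t f = phi + s u_perp + m,   m in Z^2,
    against u gives t = ((phi + m - theta) . u) / (f . u), and s is then the
    component of the crossing point along u_perp."""
    u_perp = (-u[1], u[0])
    fu = f[0] * u[0] + f[1] * u[1]      # nonzero by transversality
    Mx = int(t_max * abs(f[0])) + 3     # box of integer translates m
    My = int(t_max * abs(f[1])) + 3
    times = []
    for m0 in range(-Mx, Mx + 1):
        for m1 in range(-My, My + 1):
            t = ((phi[0] + m0 - theta[0]) * u[0]
                 + (phi[1] + m1 - theta[1]) * u[1]) / fu
            if not 0.0 < t <= t_max:
                continue
            s = ((theta[0] + t * f[0] - phi[0] - m0) * u_perp[0]
                 + (theta[1] + t * f[1] - phi[1] - m1) * u_perp[1])
            if abs(s) < length / 2:
                times.append(t)
    return sorted(times)

times = hitting_times((0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.0, 0.0), 0.2, 3.0)
```

For example, with a single interval of length 0.2 centered at \((1/2,0)\), perpendicular to \(\varvec{u}=(1,0)\), and \(\varvec{f}=(1,0)\), the crossings within \(t\le 3\) occur at \(t=0.5,1.5,2.5\).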

For \({\varvec{J}}\) a random point in \({\mathcal U}\) distributed according to \(\lambda \), the hitting times \(t_n(\varvec{\theta }({\varvec{J}}),{\varvec{J}},{\mathcal D}_\rho ^{(k)})\) become random variables, which we denote by \(\tau _{n,\rho }^{(k)}\). Also \(\overline{\sigma }^{(k)}({\varvec{J}})\) becomes a random variable, which we denote by \(\overline{\sigma }^{(k)}\). In this paper, we are interested in the distribution of the sequence of entry times \(\tau ^{(k)}_{n,\rho }\) rescaled by the mean return time (2.10), or by the conditional mean return time (2.9).

Finally we introduce two technical conditions. Note that \(\varvec{f}({\varvec{J}})\ne \mathbf {0}\) for all \({\varvec{J}}\in {\mathcal U}\), by the transversality assumption made previously. We say that \(\lambda \) is \(\varvec{f}\)-regular if the pushforward of \(\lambda \) under the map

$$\begin{aligned} {\mathcal U}\rightarrow {\text {S}}^1, \qquad {\varvec{J}}\mapsto \frac{\varvec{f}({\varvec{J}})}{\Vert \varvec{f}({\varvec{J}})\Vert }, \end{aligned}$$
(2.11)

is absolutely continuous with respect to Lebesgue measure on \({\text {S}}^1\). We say a k-tuple of smooth functions \(\varvec{\phi }_1,\ldots ,\varvec{\phi }_k:{\mathcal U}\rightarrow {\mathbb T}^2\) is \((\varvec{\theta },\lambda )\)-generic, if for all \(\varvec{m}=(m_1,\ldots ,m_k)\in {\mathbb Z}^k\setminus \{\varvec{0}\}\) we have

$$\begin{aligned} \lambda \left( \left\{ {\varvec{J}}\in {\mathcal U}\, : \,\sum _{j=1}^k m_j \, \left( \varvec{\phi }_j({\varvec{J}}) -\varvec{\theta }({\varvec{J}})\right) \in {\mathbb R}\varvec{f}({\varvec{J}}) + {\mathbb Q}^2 \right\} \right) = 0. \end{aligned}$$
(2.12)

The following is the main result of this paper.

Theorem 1

Let \(\varvec{f}:{\mathcal U}\rightarrow {\mathbb R}^2\) and \(\varvec{\theta }:{\mathcal U}\rightarrow {\mathbb T}^2\) be smooth maps, \(\lambda \) an absolutely continuous Borel probability measure on \({\mathcal U}\), and for \(j=1,\ldots ,k\), let \(\varvec{u}_j:{\mathcal U}\rightarrow {\text {S}}^1\), \(\varvec{\phi }_j:{\mathcal U}\rightarrow {\mathbb T}^2\) and \(\ell _j:{\mathcal U}\rightarrow {\mathbb R}_{>0}\) be smooth maps. Assume \(\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})>0\) for all \({\varvec{J}}\in {\mathcal U}\), \(j\in \{1,\ldots ,k\}\). Also assume that \(\lambda \) is \(\varvec{f}\)-regular and \((\varvec{\phi }_1,\ldots ,\varvec{\phi }_k)\) is \((\varvec{\theta },\lambda )\)-generic. Then there are sequences of random variables \((\tau _i)_{i=1}^\infty \) and \((\widetilde{\tau }_i)_{i=1}^\infty \) in \({\mathbb R}_{>0}\) such that in the limit \(\rho \rightarrow 0\), for every positive integer N,

$$\begin{aligned} \left( \frac{\rho \tau _{1,\rho }^{(k)}}{\overline{\sigma }_\lambda ^{(k)}},\ldots ,\frac{\rho \tau _{N,\rho }^{(k)}}{\overline{\sigma }_\lambda ^{(k)}} \right) \,\,\buildrel \mathrm{d}\over \longrightarrow \,\,(\tau _1,\ldots ,\tau _N), \end{aligned}$$
(2.13)

and

$$\begin{aligned} \left( \frac{\rho \tau _{1,\rho }^{(k)}}{\overline{\sigma }^{(k)}},\ldots ,\frac{\rho \tau _{N,\rho }^{(k)}}{\overline{\sigma }^{(k)}} \right) \,\,\buildrel \mathrm{d}\over \longrightarrow \,\,(\widetilde{\tau }_1,\ldots ,\widetilde{\tau }_N). \end{aligned}$$
(2.14)

Note that if \(\overline{\sigma }_\lambda ^{(k)}=\infty \) then (2.13) is trivial, with \(\tau _i=0\) for all i, since \(\tau _{i,\rho }^{(k)}<\infty \) a.s. for every fixed \(\rho \).

Remark 2.1

Recall that Theorem 1 assumes that the initial data is \((\varvec{\theta }({\varvec{J}}),{\varvec{J}})\) with \({\varvec{J}}\in {\mathcal U}\) distributed according to \(\lambda \). This seems to exclude natural choices such as invariant measures of the form (2.2). Let us demonstrate that this is not the case. The setting of Theorem 1 (as well as its generalisation to arbitrary dimension \(d\ge 2\), Theorem 2 below) in fact permits random initial data \((\varvec{\theta },{\varvec{J}})\) distributed according to any probability measure \(\Lambda \) on \({\mathbb T}^d\times {\mathcal U}\) of the form \(\Lambda =\iota _*\lambda _0\), where \(\lambda _0\) is an absolutely continuous Borel probability measure on an open subset \({\mathcal U}_0\subset \mathbb {R}^{m_0}\) for some \(m_0\in \mathbb {Z}^+\), and some smooth map \(\iota :{\mathcal U}_0\rightarrow {\mathbb T}^d\times {\mathcal U}\). Indeed, such \(\Lambda \) can be realized within the setting of Theorem 1 by using

$$\begin{aligned} {\mathcal U}_0, \quad \varvec{f}_0:=\varvec{f}\circ {\text {pr}}_2\circ \iota , \quad \varvec{\theta }_0:={\text {pr}}_1\circ \iota , \quad \lambda _0 \end{aligned}$$
(2.15)

in place of

$$\begin{aligned} {\mathcal U}, \quad \varvec{f}, \quad \varvec{\theta }, \quad \lambda , \end{aligned}$$
(2.16)

where \({\text {pr}}_1,{\text {pr}}_2\) are the projection maps from \({\mathbb T}^d\times {\mathcal U}\) to \({\mathbb T}^d\) and \({\mathcal U}\), respectively. Of course, for Theorem 1 to apply we need to assume that \(\lambda _0\) is \(\varvec{f}_0\)-regular, and that \((\varvec{\phi }_1,\ldots ,\varvec{\phi }_k)\) is \((\varvec{\theta }_0,\lambda _0)\)-generic.

Remark 2.2

We describe the limit sequences \((\tau _i)_{i=1}^\infty \) and \((\widetilde{\tau }_i)_{i=1}^\infty \) in Sect. 3. A particular highlight is that in the case of a single target (\(k=1\)), or in the case of multiple targets with the same lengths \(\ell _1=\ldots =\ell _k\) and orientation \(\varvec{u}_1=\ldots =\varvec{u}_k\), the distribution of \((\widetilde{\tau }_i)_{i=1}^\infty \) is universal. This means that it is independent of the choice of \({\mathcal U}\), \(\varvec{f}\), \(\lambda \), target orientations, positions and sizes. In fact a weaker form of universality holds also in the general case, and for both \((\tau _i)_{i=1}^\infty \) and \((\widetilde{\tau }_i)_{i=1}^\infty \). Indeed, let us define the target weight functions \(\varvec{L}=(L_1,\ldots ,L_k)\) and \(\widetilde{\varvec{L}}=(\widetilde{L}_1,\ldots ,\widetilde{L}_k)\) from \({\mathcal U}\) to \((\mathbb {R}_{>0})^k\), through

$$\begin{aligned} L_j({\varvec{J}})=\overline{\sigma }_\lambda ^{(k)}\,\ell _j({\varvec{J}})\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}}) \end{aligned}$$
(2.17)

and

$$\begin{aligned} \widetilde{L}_j({\varvec{J}})=\overline{\sigma }^{(k)}({\varvec{J}})\,\ell _j({\varvec{J}})\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}}). \end{aligned}$$
(2.18)

Then the distribution of \((\tau _i)_{i=1}^\infty \) depends on the system data only via the distribution of \(\varvec{L}({\varvec{J}})\) for \({\varvec{J}}\) random in \({\mathcal U}\) according to \(\lambda \), and similarly \(({\widetilde{\tau }}_i)_{i=1}^\infty \) depends only on the distribution of \(\widetilde{\varvec{L}}({\varvec{J}})\). Furthermore, both \((\tau _i)_{i=1}^\infty \) and \(({\widetilde{\tau }}_i)_{i=1}^\infty \) yield stationary point processes, i.e. the random set of time points \(\{\tau _i\}\) has the same distribution as \(\{\tau _i-t\}\cap \mathbb {R}_{>0}\) for every fixed \(t\ge 0\), and similarly for \(\{{\widetilde{\tau }}_i\}\) (cf. Sect. 6).
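
As a quick sanity check on (2.17)–(2.18): by the definition (2.9) of \(\overline{\sigma }^{(k)}({\varvec{J}})\), the conditional weights \(\widetilde{L}_j({\varvec{J}})\) sum to 1 on every leaf. A short sketch with hypothetical data for \(\varvec{f}\), \(\varvec{u}_j\), \(\ell _j\):

```python
import math

# Hypothetical data with k = 2 targets over a one-dimensional action domain:
f = lambda J: (1.0, math.sqrt(J))                  # frequency vector f(J)
u = [lambda J: (1.0, 0.0), lambda J: (0.0, 1.0)]   # orientations u_j(J), u_j . f > 0
ell = [lambda J: 0.7, lambda J: 1.3]               # lengths ell_j(J)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def sigma_bar(J):
    """Conditional mean return time factor (2.9)."""
    return 1.0 / sum(l(J) * dot(uj(J), f(J)) for l, uj in zip(ell, u))

def L_tilde(J):
    """Conditional target weights (2.18); by (2.9) they sum to one."""
    return [sigma_bar(J) * l(J) * dot(uj(J), f(J)) for l, uj in zip(ell, u)]

weights = L_tilde(1.5)
```

In particular, for equal lengths and orientations each weight equals \(1/k\), which is the universal case highlighted above.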

Remark 2.3

Theorem 1 is stated for the convergence of entry time distributions. It is a general fact that the convergence of entry time distributions implies the convergence of return time distributions and vice versa, with a simple formula relating the two [33].

Fig. 1

Numerically computed \(F_1(s)\) and \(F_2(s)\), compared with the exponential function \(\mathrm {e}^{-s}\) and the explicit formula (3.12) for \(F_1(s)\). The inset shows the difference between the numerically computed \(F_1(s)\) and (3.12)

3 The Limit Distribution

We will now describe the limit processes \((\tau _i)_{i=1}^\infty \) and \((\widetilde{\tau }_i)_{i=1}^\infty \) in terms of elementary random variables in the unit cube. A more conceptual description in terms of Haar measure of the special linear group \({\text {SL}}(2,{\mathbb R})\) will be given in Sect. 6.

Pick a uniformly distributed random point \((a,b,c)\) in the unit cube \((0,1)^3\). The push-forward of the uniform probability measure under the diffeomorphism

$$\begin{aligned} (0,1)^3 \rightarrow F, \qquad (a,b,c) \mapsto \left( \sin (\tfrac{\pi }{3}(a-\tfrac{1}{2})) , \frac{\cos (\tfrac{\pi }{3}(a-\tfrac{1}{2}))}{1-b} ,\pi c\right) \end{aligned}$$
(3.1)

yields the probability measure \(\mu _F=\frac{3}{\pi ^2}\, y^{-2} dx\,dy\,d\theta \) on the domain

$$\begin{aligned} F=\bigl \{ (x,y,\theta )\in {\mathbb R}^3 \, : \,|x|<\tfrac{1}{2},\, x^2+y^2>1,\, y>0,\, 0<\theta <\pi \bigr \} . \end{aligned}$$
(3.2)

For \(x,y,\theta \in {\mathbb R}\) with \(y>0\) and \(0\le \theta <\pi \), consider the Euclidean lattice

$$\begin{aligned} {\mathcal L}(x,y,\theta ) = k_{\theta }\begin{pmatrix} \sqrt{y} &{}\quad 0 \\ 0 &{}\quad 1/\sqrt{y} \end{pmatrix} \begin{pmatrix} 1 &{}\quad 0\\ x &{}\quad 1\end{pmatrix} \mathbb {Z}^2, \qquad \text {where }\, k_\theta :=\begin{pmatrix} \cos \theta &{}\quad \sin \theta \\ -\sin \theta &{}\quad \cos \theta \end{pmatrix} . \end{aligned}$$
(3.3)

A basis for this lattice is given by the two vectors

$$\begin{aligned} \varvec{b}_1=y^{-1/2}\,k_{\theta }\left( \begin{matrix} y \\ x \end{matrix} \right) \quad \text {and}\quad \varvec{b}_2=y^{-1/2}\,k_{\theta }\left( \begin{matrix} 0 \\ 1 \end{matrix} \right) . \end{aligned}$$
(3.4)

Note that \(\det (\varvec{b}_1,\varvec{b}_2)=1\) and hence \({\mathcal L}(x,y,\theta ) \) has unit covolume. If we choose \((x,y,\theta )\) random according to the probability measure \(\mu _F\), then \({\mathcal L}(x,y,\theta )\) represents a random Euclidean lattice (of covolume one). Similarly, for \(\varvec{\alpha }\in {\mathbb T}^2\), the shifted lattice

$$\begin{aligned} {\mathcal L}(x,y,\theta ,\varvec{\alpha }) = k_{\theta }\begin{pmatrix} \sqrt{y} &{}\quad 0 \\ 0 &{}\quad 1/\sqrt{y} \end{pmatrix} \begin{pmatrix} 1 &{}\quad 0\\ x &{}\quad 1\end{pmatrix} (\mathbb {Z}^2+\varvec{\alpha }) \end{aligned}$$
(3.5)

represents a random affine Euclidean lattice if in addition \(\varvec{\alpha }\) is uniformly distributed in \({\mathbb T}^2\). For a given affine Euclidean lattice \({\mathcal L}\) and \(l>0\), consider the cut-and-project set

$$\begin{aligned} {\mathcal P}({\mathcal L}, l):= \bigg \{ y_1>0 : \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \in {\mathcal L},\; -\frac{l}{2}< y_2 <\frac{l}{2} \bigg \} \subset {\mathbb R}_{>0}. \end{aligned}$$
(3.6)
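
Both constructions are straightforward to put in code. The sketch below samples \((x,y,\theta )\) from \(\mu _F\) via the map (3.1), builds the affine lattice (3.5) from the basis (3.4), and lists the points of \({\mathcal P}({\mathcal L},l)\) up to a cutoff; the integer search box for the lattice coordinates \((m,n)\), obtained by mapping the corners of the strip through the inverse basis matrix, is our own device, not from the text:

```python
import math, random

def random_lattice_params(rng=random):
    """The diffeomorphism (3.1): pushes the uniform law on (0,1)^3 to mu_F."""
    a, b, c = rng.random(), rng.random(), rng.random()
    x = math.sin(math.pi / 3 * (a - 0.5))
    y = math.cos(math.pi / 3 * (a - 0.5)) / (1 - b)
    return x, y, math.pi * c

def cut_and_project(x, y, theta, alpha, l, s_max):
    """Points of P(L(x, y, theta, alpha), l) in (0, s_max], cf. (3.5)-(3.6)."""
    c, s = math.cos(theta), math.sin(theta)
    k = lambda v: (c * v[0] + s * v[1], -s * v[0] + c * v[1])
    r = y ** -0.5
    b1 = tuple(r * w for w in k((y, x)))     # basis vectors (3.4)
    b2 = tuple(r * w for w in k((0.0, 1.0)))
    beta = (alpha[0] * b1[0] + alpha[1] * b2[0],
            alpha[0] * b1[1] + alpha[1] * b2[1])
    det = b1[0] * b2[1] - b1[1] * b2[0]      # = 1: unit covolume
    # Integer bounds for (m, n): image of the strip's corners under the
    # inverse basis matrix.
    ms, ns = [], []
    for px in (0.0, s_max):
        for py in (-l / 2, l / 2):
            qx, qy = px - beta[0], py - beta[1]
            ms.append((b2[1] * qx - b2[0] * qy) / det)
            ns.append((-b1[1] * qx + b1[0] * qy) / det)
    pts = []
    for m in range(math.floor(min(ms)), math.ceil(max(ms)) + 1):
        for n in range(math.floor(min(ns)), math.ceil(max(ns)) + 1):
            y1 = beta[0] + m * b1[0] + n * b2[0]
            y2 = beta[1] + m * b1[1] + n * b2[1]
            if 0.0 < y1 <= s_max and -l / 2 < y2 < l / 2:
                pts.append(y1)
    return sorted(pts)
```

For \(x=0\), \(y=1\), \(\theta =0\), \(\varvec{\alpha }=\varvec{0}\) the lattice is \({\mathbb Z}^2\), and the cut-and-project set with \(l=1\) consists of the positive integers.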

Let \((x,y,\theta )\) be randomly distributed according to \(\mu _F\), \(\varvec{\alpha }_1,\ldots ,\varvec{\alpha }_k\) be independent and uniformly distributed in \({\mathbb T}^2\), and \({\varvec{J}}\in {\mathcal U}\) distributed according to \(\lambda \). Let \(L_j({\varvec{J}})\) be as in (2.17). We will prove in Sect. 8 that the elements of the random set

$$\begin{aligned} \bigcup _{j=1}^k {\mathcal P}\big ({\mathcal L}(x,y,\theta ,\varvec{\alpha }_j), L_j({\varvec{J}})\big ), \end{aligned}$$
(3.7)

ordered by size, form precisely the sequence of random variables \((\tau _i)_{i=1}^\infty \) in Theorem 1. This sequence evidently only depends on the choice of target weight function \(\varvec{L}\) and the choice of \({\mathcal U}\), \(\lambda \). Similarly, replacing \(L_j({\varvec{J}})\) by \(\widetilde{L}_j({\varvec{J}})\) (cf. (2.18)) in (3.7), we obtain the sequence \((\widetilde{\tau }_i)_{i=1}^\infty \). Note that if \(\ell _1=\ldots =\ell _k\) and \(\varvec{u}_1=\ldots =\varvec{u}_k\), then \(\widetilde{L}_j({\varvec{J}})=1/k\), and thus \((\widetilde{\tau }_i)_{i=1}^\infty \) is indeed universal as we stated below Theorem 1.

Let us describe in some more detail the distribution of the first entry times \(\tau _1\) and \(\widetilde{\tau }_1\). In the case of k holes, we have

$$\begin{aligned} {\mathbb P}( \tau _1 > s ) = \int _{\mathcal U}F_k(s;\varvec{L}({\varvec{J}})) \, \lambda (d{\varvec{J}}) , \end{aligned}$$
(3.8)
$$\begin{aligned} {\mathbb P}( \widetilde{\tau }_1 > s ) = \int _{\mathcal U}F_k(s;{{\widetilde{\varvec{L}}}}({\varvec{J}})) \, \lambda (d{\varvec{J}}), \end{aligned}$$
(3.9)

with the universal function

$$\begin{aligned} F_k(s,\varvec{l}) = {\mathbb P}\big ( {\mathcal P}({\mathcal L}(x,y,\theta ,\varvec{\alpha }_j),l_j)\cap (0,s]=\emptyset \quad \text {for all }j=1,\ldots ,k \big ) , \end{aligned}$$
(3.10)

where \((x,y,\theta )\) is taken to be randomly distributed according to \(\mu _F\) and \(\varvec{\alpha }_1,\ldots ,\varvec{\alpha }_k\) independent and uniformly distributed in \({\mathbb T}^2\), and \(\varvec{l}=(l_1,\ldots ,l_k)\). It follows from the invariance properties of the underlying Haar measure (this will become clear in Sect. 6) that for any \(h>0\)

$$\begin{aligned} F_k\bigg (\frac{s}{h},h\varvec{l}\bigg )= F_k(s,\varvec{l}) . \end{aligned}$$
(3.11)

In the case of one hole (\(k=1\)), the function \(F_1(s):=F_1(s,1)\) appears as a limit in various other problems; notably it corresponds to the distribution of free path lengths in the periodic Lorentz gas in the small scatterer limit [4, 34]. It is explicitly given by

$$\begin{aligned} F_1(s)={\left\{ \begin{array}{ll} {\displaystyle \frac{3}{\pi ^2} s^2 -s+1} &{} (0 \le s \le 1); \\ {\displaystyle \frac{12}{\pi ^2} \bigl ( \Phi (s) - \Phi (s/2)\bigr ) + \frac{6}{\pi ^2} s \log s + \Bigl (\frac{6+6 \log 2}{\pi ^2} - 2\Bigr )s + \frac{18 \log 2}{\pi ^2}} &{} (1 \le s), \end{array}\right. } \end{aligned}$$
(3.12)

where \(\Phi (s)\) for \(s>0\) is the function defined by \(\Phi ''(s) = (1-s^{-1})^2 \log |1-s^{-1}|\) and \(\Phi (1)=\Phi '(1)=0\). In particular \(F_1(s)\) has a heavy tail: One has

$$\begin{aligned} F_1(s)=\frac{2}{\pi ^2s}+O\Bigl (\frac{1}{s^2}\Bigr ) \qquad \text {as }\, s\rightarrow \infty . \end{aligned}$$
(3.13)

The formula (3.12) was derived in [46, Sec. 8]; cf. also [4, Theorem 1] and [14]. We are not aware of explicit formulas for the multiple-hole case \(k>1\). In this case we evaluate the right hand side of (3.10) numerically using a Monte Carlo algorithm. That is, we repeatedly generate a random tuple \((x,y,\theta ,\varvec{\alpha }_1,\ldots ,\varvec{\alpha }_k)\) as described above, and then determine the smallest \(s>0\) such that for some \(j\in \{1,\ldots ,k\}\) there exists a lattice point \((s,y_2)\in {\mathcal L}(x,y,\theta ,\varvec{\alpha }_j)\) in the strip \(-l_j/2<y_2<l_j/2\). In more detail, for given j, in order to determine the left-most point in the intersection of \({\mathcal L}(x,y,\theta ,\varvec{\alpha }_j)\) and the strip \(\mathbb {R}_{>0}\times (-l_j/2,l_j/2)\), one may proceed as follows. Write \({\mathcal L}(x,y,\theta ,\varvec{\alpha }_j)=\varvec{\beta }+\mathbb {Z}\varvec{b}_1+\mathbb {Z}\varvec{b}_2\) with \(\varvec{b}_1,\varvec{b}_2\) as in (3.4) and \(\varvec{\beta }\in \mathbb {R}^2\). After possibly interchanging \(\varvec{b}_1\) and \(\varvec{b}_2\), and then possibly negating \(\varvec{b}_1\), we may assume that the line \(\mathbb {R}\varvec{b}_2\) does not coincide with the x-axis and that the half plane \(\mathbb {R}_{>0}\varvec{b}_1+\mathbb {R}\varvec{b}_2\) intersects the x-axis in the interval \((0,+\infty )\). Now determine the smallest integer \(m_0\) for which the line \(\varvec{\beta }+m_0\varvec{b}_1+\mathbb {R}\varvec{b}_2\) intersects the strip \(\mathbb {R}_{>0}\times (-l_j/2,l_j/2)\), and then successively for \(m=m_0,m_0+1,m_0+2,\ldots \), check whether there are one or more integers n for which \(\varvec{\beta }+m\varvec{b}_1+n\varvec{b}_2\) lies in the strip.
Note that once this happens for the first time, say for \((s',y')=\varvec{\beta }+m_1\varvec{b}_1+n\varvec{b}_2\), we only need to investigate at most finitely many further m-values \(m=m_1+1,m_1+2,\ldots \), namely those for which the line \(\varvec{\beta }+m\varvec{b}_1+\mathbb {R}\varvec{b}_2\) intersects the box \((0,s')\times (-l_j/2,l_j/2)\).
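
A simplified variant of this procedure can be sketched as follows (our own illustration: instead of the line-by-line scan in m just described, it brute-forces an integer bounding box obtained from the corners of the strip; correctness, not speed, is the point here). The value \(F_1(1)=3/\pi ^2\approx 0.304\) from the first branch of (3.12) serves as a test:

```python
import math, random

def first_point(x, y, theta, alpha, l, s_max):
    """Left-most point y1 in (0, s_max] of the cut-and-project set (3.6) for
    the affine lattice L(x, y, theta, alpha), or None if there is none."""
    c, s = math.cos(theta), math.sin(theta)
    k = lambda v: (c * v[0] + s * v[1], -s * v[0] + c * v[1])
    r = y ** -0.5
    b1 = tuple(r * w for w in k((y, x)))               # basis (3.4)
    b2 = tuple(r * w for w in k((0.0, 1.0)))
    beta = (alpha[0] * b1[0] + alpha[1] * b2[0],
            alpha[0] * b1[1] + alpha[1] * b2[1])
    det = b1[0] * b2[1] - b1[1] * b2[0]                # = 1 (unimodular)
    ms, ns = [], []                                    # bounds for (m, n)
    for px in (0.0, s_max):
        for py in (-l / 2, l / 2):
            qx, qy = px - beta[0], py - beta[1]
            ms.append((b2[1] * qx - b2[0] * qy) / det)
            ns.append((-b1[1] * qx + b1[0] * qy) / det)
    best = None
    for m in range(math.floor(min(ms)), math.ceil(max(ms)) + 1):
        for n in range(math.floor(min(ns)), math.ceil(max(ns)) + 1):
            y1 = beta[0] + m * b1[0] + n * b2[0]
            y2 = beta[1] + m * b1[1] + n * b2[1]
            if 0.0 < y1 <= s_max and -l / 2 < y2 < l / 2:
                if best is None or y1 < best:
                    best = y1
    return best

def estimate_F(s, lengths, trials, seed=1):
    """Monte Carlo estimate of F_k(s, l) in (3.10): sample (x, y, theta)
    from mu_F via (3.1) and independent uniform shifts alpha_j, and count
    how often no strip contains a lattice point with y1 in (0, s]."""
    rng = random.Random(seed)
    empty_count = 0
    for _ in range(trials):
        a, b, c = rng.random(), rng.random(), rng.random()
        x = math.sin(math.pi / 3 * (a - 0.5))
        y = math.cos(math.pi / 3 * (a - 0.5)) / (1 - b)
        th = math.pi * c
        if all(first_point(x, y, th, (rng.random(), rng.random()), l, s) is None
               for l in lengths):
            empty_count += 1
    return empty_count / trials
```

With \(2\times 10^4\) samples the standard error of the estimate at \(s=1\), \(k=1\) is about 0.003.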

Our calculation for \(F_2(s):=F_2(s,(\frac{1}{2},\frac{1}{2}))\) used \(10^8\) random lattices. The result is presented in Fig. 1. We tested the algorithm by using it to calculate \(F_1(s)\) and comparing the resulting graph with the explicit formula (3.12).

4 Central Force Fields

The dynamics of a point particle subject to a central force field in \({\mathbb R}^3\) takes place in a plane perpendicular to its angular momentum, which is a constant of motion. We choose a coordinate system in which the angular momentum reads \((0,0,L)\), \(L\ge 0\). The equations of motion for a particle of unit mass read in polar coordinates

$$\begin{aligned} \dot{\phi }= \frac{L}{r^2}, \qquad \dot{r} =\pm \sqrt{2[E-V(r)]-\frac{L^2}{r^2}}, \end{aligned}$$
(4.1)

where V(r) is the potential as a function of the distance to the origin, and E the total energy. It will be convenient to set \({\varvec{J}}=(E,L)\), although this choice does not represent the canonical action variables in this problem. The equations of motion separate, and the dynamics in r is described by a one-dimensional Hamiltonian with effective potential \(V(r)+\frac{L^2}{2r^2}\). For a given initial \(r_0=r_0({\varvec{J}})\), the dynamics takes place between the periastron \(r_-=r_-({\varvec{J}})\le r_0({\varvec{J}})\) and the apastron \(r_+=r_+({\varvec{J}})\ge r_0({\varvec{J}})\), the minimal/maximal distance to the origin of the particle trajectory with energy E and angular momentum L. We will consider cases when the motion is bounded, i.e., \(0<r_-\le r_+<\infty \). Then these values are the turning points of the particle motion, and thus solutions to \(V(r)+\frac{L^2}{2r^2}=E\). The solution of the equations of motion \((r(t),\phi (t))\) with \((r(0),\phi (0))=(r_0,\phi _0)\) and initial radial velocity \(\dot{r}(0)\ge 0\) is either circular with \(\dot{r}(t)=0\) for all t, or otherwise implicitly given by

$$\begin{aligned} t = {\left\{ \begin{array}{ll} \displaystyle \int _{r_0}^{r(t)} \frac{dr'}{\sqrt{2[E-V(r')]-\frac{L^2}{{r'}^2}}} + n T &{} (\dot{r}(t) \ge 0) \\ \displaystyle \bigg (\int _{r_0}^{r_+} + \int _{r(t)}^{r_+}\bigg ) \frac{dr'}{\sqrt{2[E-V(r')]-\frac{L^2}{{r'}^2}}} + n T &{} (\dot{r}(t) \le 0) , \end{array}\right. } \end{aligned}$$
(4.2)

where n is an arbitrary integer. The period is

$$\begin{aligned} T =T({\varvec{J}}) = 2 \int _{r_-({\varvec{J}})}^{r_+({\varvec{J}})} \frac{dr}{\sqrt{2[E-V(r)]-\frac{L^2}{r^2}}} . \end{aligned}$$
(4.3)

Also

$$\begin{aligned} \phi (t) = {\left\{ \begin{array}{ll} \displaystyle \phi _0+\int _{r_0}^{r(t)} \frac{\frac{L}{{r'}^2}\, dr'}{\sqrt{2[E-V(r')]-\frac{L^2}{{r'}^2}}} + n \alpha &{} (\dot{r}(t) \ge 0) \\ \displaystyle \phi _0+\bigg (\int _{r_0}^{r_+} + \int _{r(t)}^{r_+}\bigg ) \frac{\frac{L}{{r'}^2}\, dr'}{\sqrt{2[E-V(r')]-\frac{L^2}{{r'}^2}}} + n \alpha &{} (\dot{r}(t) \le 0) , \end{array}\right. } \end{aligned}$$
(4.4)

with rotation angle

$$\begin{aligned} \alpha =\alpha ({\varvec{J}}) = 2 \int _{r_-({\varvec{J}})}^{r_+({\varvec{J}})} \frac{\frac{L}{r^2}\, dr}{\sqrt{2[E-V(r)]-\frac{L^2}{r^2}}}. \end{aligned}$$
(4.5)
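For readers who wish to evaluate (4.3) and (4.5) numerically, the following Python sketch (our own illustration; grid bounds and tolerances are ad hoc) locates the turning points and removes the inverse-square-root singularities at \(r_\pm\) by the substitution \(r=r_-+(r_+-r_-)\sin ^2 u\), under which both integrands become smooth:

```python
import math

def turning_points(W, lo=0.05, hi=10.0, n=4000):
    """Locate r_- < r_+ as the sign changes of the radicand
    W(r) = 2[E - V(r)] - L^2/r^2 on a grid, refined by bisection."""
    rs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    roots = []
    for a, b in zip(rs, rs[1:]):
        if W(a) * W(b) < 0:
            for _ in range(80):
                m = 0.5 * (a + b)
                if W(a) * W(m) <= 0:
                    b = m
                else:
                    a = m
            roots.append(0.5 * (a + b))
    return roots[0], roots[1]

def period_and_rotation(V, E, L, n=20000):
    """T (4.3) and alpha (4.5); r = r_- + (r_+ - r_-) sin^2(u) removes
    the inverse-square-root singularities at both turning points."""
    W = lambda r: 2.0 * (E - V(r)) - (L / r) ** 2
    rm, rp = turning_points(W)
    T = alpha = 0.0
    h = (math.pi / 2) / n
    for i in range(n):                        # midpoint rule, smooth integrand
        u = (i + 0.5) * h
        r = rm + (rp - rm) * math.sin(u) ** 2
        g = W(r) / ((r - rm) * (rp - r))      # finite at both endpoints
        w = 2.0 * h / math.sqrt(g)            # panel contribution to dr/sqrt(W)
        T += 2.0 * w
        alpha += 2.0 * w * L / r ** 2
    return T, alpha
```

As a consistency check, the Kepler potential \(V(r)=-1/r\) (with \(E=-1/2\)) gives \(T\approx 2\pi\) and \(\alpha \approx 2\pi\), and \(V(r)=r^2/2\) (with \(E=1\)) gives \(T\approx \pi\), \(\alpha \approx \pi\), as every bounded orbit closes in these cases.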

The dynamics is best described by first considering the return map to the cross section defined by restricting the radial variable to \(r=r_0\) with non-negative radial velocity \(\dot{r} \ge 0\); here \(r_0=r_0({\varvec{J}})\) is permitted to depend on \({\varvec{J}}\). This cross section is thus simply parametrized by \(\phi \in {\mathbb R}/2\pi {\mathbb Z}\). The corresponding return map is

$$\begin{aligned} \phi \mapsto \phi +\alpha ({\varvec{J}}) \bmod 2\pi , \end{aligned}$$
(4.6)

with rotation angle \(\alpha ({\varvec{J}})\) as in (4.5), and return time \(T({\varvec{J}})\) as in (4.3). We turn the map (4.6) into a flow of the form (2.1) by considering its suspension flow

$$\begin{aligned} \varphi ^t : {\mathbb T}^2\times {\mathcal U}\rightarrow {\mathbb T}^2\times {\mathcal U},\quad (\varvec{\theta },{\varvec{J}}) \mapsto \bigg (\varvec{\theta }+ \frac{t}{T({\varvec{J}})} \begin{pmatrix} 1 \\ \frac{\alpha ({\varvec{J}})}{2\pi }\end{pmatrix},{\varvec{J}}\bigg ) . \end{aligned}$$
(4.7)

A comparison with (2.1) yields

$$\begin{aligned} \varvec{f}({\varvec{J}}) = T({\varvec{J}})^{-1} \begin{pmatrix} 1 \\ \frac{\alpha ({\varvec{J}})}{2\pi }\end{pmatrix} . \end{aligned}$$
(4.8)
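On a fixed leaf the entry time into an angular hole at the section (as in case (I) below) is simply the first visit of the circle rotation (4.6) to the hole, so it can be computed by direct iteration. A minimal sketch (ours; parameter values purely illustrative):

```python
import math

def first_entry_time(phi0, alpha, T, rho, n_max=10**6):
    """Smallest t = n*T, n >= 1, such that phi0 + n*alpha (mod 2pi) lies
    in the hole (-pi*rho, pi*rho); returns None if no hit within n_max
    iterations of the rotation (4.6)."""
    two_pi = 2 * math.pi
    phi = phi0 % two_pi
    for n in range(1, n_max + 1):
        phi = (phi + alpha) % two_pi
        if phi < math.pi * rho or phi > two_pi - math.pi * rho:
            return n * T
    return None
```

For a rational rotation angle the orbit is periodic and may miss the hole entirely, which is why the iteration carries a cutoff `n_max`.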

As to the hypotheses of Theorem 1, we see that a Borel probability measure \(\lambda \) on \({\mathcal U}\) is \(\varvec{f}\)-regular if the push-forward of \(\lambda \) by the map

$$\begin{aligned} {\mathcal U}\rightarrow {\mathbb R}, \qquad {\varvec{J}}\mapsto \alpha ({\varvec{J}}), \end{aligned}$$
(4.9)

is absolutely continuous with respect to Lebesgue measure on \({\mathbb R}\). Note that although this condition can hold for most potentials V, it fails for the Coulomb potential and the isotropic harmonic oscillator, where every orbit is closed.
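For orientation, both exceptional cases can be read off from (4.5) by a classical computation: for the attractive Coulomb potential every bounded orbit is a closed ellipse with a focus at the origin, while for the isotropic harmonic oscillator it is a centered ellipse whose radial period is half the orbital period, so that

$$\begin{aligned} V(r)=-\frac{1}{r}:\quad \alpha ({\varvec{J}})\equiv 2\pi , \qquad V(r)=\tfrac{1}{2}r^2:\quad \alpha ({\varvec{J}})\equiv \pi . \end{aligned}$$

In either case the push-forward of \(\lambda \) under (4.9) is a point mass, hence not absolutely continuous.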

A natural choice of target set in polar coordinates is

$$\begin{aligned} \{ (r,\phi ) \mid r=r_0({\varvec{J}}),\; -\pi \rho< \phi < \pi \rho \} , \end{aligned}$$
(4.10)

with no restriction on the sign of the radial velocity \(\dot{r}\). We distinguish two cases:

  1. (I)

    If \(r_0({\varvec{J}})=r_+({\varvec{J}})\) or \(r_0({\varvec{J}})=r_-({\varvec{J}})\), the target set is of the form (2.3), where

    $$\begin{aligned} {\mathcal D}_\rho ^{(1)} = {\mathcal D}\bigg (\varvec{u}_1,\varvec{\phi }_1,\rho \bigg ) , \quad \varvec{u}_1=\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \varvec{\phi }_1=\begin{pmatrix} 0 \\ 0 \end{pmatrix}. \end{aligned}$$
    (4.11)

    In this simple setting \(\varvec{\phi }_1=\varvec{0}\) is \((\varvec{\theta },\lambda )\)-generic if (recall (2.12))

    $$\begin{aligned} \lambda \bigg (\bigg \{ {\varvec{J}}\in {\mathcal U}\, : \,\varvec{\theta }({\varvec{J}}) \in {\mathbb R}\begin{pmatrix} 1 \\ \frac{\alpha ({\varvec{J}})}{2\pi }\end{pmatrix} + {\mathbb Q}^2 \bigg \}\bigg ) = 0. \end{aligned}$$
    (4.12)
  2. (II)

    If \(r_-({\varvec{J}})<r_0({\varvec{J}})< r_+({\varvec{J}})\), then the particle attains the value \(r=r_0({\varvec{J}})\) with radial velocity \(\dot{r}<0\) before returning to the section \((r_0,\dot{r}>0)\). The traversed angle is

    $$\begin{aligned} \alpha _*({\varvec{J}}) = 2 \int _{r_0({\varvec{J}})}^{r_+({\varvec{J}})} \frac{\frac{L}{r^2}\, dr}{\sqrt{2[E-V(r)]-\frac{L^2}{r^2}}}, \end{aligned}$$
    (4.13)

    and the corresponding travel time is

    $$\begin{aligned} T_*({\varvec{J}}) = 2 \int _{r_0({\varvec{J}})}^{r_+({\varvec{J}})} \frac{dr}{\sqrt{2[E-V(r)]-\frac{L^2}{r^2}}} . \end{aligned}$$
    (4.14)

    The target set (4.10) therefore has the following angle-action representation; recall (2.3):

    $$\begin{aligned} {\mathcal D}_\rho ^{(2)} = \bigcup _{j=1}^2 {\mathcal D}(\varvec{u}_j,\varvec{\phi }_j,\rho ), \end{aligned}$$
    (4.15)

    with identical orientation

    $$\begin{aligned} \varvec{u}_1({\varvec{J}})=\varvec{u}_2({\varvec{J}})=\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \end{aligned}$$
    (4.16)

    located at

    $$\begin{aligned} \varvec{\phi }_1({\varvec{J}})= \begin{pmatrix} 0 \\ 0 \end{pmatrix},\qquad \varvec{\phi }_2({\varvec{J}})= \frac{T_*({\varvec{J}})}{T({\varvec{J}})} \begin{pmatrix} 1 \\ \frac{\alpha ({\varvec{J}})}{2\pi }\end{pmatrix} - \begin{pmatrix} 0\\ \frac{\alpha _*({\varvec{J}})}{2\pi } \end{pmatrix} . \end{aligned}$$
    (4.17)

    Here the target location is \((\varvec{\theta },\lambda )\)-generic if for all \((m_1',m_2')\in {\mathbb Z}^2\setminus \{\varvec{0}\}\)

    $$\begin{aligned} \lambda \bigg (\bigg \{ {\varvec{J}}\in {\mathcal U}\, : \,m_1' \varvec{\theta }({\varvec{J}}) + m_2' \, \begin{pmatrix}0\\ \frac{\alpha _*({\varvec{J}})}{2\pi } \end{pmatrix} \in {\mathbb R}\begin{pmatrix} 1 \\ \frac{\alpha ({\varvec{J}})}{2\pi }\end{pmatrix} + {\mathbb Q}^2 \bigg \}\bigg ) = 0 \end{aligned}$$
    (4.18)

    (indeed, set \((m_1,m_2)=(m_2'-m_1',-m_2')\) in (2.12)). For our numerical simulations of the first entry time, the relevant parameters used were as follows. The potential is

    $$\begin{aligned} V(r)={\left\{ \begin{array}{ll} \frac{r^\gamma -1}{\gamma }&{}(\gamma \ne 0)\\ \ln (r)&{} (\gamma =0), \end{array}\right. } \end{aligned}$$
    (4.19)

    where \(\gamma \in {\mathbb R}\), \(\gamma >-2\). The particle mass is \(m=1\), the initial position in polar coordinates is \((r_0,\phi _0)=(1,-2)\), and the initial speed is 0.3, with directions uniform in \([0.5,1]\subset [0,2\pi ]\) (the sample size is \(10^8\)); the target is the angular interval \([-\rho /2,\rho /2]\) located at radius \(r_0=1\). Figure 2 displays the results of computations with several values of \(\rho \) and fixed \(\gamma =1\), and Fig. 3 the corresponding results for fixed \(\rho =10^{-4}\) and various values of \(\gamma \).
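The map from such initial data to \({\varvec{J}}=(E,L)\) can be sketched as follows; the launch angle `chi`, measured from the outward radial direction, is our own hypothetical parametrization of the "direction", and the default values merely echo the setup above:

```python
import math

def EL_from_initial_data(gamma, r0=1.0, speed=0.3, chi=0.7):
    """J = (E, L) for the potential (4.19) at unit mass:
    E = v^2/2 + V(r0), L = r0 * (tangential velocity), where chi is a
    hypothetical launch angle from the outward radial direction."""
    V = math.log(r0) if gamma == 0 else (r0 ** gamma - 1) / gamma
    E = 0.5 * speed ** 2 + V
    L = r0 * speed * math.sin(chi)
    return E, L
```

Since \(V(1)=0\) for every \(\gamma\) in (4.19), all simulated trajectories started at \(r_0=1\) share the energy \(E=v^2/2\).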

Fig. 2

Numerical simulations for the entry time distribution \({\mathbb P}( \widetilde{\tau }_1 > s )\) for the potential \(V(r)=r-1\), with different hole sizes \(\rho \). We consider particles of mass \(m=1\) with initial position in polar coordinates \((r_0,\phi _0)=(1,-2)\), initial velocity \(v=0.3\), initial angles uniform in [0.5, 1] with a sample size \(10^8\). The target is located at radius \(r_0\) and angle interval \([-\rho /2,\rho /2]\). The deviation from the predicted distribution \(F_2(s)\) is shown in the inset

5 Integrable Billiards

The dynamics of a point particle in a billiard is integrable if there is a coordinate system in which the Hamilton-Jacobi equation separates. All known examples in two dimensions involve either very particular polygonal billiards, whose dynamics unfolds to a linear flow on a torus, or billiards whose boundaries are aligned with elliptical coordinate lines (or the degenerate cases of circular or parabolic coordinates). While many configurations can be constructed from arcs of confocal ellipses and hyperbolas, the most natural and most studied is the ellipse billiard itself, of which the circle is a special case. The scaling of escape from a circular billiard with a single small hole, as a universal function of the product of hole size and time, was observed in Fig. 3 of [9]. We will here consider billiards in general ellipses, where the target set is a sub-interval of the boundary. Action-angle coordinates for the billiard flow have been described in the literature, for example in [45]. For our purposes it will be simpler to formulate the dynamics in terms of the billiard map, which is the return map of the billiard flow to the boundary; see [48] for a detailed discussion. The billiard domain is confined by the ellipse \(\{ (b\cos \phi ,a\sin \phi ) \mid \phi \in [0,2\pi ) \}\) with semi-axes \(a\ge b\), eccentricity \(e=\sqrt{1-b^2/a^2}\) and foci \((0,\pm ae)\). The billiard dynamics conserves the kinetic energy \(E=\frac{1}{2}\Vert \varvec{\xi }\Vert ^2\) (where \(\varvec{\xi }\) denotes the particle’s momentum) and the product \(L_+L_-\) of angular momenta \(L_\pm =x_1\xi _2-(x_2\mp ae)\xi _1\) about the foci. Note that a change in energy \(E>0\) only affects the speed of the billiard particle but not its trajectory, and we will fix \(E=\frac{1}{2}\) in the following without loss of generality.

Fig. 3

Numerical simulations for the entry time distribution \({\mathbb P}( \widetilde{\tau }_1 > s )\) for the potential \(V(r)=\frac{r^\gamma -1}{\gamma }\) (\(\gamma \ne 0\)) and \(V(r)=\log r\) (\(\gamma =0\)). The hole size is \(\rho =10^{-4}\), and all other parameter values as in Fig. 2. The cases \(\gamma =-1,2\) correspond to the Coulomb potential and isotropic harmonic oscillator, for which the assumptions of Theorem 1 are not satisfied, and indeed the hitting probability is zero for our choice of initial data. In the remaining cases the deviation from the predicted distribution \(F_2(s)\) is shown in the inset

Each segment of the trajectory is tangent to a caustic given by a confocal conic of eccentricity

$$\begin{aligned} \varepsilon =\sqrt{\frac{a^2e^2}{a^2e^2+L_+L_-}}\in (e,\infty ). \end{aligned}$$
(5.1)

For \(\varepsilon <1\) we have elliptic caustics, where the orbit rotates around the foci. For \(\varepsilon =1\) we have the separatrix, where the orbit passes through the foci; this has zero probability with respect to an absolutely continuous distribution of initial conditions. For \(\varepsilon >1\) we have hyperbolic caustics, and the orbit passes between the foci. Solving Eq. (5.1) for \(\varvec{\xi }\) gives two solutions, which for \(\varepsilon <1\) correspond to the direction of rotation of the orbit, and for \(\varepsilon >1\) are both contained in the closure of a single aperiodic orbit.
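The classification of caustics can be read off directly from (5.1). A small helper of our own (with illustrative inputs) computes \(\varepsilon\) from the conserved product \(L_+L_-\):

```python
import math

def caustic_eccentricity(a, b, LpLm):
    """epsilon from (5.1); bounded motion tangent to a confocal conic
    requires -a^2 e^2 < L+L- < b^2. Elliptic caustic for L+L- > 0,
    hyperbolic for L+L- < 0, separatrix at L+L- = 0."""
    e2 = 1 - (b / a) ** 2                   # squared boundary eccentricity
    eps = math.sqrt(a * a * e2 / (a * a * e2 + LpLm))
    kind = "elliptic" if eps < 1 else ("separatrix" if eps == 1 else "hyperbolic")
    return eps, kind
```

For example, with \(a=10\), \(b=8\) (so \(e=0.6\), \(a^2e^2=36\)) the value \(L_+L_-=10\) gives an elliptic caustic and \(L_+L_-=-10\) a hyperbolic one.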

Following [48] in our notation, we parametrize the billiard boundary by the new parameter \(\theta \in {\mathbb T}\) defined by

$$\begin{aligned} \theta = {\left\{ \begin{array}{ll} \frac{F(\phi ,\varepsilon )}{F(2\pi ,\varepsilon )} \mod 1 &{} (\varepsilon <1)\\ \frac{F(\arcsin (\varepsilon \sin \phi ),\varepsilon ^{-1})}{F(2\pi ,\varepsilon ^{-1})} \mod 1 &{} (\varepsilon >1),\end{array}\right. } \end{aligned}$$
(5.2)

where F is the incomplete elliptic integral of the first kind [35]

$$\begin{aligned} F(\phi ,k)=\int _0^\phi \frac{d t}{\sqrt{1-k^2\sin ^2t}}. \end{aligned}$$
(5.3)

The choice of branch for the \(\arcsin \) (for \(\varepsilon >1\)) depends on the choice of solution for \(\varvec{\xi }\) in (5.1). The billiard map reads in these new coordinates

$$\begin{aligned} {\mathbb T}\rightarrow {\mathbb T},\qquad \theta \mapsto \theta +f(\varepsilon ) \mod 1 \end{aligned}$$
(5.4)

where

$$\begin{aligned} f(\varepsilon )={\left\{ \begin{array}{ll} \pm 2\frac{F\left( \arccos \sqrt{\frac{e^2(1-\varepsilon ^2)}{\varepsilon ^2(1-e^2)}}, \varepsilon \right) }{F(2\pi ,\varepsilon )}&{}(\varepsilon <1)\\ 2\frac{F\left( \arccos \sqrt{\frac{e^2(\varepsilon ^2-1)}{\varepsilon ^2-e^2}}, \varepsilon ^{-1}\right) }{F(2\pi ,\varepsilon ^{-1})}&{}(\varepsilon >1). \end{array}\right. } \end{aligned}$$
(5.5)

Here, the ± (for \(\varepsilon <1\)) again depends on the choice of solution for \(\varvec{\xi }\) in (5.1). The time between collisions with the boundary, averaged over the equilibrium measure associated with \(\varepsilon \), is given by

$$\begin{aligned} \bar{l}=\left\{ \begin{array}{cc} \frac{2b\sqrt{1-e^2/\varepsilon ^2}\Pi (e^2,\varepsilon )}{K(\varepsilon )}&{}(\varepsilon <1)\\ \frac{2b\sqrt{1-e^2/\varepsilon ^2}\Pi (e^2/\varepsilon ^2,\varepsilon ^{-1})}{K(\varepsilon ^{-1})} &{}(\varepsilon >1)\end{array}\right. \end{aligned}$$
(5.6)

where \(K(\varepsilon )=F(\frac{\pi }{2},\varepsilon )=\frac{1}{4}F(2\pi ,\varepsilon )\) and

$$\begin{aligned} \Pi (\alpha ^2,k)=\int _0^{\frac{\pi }{2}}\frac{dt}{(1-\alpha ^2\sin ^2t) \sqrt{1-k^2\sin ^2t}} \end{aligned}$$
(5.7)

are complete elliptic integrals of the first and third kind, respectively [35]. Even when \(f(\varepsilon )\) is rational, so that the orbit is periodic (a zero-measure set of initial conditions), the mean collision time is independent of the starting point and hence given by the above formula [13].
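The special functions entering (5.2)–(5.7) are standard and can be evaluated by elementary quadrature. The sketch below (ours, stdlib only; for the rotation number we take the elliptic-caustic branch \(\varepsilon <1\) of (5.5) with the "+" sign) is sufficient for experimentation:

```python
import math

def F(phi, k, n=20000):
    """Incomplete elliptic integral of the first kind (5.3), midpoint rule."""
    h = phi / n
    return sum(h / math.sqrt(1 - (k * math.sin((i + 0.5) * h)) ** 2)
               for i in range(n))

def Pi(alpha2, k, n=20000):
    """Complete elliptic integral of the third kind (5.7)."""
    h = (math.pi / 2) / n
    s = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        s += h / ((1 - alpha2 * math.sin(t) ** 2)
                  * math.sqrt(1 - (k * math.sin(t)) ** 2))
    return s

def rotation_number(e, eps):
    """f(eps) from (5.5), '+' sign, elliptic-caustic branch e < eps < 1."""
    x = math.acos(math.sqrt(e**2 * (1 - eps**2) / (eps**2 * (1 - e**2))))
    return 2 * F(x, eps) / F(2 * math.pi, eps)
```

Useful sanity checks are the identities \(F(2\pi ,k)=4K(k)\) and \(\Pi (0,k)=K(k)\) noted around (5.6)–(5.7).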

We consider a single target set in the billiard’s boundary given by the interval \(\phi _0-\frac{\rho }{2}<\phi <\phi _0+\frac{\rho }{2}\). If \(\varepsilon >1\), we assume the target intersects the region covered by the orbit, i.e., \(\varepsilon \sin \phi _0<1\). In this case a single target in \(\phi \) corresponds to two equal-sized targets in \(\theta \) located at \(\theta _0=\theta _0^{(1)}\) and \(\theta _0^{(2)}\) (which are functions of \(\phi _0\) and \(\varepsilon \)). If \(\varepsilon < 1\), a single target in \(\phi \) corresponds to a single target in \(\theta \).

For \(\phi =\phi _0+s\) with |s| small and \(\theta \) (respectively \(\theta _0\)) the value defined by (5.2) for \(\phi \) (respectively \(\phi _0\)),

$$\begin{aligned} \theta =\theta _0+s\left\{ \begin{array}{cc} \frac{1}{F(2\pi ,\varepsilon ) \sqrt{1-\varepsilon ^2\sin ^2\phi _0}}&{}(\varepsilon <1)\\ \frac{\varepsilon }{F(2\pi ,\varepsilon ^{-1})\sqrt{1-\varepsilon ^2 \sin ^2\phi _0}}&{}(\varepsilon >1)\end{array}\right\} +O(s^2). \end{aligned}$$
(5.8)

Up to a small error, which is negligible when \(\rho \rightarrow 0\), the target becomes the interval \(\theta _0-\frac{\rho \ell }{2}<\theta <\theta _0+\frac{\rho \ell }{2}\) where

$$\begin{aligned} \ell =\ell (\varepsilon )=\left\{ \begin{array}{cc}\frac{1}{F(2\pi ,\varepsilon )\sqrt{1-\varepsilon ^2\sin ^2\phi _0}}&{}(\varepsilon <1)\\ \frac{\varepsilon }{F(2\pi ,\varepsilon ^{-1})\sqrt{1-\varepsilon ^2\sin ^2\phi _0}}&{}(\varepsilon >1)\end{array}\right. . \end{aligned}$$
(5.9)

The circle is a special case, with \(e=0\) and hence \(\varepsilon =0\). The constant of motion is the angular momentum about the centre, \(L=x_1\xi _2-x_2\xi _1\). In this case

$$\begin{aligned} \theta =\frac{\phi }{2\pi },\quad f(0)=\pm \frac{1}{\pi }\arccos \frac{L}{a},\quad \ell =\frac{1}{2\pi },\quad \bar{l}=2\sqrt{a^2-L^2}, \end{aligned}$$
(5.10)

which is consistent with the above expressions for ellipses in the limit \(e\rightarrow 0\). For ellipses of small eccentricity, this approach gives a systematic expansion in powers of \(e^2\).

Finally, we have for the mean return time (2.9)

$$\begin{aligned} \overline{\sigma }^{(k)} (\varepsilon ) = {\left\{ \begin{array}{ll} \frac{\bar{l}}{\ell (\varepsilon )} &{} (\varepsilon <1,\, \text {i.e. }\, k=1) \\ \frac{\bar{l}}{2\ell (\varepsilon )} &{} (\varepsilon >1,\, \text {i.e. }\, k=2) . \end{array}\right. } \end{aligned}$$
(5.11)
Fig. 4

Numerical simulations confirming that the entry time distribution \({\mathbb P}( \widetilde{\tau }_1 > s )\) for an arbitrary ellipse scales to the expected universal functions for initial conditions with \(\varepsilon <1\) (upper panel) and \(\varepsilon >1\) (lower panel). The inset panels highlight the difference between the ellipse simulations and theoretical predictions \(F_1(s)\) resp. \(F_2(s)\). The choice of initial data and target set is specified at the end of Sect. 5

For our numerical simulations of the first entry time, the relevant parameters used were as follows: \(a=10\), \(b\in \{6,8,10\}\) corresponding to \(e\in \{0.8,0.6,0\}\) respectively. The target was \(2.8-5\times 10^{-5}<\phi <2.8+5\times 10^{-5}\), i.e. \(\phi _0=2.8\) and \(\rho =10^{-4}\). The entry time distribution \({\mathbb P}( \widetilde{\tau }_1 > s )\) for the actual billiard flow was sampled by taking a fixed initial point \(\varvec{x}=(3,7)\) inside the ellipse, and \(10^8\) initial directions \(\varvec{\xi }\in {\text {S}}^1\) chosen randomly with uniform angular distribution in the intervals [2, 2.6] or [3.8, 4.4] for the hyperbolic or elliptic caustics, respectively. All the numerical curves are shown in Fig. 4 and are identical within numerical errors too small to see on the plot; differences between the ellipse calculations and the theoretical predictions from Theorem 1 are shown in the inset panels.

6 Integrable Flows in Arbitrary Dimension

We now state the generalization of Theorem 1 to arbitrary dimension \(d\ge 2\). The basic setting is just as in Sect. 2, but with \({\mathbb T}^d\) in place of \({\mathbb T}^2\): Let \({\mathcal U}\) be a bounded open subset of \(\mathbb {R}^m\) for some \(m\in \mathbb {Z}^+\), and let \(\varvec{f}:{\mathcal U}\rightarrow \mathbb {R}^d\) be a smooth function. We consider the flow

$$\begin{aligned} \varphi ^t: {\mathbb T}^d\times {\mathcal U}\rightarrow {\mathbb T}^d\times {\mathcal U}, \quad (\varvec{\theta },{\varvec{J}}) \mapsto (\varvec{\theta }+ t\, \varvec{f}({\varvec{J}}),{\varvec{J}}) . \end{aligned}$$
(6.1)

Let \(\lambda \) be an absolutely continuous Borel probability measure on \({\mathcal U}\), and let \(\varvec{\theta }\) be a smooth map from \({\mathcal U}\) to \({\mathbb T}^d\). We will consider the random initial data \((\varvec{\theta }({\varvec{J}}),{\varvec{J}})\) in \({\mathbb T}^d\times {\mathcal U}\), where \({\varvec{J}}\) is a random point in \({\mathcal U}\) distributed according to \(\lambda \).

We next define the target sets. Let us fix a map \(\varvec{v}\mapsto R_{\varvec{v}}\), \({{\text {S}}_1^{d-1}}\rightarrow {\text {SO}}(d)\), such that \(R_{\varvec{v}} \varvec{v}=\varvec{e}_{1}\) for all \(\varvec{v}\in {{\text {S}}_1^{d-1}}\), and such that \(\varvec{v}\mapsto R_{\varvec{v}}\) is smooth throughout \({{\text {S}}_1^{d-1}}\setminus \{\varvec{v}_0\}\), where \(\varvec{v}_0\) is a fixed point in \({{\text {S}}_1^{d-1}}\). Fix \(k\in \mathbb {Z}^+\) and for each \(j=1,\ldots ,k\), fix smooth functions \(\varvec{u}_j:{\mathcal U}\rightarrow {{\text {S}}_1^{d-1}}\), \(\varvec{\phi }_j:{\mathcal U}\rightarrow {\mathbb T}^d\) and a bounded open subset \(\Omega _j\subset \mathbb {R}^{d-1}\times {\mathcal U}\). Set

$$\begin{aligned} {\mathcal D}_\rho ={\mathcal D}_\rho ^{(k)}:=\bigcup _{j=1}^k {\mathcal D}_\rho (\varvec{u}_j,\varvec{\phi }_j,\Omega _j) , \end{aligned}$$
(6.2)

where

$$\begin{aligned} {\mathcal D}_\rho (\varvec{u},\varvec{\phi },\Omega ) := \biggl \{ \biggl (\varvec{\phi }({\varvec{J}})+\rho R_{\varvec{u}({\varvec{J}})}^{-1}\left( \begin{matrix} 0 \\ \varvec{x} \end{matrix} \right) ,\,{\varvec{J}}\biggr )\in {\mathbb T}^d\times {\mathcal U}\,\bigg |\,(\varvec{x},{\varvec{J}})\in \Omega _j\biggr \}. \end{aligned}$$
(6.3)

Here we use the convention

$$\begin{aligned} \left( \begin{matrix} 0 \\ \varvec{x} \end{matrix} \right) :=\begin{pmatrix} 0 \\ x_1 \\ \vdots \\ x_{d-1}\end{pmatrix} \in \mathbb {R}^d \quad \text {when}\quad \varvec{x}=\begin{pmatrix} x_1 \\ \vdots \\ x_{d-1} \end{pmatrix} . \end{aligned}$$
(6.4)

Note that all points \(R_{\varvec{u}({\varvec{J}})}^{-1}\left( \begin{matrix} 0 \\ \varvec{x} \end{matrix} \right) \) lie in the linear subspace orthogonal to \(\varvec{u}({\varvec{J}})\) in \(\mathbb {R}^d\). We write \(\Omega _j({\varvec{J}}):=\{\varvec{x}\in \mathbb {R}^{d-1}\, : \,(\varvec{x},{\varvec{J}})\in \Omega _j\}\), and assume \(\Omega _j({\varvec{J}})\ne \emptyset \) for all \({\varvec{J}}\in {\mathcal U}\). As in Sect. 2 we also impose the condition \(\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})>0\) for all \(j\in \{1,\ldots ,k\}\) and \({\varvec{J}}\in {\mathcal U}\), which implies that each sub-target \({\mathcal D}_\rho (\varvec{u}_j,\varvec{\phi }_j,\Omega _j)\) is transversal to the flow direction. Note that the target set \({\mathcal D}_\rho ^{(k)}\) defined here generalizes the one introduced in Sect. 2. Indeed, for \(d=2\), and given smooth functions \(\varvec{u}_j:{\mathcal U}\rightarrow {\text {S}}^1\), \(\varvec{\phi }_j:{\mathcal U}\rightarrow {\mathbb T}^2\), and \(\ell _j:{\mathcal U}\rightarrow {\mathbb R}_{>0}\) (\(j=1,\ldots ,k\)), we recover the target set in (2.3) as \(\bigcup _{j=1}^k {\mathcal D}_\rho (\varvec{u}_j,\varvec{\phi }_j,\Omega _j)\) where \(\Omega _j=\{(s,{\varvec{J}})\, : \,{\varvec{J}}\in {\mathcal U},\,-\frac{1}{2}\ell _j ({\varvec{J}})<s<\frac{1}{2}\ell _j({\varvec{J}})\}\).
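One concrete realization of the map \(\varvec{v}\mapsto R_{\varvec{v}}\) for \(d=3\) (not the one fixed in the text, merely a possible choice) is the Rodrigues rotation about the axis \(\varvec{v}\times \varvec{e}_1\), which is smooth away from \(\varvec{v}_0=-\varvec{e}_1\):

```python
import math

def rotation_to_e1(v):
    """A concrete v -> R_v in SO(3) with R_v v = e1: Rodrigues rotation
    about a = (v x e1)/|v x e1| through theta = arccos(v . e1).
    Smooth away from v0 = -e1, where it is undefined (returns None)."""
    c = v[0]                                  # cos(theta) = v . e1
    ax = [0.0, v[2], -v[1]]                   # v x e1
    s = math.hypot(ax[1], ax[2])              # sin(theta) = |v x e1|
    if s < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]] if c > 0 else None
    a = [x / s for x in ax]
    # R = cos(theta) I + sin(theta) [a]_x + (1 - cos(theta)) a a^t
    R = [[(1 - c) * a[i] * a[j] + (c if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    R[0][1] -= s * a[2]; R[0][2] += s * a[1]
    R[1][0] += s * a[2]; R[1][2] -= s * a[0]
    R[2][0] -= s * a[1]; R[2][1] += s * a[0]
    return R
```

One checks directly that the returned matrix is orthogonal with determinant one and sends \(\varvec{v}\) to \(\varvec{e}_1\), exactly the properties required of \(R_{\varvec{v}}\) above.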

For any initial condition \((\varvec{\theta },{\varvec{J}})\), let \({\mathcal T}(\varvec{\theta },{\varvec{J}},{\mathcal D}^{(k)}_\rho )\) be the set of hitting times, as in (2.5). This is a discrete subset of \(\mathbb {R}_{>0}\), and we label its elements

$$\begin{aligned} 0<t_1(\varvec{\theta },{\varvec{J}},{\mathcal D}^{(k)}_\rho )<t_2(\varvec{\theta }, {\varvec{J}},{\mathcal D}^{(k)}_\rho )<\ldots . \end{aligned}$$
(6.5)

Again by Santaló's formula, for any fixed \({\varvec{J}}\in {\mathcal U}\) such that the components of \(\varvec{f}({\varvec{J}})\) are not rationally related, the first return time to \({\mathcal D}_\rho \) on the leaf \({\mathbb T}^d\times \{{\varvec{J}}\}\) satisfies the formula

$$\begin{aligned} \int _{{\mathcal D}_\rho }t_1(\varvec{\theta },{\varvec{J}},{\mathcal D}_\rho )\,d\nu _{\varvec{J}}(\varvec{\theta })=1, \end{aligned}$$
(6.6)

where \(\nu _{\varvec{J}}\) is the invariant measure on \({\mathcal D}_\rho \) obtained by disintegrating Lebesgue measure on \({\mathbb T}^d\times \{{\varvec{J}}\}\) with respect to the section \({\mathcal D}_\rho \) of the flow \(\varphi ^t\); explicitly

$$\begin{aligned} \int _{{\mathcal D}_\rho } g\,d\nu _{\varvec{J}}= \sum _{j=1}^k \bigl (\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})\bigr ) \int _{\rho \Omega _j({\varvec{J}})} g \biggl (\varvec{\phi }_j({\varvec{J}})+R_{\varvec{u}_j({\varvec{J}})}^{-1} \left( \begin{matrix} 0 \\ \varvec{x} \end{matrix} \right) ,\,{\varvec{J}}\biggr )\,d\varvec{x}, \qquad \forall g \in {\text {C}}({\mathcal D}_\rho ). \end{aligned}$$
(6.7)

It follows that the mean return time with respect to \(\nu _{\varvec{J}}\) equals

$$\begin{aligned} \frac{\overline{\sigma }^{(k)}({\varvec{J}})}{\rho ^{d-1}},\qquad \text {where } \overline{\sigma }^{(k)} ({\varvec{J}}):= \frac{1}{\sum _{j=1}^k{\text {Leb}}(\Omega _j({\varvec{J}}))\,\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})}, \end{aligned}$$
(6.8)

with \({\text {Leb}}\) denoting Lebesgue measure on \(\mathbb {R}^{d-1}\). If we also average over \({\varvec{J}}\) with respect to the measure \(\lambda \) (assuming that the pushforward of \(\lambda \) by \(\varvec{f}\) has no atoms at points with rationally related coordinates), the mean return time becomes

$$\begin{aligned} \frac{\overline{\sigma }^{(k)}_\lambda }{\rho ^{d-1}},\qquad \text {where } \overline{\sigma }^{(k)}_\lambda := \int _{\mathcal U}\overline{\sigma }^{(k)} ({\varvec{J}}) \lambda (d{\varvec{J}}). \end{aligned}$$
(6.9)

As in Sect. 2, for \({\varvec{J}}\) a random point in \({\mathcal U}\) distributed according to \(\lambda \), the hitting times \(t_n(\varvec{\theta }({\varvec{J}}),{\varvec{J}},{\mathcal D}_\rho ^{(k)})\) become random variables, which we denote by \(\tau _{n,\rho }^{(k)}\); also \(\overline{\sigma }^{(k)}({\varvec{J}})\) becomes a random variable, which we denote by \(\overline{\sigma }^{(k)}\). We say that \(\lambda \) is \(\varvec{f}\)-regular if the pushforward of \(\lambda \) under the map

$$\begin{aligned} {\mathcal U}\rightarrow {{\text {S}}_1^{d-1}}, \qquad {\varvec{J}}\mapsto \frac{\varvec{f}({\varvec{J}})}{\Vert \varvec{f}({\varvec{J}})\Vert }, \end{aligned}$$
(6.10)

is absolutely continuous with respect to Lebesgue measure on \({{\text {S}}_1^{d-1}}\), and we say the k-tuple of smooth functions \(\varvec{\phi }_1,\ldots ,\varvec{\phi }_k:{\mathcal U}\rightarrow {\mathbb T}^d\) is \((\varvec{\theta },\lambda )\)-generic if for all \(\varvec{m}=(m_1,\ldots ,m_k)\in {\mathbb Z}^k\setminus \{\varvec{0}\}\) we have

$$\begin{aligned} \lambda \bigg (\bigg \{ {\varvec{J}}\in {\mathcal U}: \sum _{j=1}^k m_j \, \big (\varvec{\phi }_j({\varvec{J}}) -\varvec{\theta }({\varvec{J}})\big ) \in {\mathbb R}\varvec{f}({\varvec{J}}) + {\mathbb Q}^d \bigg \}\bigg ) = 0. \end{aligned}$$
(6.11)

The following theorem generalizes Theorem 1 to arbitrary dimension \(d\ge 2\).

Theorem 2

Let \(\varvec{f}:{\mathcal U}\rightarrow {\mathbb R}^d\) and \(\varvec{\theta }:{\mathcal U}\rightarrow {\mathbb T}^d\) be smooth maps, \(\lambda \) an absolutely continuous Borel probability measure on \({\mathcal U}\), and for \(j=1,\ldots ,k\), let \(\varvec{u}_j:{\mathcal U}\rightarrow {{\text {S}}_1^{d-1}}\) and \(\varvec{\phi }_j:{\mathcal U}\rightarrow {\mathbb T}^d\) be smooth maps and \(\Omega _j\) a bounded open subset of \(\mathbb {R}^{d-1}\times {\mathcal U}\). For each \(j=1,\ldots ,k\), assume that

  1. (i)

    \(\lambda (\varvec{u}_j^{-1}(\{\varvec{v}_0\}))=0\) (where by assumption \(\varvec{v}_{0}\) is the point in \({{\text {S}}_1^{d-1}}\) such that \(\varvec{v}\mapsto R_{\varvec{v}}\) is smooth throughout \({{\text {S}}_1^{d-1}}\setminus \{\varvec{v}_0\}\)),

  2. (ii)

    \(\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})>0\) for all \({\varvec{J}}\in {\mathcal U}\),

  3. (iii)

    \(\Omega _j\) has boundary of measure zero with respect to \({\text {Leb}}\times \lambda \),

  4. (iv)

    \({\text {Leb}}(\Omega _j({\varvec{J}}))\) is a smooth and positive function of \({\varvec{J}}\in {\mathcal U}\).

Also assume that \(\lambda \) is \(\varvec{f}\)-regular and \((\varvec{\phi }_1,\ldots ,\varvec{\phi }_k)\) is \((\varvec{\theta },\lambda )\)-generic. Then there are sequences of random variables \((\tau _i)_{i=1}^\infty \) and \((\widetilde{\tau }_i)_{i=1}^\infty \) in \({\mathbb R}_{>0}\) such that in the limit \(\rho \rightarrow 0\), for every integer N,

$$\begin{aligned} \left( \frac{\rho ^{d-1} \tau _{1,\rho }^{(k)}}{\overline{\sigma }_\lambda ^{(k)}},\ldots ,\frac{\rho ^{d-1} \tau _{N,\rho }^{(k)}}{\overline{\sigma }_\lambda ^{(k)}} \right) \,\,\buildrel \mathrm{d}\over \longrightarrow \,\,(\tau _1,\ldots ,\tau _N), \end{aligned}$$
(6.12)

and

$$\begin{aligned} \left( \frac{\rho ^{d-1} \tau _{1,\rho }^{(k)}}{\overline{\sigma }^{(k)}},\ldots ,\frac{\rho ^{d-1} \tau _{N,\rho }^{(k)}}{\overline{\sigma }^{(k)}} \right) \,\,\buildrel \mathrm{d}\over \longrightarrow \,\,(\widetilde{\tau }_1,\ldots ,\widetilde{\tau }_N). \end{aligned}$$
(6.13)

We next give an explicit description of the limit processes \((\tau _i)_{i=1}^\infty \) and \((\widetilde{\tau }_i)_{i=1}^\infty \) appearing in Theorem 2. For a given affine Euclidean lattice \({\mathcal L}\) in \(\mathbb {R}^d\) and a subset \(\Omega \subset \mathbb {R}^{d-1}\), consider the cut-and-project set

$$\begin{aligned} {\mathcal P}({\mathcal L},\Omega ):=\biggl \{t>0\, : \,\left( \begin{matrix} t \\ \varvec{x} \end{matrix} \right) \in {\mathcal L},\, \varvec{x}\in -\Omega \biggr \}. \end{aligned}$$
(6.14)
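For a given affine lattice the set (6.14) is straightforward to enumerate by brute force. The following sketch is our own illustration (the example in the note below uses the identity lattice with a shift, not a Haar-random one):

```python
import itertools

def cut_and_project(g, alpha, omega_contains, t_max, n_range=10):
    """Elements t of P(g(Z^d + alpha), Omega) with 0 < t <= t_max,
    cf. (6.14): times t with (t, x) in the lattice and x in -Omega;
    omega_contains tests membership of a point of R^(d-1) in Omega."""
    d = len(alpha)
    hits = []
    for n in itertools.product(range(-n_range, n_range + 1), repeat=d):
        w = [ni + ai for ni, ai in zip(n, alpha)]
        p = [sum(g[i][j] * w[j] for j in range(d)) for i in range(d)]
        if 0.0 < p[0] <= t_max and omega_contains([-x for x in p[1:]]):
            hits.append(p[0])
    return sorted(hits)
```

For instance, with \(g\) the identity, \(\varvec{\alpha }=(1/2,1/4)\) and \(\Omega =(-0.3,0.3)\), the elements up to \(t=3\) are \(1/2, 3/2, 5/2\).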

Fix an arbitrary (measurable) fundamental domain \(F\subset {\text {SL}}(d,\mathbb {R})\) for \({\text {SL}}(d,\mathbb {R})/{\text {SL}}(d,\mathbb {Z})\), and let \(\mu _F\) be the (left and right) Haar measure on \({\text {SL}}(d,\mathbb {R})\) restricted to F, normalized to be a probability measure. If we choose \(g\in F\) at random according to \(\mu _F\), then \(g\mathbb {Z}^d\) represents a random Euclidean lattice in \(\mathbb {R}^d\) (of covolume one). Similarly, if \(\varvec{\alpha }\) is a random point in \({\mathbb T}^d\), uniformly distributed and independent from g, then the shifted lattice \(g(\mathbb {Z}^d+\varvec{\alpha })\) represents a random affine Euclidean lattice in \(\mathbb {R}^d\).

Let us define

$$\begin{aligned} \varvec{v}({\varvec{J}})=\frac{\varvec{f}({\varvec{J}})}{\Vert \varvec{f}({\varvec{J}})\Vert }\in {{\text {S}}_1^{d-1}}\qquad ({\varvec{J}}\in {\mathcal U}). \end{aligned}$$
(6.15)

For \(j\in \{1,\ldots ,k\}\) and \({\varvec{J}}\in {\mathcal U}\) we set \({\mathfrak R}_j({\varvec{J}})=R_{\varvec{v}({\varvec{J}})}R_{\varvec{u}_j({\varvec{J}})}^{-1}\in {\text {SO}}(d)\), and let \(\widetilde{{\mathfrak R}}_j({\varvec{J}})\) be the bottom right \((d-1)\times (d-1)\) submatrix of \({\mathfrak R}_j({\varvec{J}})\). In other words, \(\widetilde{{\mathfrak R}}_j({\varvec{J}})\) is the matrix of the linear map \(\varvec{x}\mapsto \biggl ({\mathfrak R}_j({\varvec{J}})\left( \begin{matrix} 0 \\ \varvec{x} \end{matrix} \right) \biggr )_{\!\!\perp }\) on \(\mathbb {R}^{d-1}\), where \(\varvec{u}_\perp :=(u_2,\ldots ,u_d)^\mathrm {t}\in \mathbb {R}^{d-1}\) for \(\varvec{u}=(u_1,\ldots ,u_d)^\mathrm {t}\in \mathbb {R}^d\). Noticing that \({\mathfrak R}_j({\varvec{J}})\) is an orientation preserving isometry of \(\mathbb {R}^d\) which takes \(\varvec{e}_1\) to \({\mathfrak R}_j({\varvec{J}})(\varvec{e}_1)\) and \(\left( \begin{matrix} 0 \\ \mathbb {R}^{d-1} \end{matrix} \right) \) onto \(({\mathfrak R}_j({\varvec{J}})(\varvec{e}_1))_\perp \), we find that

$$\begin{aligned} \det \widetilde{{\mathfrak R}}_j({\varvec{J}})=\varvec{e}_1\cdot {\mathfrak R}_j({\varvec{J}})(\varvec{e}_1) =\varvec{e}_1\cdot R_{\varvec{v}({\varvec{J}})}(\varvec{u}_j({\varvec{J}}))=\varvec{u}_j({\varvec{J}})\cdot \varvec{v}({\varvec{J}})>0. \end{aligned}$$
(6.16)

For \({\varvec{J}}\in {\mathcal U}\) we define

$$\begin{aligned} \overline{\Omega }_j({\varvec{J}}):=\bigl (\overline{\sigma }^{(k)}_\lambda \Vert \varvec{f}({\varvec{J}}) \Vert \bigr )^{1/(d-1)}\widetilde{{\mathfrak R}}_j({\varvec{J}})\Omega _j({\varvec{J}})\subset \mathbb {R}^{d-1} \end{aligned}$$
(6.17)

and

$$\begin{aligned} \widetilde{\Omega }_j({\varvec{J}}):=\bigl (\overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}}) \Vert \bigr )^{1/(d-1)}\widetilde{{\mathfrak R}}_j({\varvec{J}})\Omega _j({\varvec{J}})\subset \mathbb {R}^{d-1}. \end{aligned}$$
(6.18)

Geometrically, both \(\overline{\Omega }_j({\varvec{J}})\) and \(\widetilde{\Omega }_j({\varvec{J}})\) are thus obtained by orthogonally projecting the sub-target \(\{\varvec{x}\in {\mathbb T}^d\, : \,(\varvec{x},{\varvec{J}})\in {\mathcal D}_\rho (\varvec{u}_j,\varvec{\phi }_j,\Omega _j)\}\) onto the hyperplane orthogonal to the flow direction \(\varvec{f}({\varvec{J}})\) (which is identified with \(\mathbb {R}^{d-1}\) via the rotation \(R_{\varvec{v}({\varvec{J}})}\)), and then scaling the sets by appropriate scalar factors, which in particular make \(\overline{\Omega }_j({\varvec{J}})\) and \(\widetilde{\Omega }_j({\varvec{J}})\) independent of \(\rho \).

Now let \({\varvec{J}}\), g and \(\varvec{\alpha }_1,\ldots ,\varvec{\alpha }_k\) be independent random points in \({\mathcal U}\), F and \({\mathbb T}^d\), respectively, distributed according to \(\lambda \), \(\mu _F\) and \({\text {Leb}}_{{\mathbb T}^d}\). We will prove in Sect. 8 that the elements of the random set

$$\begin{aligned} \bigcup _{j=1}^k{\mathcal P}(g(\mathbb {Z}^d+\varvec{\alpha }_j),\overline{\Omega }_j({\varvec{J}})), \end{aligned}$$
(6.19)

ordered by size, form precisely the sequence of random variables \((\tau _i)_{i=1}^\infty \) in Theorem 2. Similarly the elements of

$$\begin{aligned} \bigcup _{j=1}^k{\mathcal P}(g(\mathbb {Z}^d+\varvec{\alpha }_j),\widetilde{\Omega }_j({\varvec{J}})), \end{aligned}$$
(6.20)

ordered by size, form the sequence of random variables \(({\widetilde{\tau }}_i)_{i=1}^\infty \). We will also see in the proof that, for any \(N\in \mathbb {Z}^+\), both \((\tau _1,\ldots ,\tau _N)\) and \((\widetilde{\tau }_1,\ldots ,\widetilde{\tau }_N)\) have continuous distributions, that is, the cumulative distribution functions \({\mathbb P}\bigl (\tau _n\le T_n\text { for }1\le n\le N\bigr )\) and \({\mathbb P}\bigl ({\widetilde{\tau }}_n\le T_n\text { for }1\le n\le N\bigr )\) depend continuously on \((T_n)\in \mathbb {R}_{>0}^N\).

One verifies easily that the above description generalizes the one in Sect. 3. Indeed, note that the image of the set F in (3.2) under the map

$$\begin{aligned} (x,y,\theta )\mapsto k_{\theta }\left( \begin{matrix} {\sqrt{y}} &{} 0 \\ 0 &{} 1/{\sqrt{y}} \end{matrix} \right) \left( \begin{array}{l@{\quad }l}1&{}0\\ x&{}1\end{array}\right) \end{aligned}$$
(6.21)

is a fundamental domain for \({\text {SL}}(2,\mathbb {R})/{\text {SL}}(2,\mathbb {Z})\), and the pushforward of the measure \(\mu _F\) in Sect. 3 gives the measure \(\mu _F\) considered in the present section. Note also that for \(d=2\), \({\mathfrak R}_j({\varvec{J}})\) is the \(1\times 1\) matrix with the single entry \(\varvec{u}_j({\varvec{J}})\cdot \varvec{v}({\varvec{J}})\) (cf. (6.16)), and now one checks that if \(\Omega _j=\{(s,{\varvec{J}})\, : \,{\varvec{J}}\in {\mathcal U},\,-\frac{1}{2}\ell _j ({\varvec{J}})<s<\frac{1}{2}\ell _j({\varvec{J}})\}\) then for any affine Euclidean lattice \({\mathcal L}\), the cut-and-project set \({\mathcal P}({\mathcal L},\overline{\Omega }_j({\varvec{J}}))\) equals \({\mathcal P}({\mathcal L},L_j({\varvec{J}}))\), and similarly \({\mathcal P}({\mathcal L},\widetilde{\Omega }_j({\varvec{J}}))\) equals \({\mathcal P}({\mathcal L},\widetilde{L}_j({\varvec{J}}))\) (cf. (3.6) and (6.14)).

Finally let us point out three invariance properties of the limit distributions. First, both \((\tau _i)_{i=1}^\infty \) and \(({\widetilde{\tau }}_i)_{i=1}^\infty \) yield stationary point processes, i.e. the random set of time points \(\{\tau _i\}\) has the same distribution as \(\{\tau _i-t\}\cap \mathbb {R}_{>0}\) for every fixed \(t\ge 0\), and similarly for \(\{{\widetilde{\tau }}_i\}\). This is clear from the explicit description above, using in particular the fact that Lebesgue measure on the torus \({\mathbb T}^d\) is invariant under any translation. Secondly, by the same argument, the distributions of \((\tau _i)_{i=1}^\infty \) and \(({\widetilde{\tau }}_i)_{i=1}^\infty \) are not affected by any leaf-wise translation of any of the sets \(\Omega _j\), i.e. by replacing \(\Omega _j\) with the set \(\{(\varvec{x}+\varvec{g}({\varvec{J}}),{\varvec{J}})\, : \,(\varvec{x},{\varvec{J}})\in \Omega _j\}\), where \(\varvec{g}\) is any bounded continuous function from \({\mathcal U}\) to \(\mathbb {R}^{d-1}\). Thirdly, we point out the identity

$$\begin{aligned} {\mathcal P}\biggl (\left( \begin{matrix} {h}^{-1} &{} \mathbf {0}^\mathrm {t} \\ \mathbf {0} &{} H \end{matrix} \right) {\mathcal L},H\Omega \biggr ) ={h}^{-1}{\mathcal P}({\mathcal L},\Omega ), \end{aligned}$$
(6.22)

which holds for any \({\mathcal L}\) and \(\Omega \) as in (6.14), and any \(H\in {\text {GL}}_{d-1}(\mathbb {R})\) with \(h=\det H>0\). Note also that the map

$$\begin{aligned} g{\text {SL}}(d,\mathbb {Z})\mapsto \left( \begin{matrix} {h}^{-1} &{} \mathbf {0}^\mathrm {t} \\ \mathbf {0} &{} H \end{matrix} \right) g{\text {SL}}(d,\mathbb {Z}) \end{aligned}$$
(6.23)

is a measure-preserving transformation of \({\text {SL}}(d,\mathbb {R})/{\text {SL}}(d,\mathbb {Z})\) onto itself. For \(d=2\) these two facts immediately lead to the formula (3.11) in Sect. 3. For general \(d\ge 2\), the same facts imply, for example, that if \(\varvec{u}_1=\cdots =\varvec{u}_k\) then the limit random sequences \((\tau _i)_{i=1}^\infty \) and \(({\widetilde{\tau }}_i)_{i=1}^\infty \) are not affected if \(\Omega _j\) is replaced by \(\{(H_1\varvec{x},{\varvec{J}})\, : \,(\varvec{x},{\varvec{J}})\in \Omega _j\}\) simultaneously for all j, where \(H_1\) is any fixed \((d-1)\times (d-1)\) matrix with positive determinant. Indeed, the given replacement has the effect that both \(\overline{\sigma }^{(k)}({\varvec{J}})\) and \(\overline{\sigma }^{(k)}_\lambda \) are multiplied by the constant \((\det H_1)^{-1}\); thus both \(\overline{\Omega }_j({\varvec{J}})\) and \(\widetilde{\Omega }_j({\varvec{J}})\) get transformed by the linear map \(H:=(\det H_1)^{-1/(d-1)}\widetilde{{\mathfrak R}}_j({\varvec{J}})H_1\widetilde{{\mathfrak R}}_j({\varvec{J}})^{-1}\), which has determinant 1 and is independent of j since \(\varvec{u}_1({\varvec{J}})=\cdots =\varvec{u}_k({\varvec{J}})\); hence the statement follows from the two facts noted above.
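To illustrate the scaling identity (6.22) concretely, here is a minimal numerical sketch. It assumes the cut-and-project description suggested by the discussion around (6.14): \({\mathcal P}({\mathcal L},\Omega )\) is taken to be the set of first coordinates of the points of \({\mathcal L}\) whose remaining \(d-1\) coordinates lie in the window \(\Omega \); a linear lattice, a box window, and a finite range of lattice points are used purely for concreteness.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
# a lattice basis with determinant normalized to 1 (illustrative data)
B = rng.normal(size=(d, d))
if np.linalg.det(B) < 0:
    B[0] *= -1.0
B /= np.linalg.det(B) ** (1.0 / d)

def cut_and_project(basis, window, R=5):
    # first coordinates of lattice points whose last d-1 coordinates lie in the window
    pts = []
    for m in np.ndindex(*([2 * R + 1] * d)):
        p = basis @ (np.array(m, dtype=float) - R)
        if window(p[1:]):
            pts.append(p[0])
    return np.sort(np.array(pts))

box = lambda z: bool(np.all(np.abs(z) < 0.7))   # window Omega in R^{d-1}

# internal-space map H with h = det H > 0, and the block matrix from (6.22)
H = rng.normal(size=(d - 1, d - 1))
if np.linalg.det(H) < 0:
    H[0] *= -1.0
h = np.linalg.det(H)
T = np.zeros((d, d))
T[0, 0] = 1.0 / h
T[1:, 1:] = H

left = cut_and_project(T @ B, lambda z: box(np.linalg.solve(H, z)))  # P(T L, H Omega)
right = cut_and_project(B, box) / h                                  # h^{-1} P(L, Omega)
scaling_ok = left.shape == right.shape and np.allclose(left, right)
```

Both sides enumerate the same finite set of lattice points, so the two sorted lists agree up to floating-point error.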

7 An Application of Ratner’s Theorem

In this section we will introduce a homogeneous space \(G/\Gamma \) which parametrizes k-tuples of translates of a common lattice, such as those appearing in (6.19) and (6.20), and then use Ratner’s classification of unipotent-flow invariant measures to prove an asymptotic equidistribution result in \(G/\Gamma \), Theorem 3, which will be a key ingredient for our proof of Theorem 2 in Sect. 8.

Let \({\text {SL}}(d,\mathbb {R})\) act on \((\mathbb {R}^d)^k\) through

$$\begin{aligned} M\varvec{v}=M(\varvec{v}_1,\ldots ,\varvec{v}_k)=(M\varvec{v}_1,\ldots ,M\varvec{v}_k), \end{aligned}$$
(7.1)

for \(\varvec{v}=(\varvec{v}_1,\ldots ,\varvec{v}_k)\in (\mathbb {R}^d)^k\) and \(M\in {\text {SL}}(d,\mathbb {R})\). Let G be the semidirect product

$$\begin{aligned} G={\text {SL}}(d,\mathbb {R})\ltimes (\mathbb {R}^d)^k, \end{aligned}$$

with multiplication law

$$\begin{aligned} (M,\varvec{\xi })(M',\varvec{\xi }')=(MM',\varvec{\xi }+M\varvec{\xi }'). \end{aligned}$$

We extend the action of \({\text {SL}}(d,\mathbb {R})\) to an action of G on \((\mathbb {R}^d)^k\), by defining

$$\begin{aligned} (M,\varvec{\xi })\varvec{v}:=M\varvec{v}+\varvec{\xi }\qquad \quad \text {for}\quad (M,\varvec{\xi })\in G,\,\varvec{v}\in (\mathbb {R}^d)^k. \end{aligned}$$
(7.2)
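As a quick sanity check (not part of the argument), the multiplication law and the action (7.2) can be verified numerically; the helper names below are ours, with an element \((M,\varvec{\xi })\) represented as a pair of arrays and a k-tuple of vectors as a \(d\times k\) array of columns.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 3, 2

def rand_sl(n):
    # a random matrix rescaled to determinant 1
    M = rng.normal(size=(n, n))
    if np.linalg.det(M) < 0:
        M[0] *= -1.0
    return M / np.linalg.det(M) ** (1.0 / n)

def mul(g, gp):
    # (M, xi)(M', xi') = (M M', xi + M xi')
    M, xi = g
    Mp, xip = gp
    return (M @ Mp, xi + M @ xip)

def act(g, v):
    # (M, xi) v = M v + xi
    M, xi = g
    return M @ v + xi

g = (rand_sl(d), rng.normal(size=(d, k)))
gp = (rand_sl(d), rng.normal(size=(d, k)))
v = rng.normal(size=(d, k))

# (g g') v = g (g' v): the multiplication law makes (7.2) a group action
action_ok = np.allclose(act(mul(g, gp), v), act(g, act(gp, v)))
# inverse in G: (M, xi)^{-1} = (M^{-1}, -M^{-1} xi)
Minv = np.linalg.inv(g[0])
ginv = (Minv, -Minv @ g[1])
prod = mul(g, ginv)
inverse_ok = np.allclose(prod[0], np.eye(d)) and np.allclose(prod[1], 0.0)
```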

Set \(\Gamma ={\text {SL}}(d,\mathbb {Z})\ltimes (\mathbb {Z}^d)^k\) and \(X=G/\Gamma \). Let \(\mu _X\) be the (left and right) Haar measure on G, normalized so as to induce a probability measure on X, which we also denote by \(\mu _X\). We also set

$$\begin{aligned} D(\rho )=\text {diag}[\rho ^{d-1},\rho ^{-1},\ldots ,\rho ^{-1}]\in {\text {SL}}(d,\mathbb {R}),\qquad \rho >0, \end{aligned}$$

and

$$\begin{aligned} n_-(\varvec{x})=\left( \begin{matrix} 1 &{} \mathbf {0}^\mathrm {t} \\ \varvec{x} &{} 1_{d-1} \end{matrix} \right) \in {\text {SL}}(d,\mathbb {R}),\qquad \varvec{x}\in \mathbb {R}^{d-1}. \end{aligned}$$

We view \({\text {SL}}(d,\mathbb {R})\) as embedded in G through \(M\mapsto (M,\mathbf {0})\).
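The proof of Lemma 5 below uses the commutation relation \(n_-(\varvec{x})D(\rho )=D(\rho )n_-(\rho ^d\varvec{x})\); here is a minimal numerical check of this identity, with arbitrary illustrative data.

```python
import numpy as np

d, rho = 4, 0.3
x = np.random.default_rng(2).normal(size=d - 1)

def D(r):
    # D(r) = diag[r^{d-1}, r^{-1}, ..., r^{-1}]
    return np.diag([r ** (d - 1)] + [1.0 / r] * (d - 1))

def n_minus(y):
    # lower-triangular unipotent matrix with first column (1, y)^t
    N = np.eye(d)
    N[1:, 0] = y
    return N

# n_-(x) D(rho) = D(rho) n_-(rho^d x)
commute_ok = np.allclose(n_minus(x) @ D(rho), D(rho) @ n_minus(rho ** d * x))
```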

Theorem 3

Let \(M\in {\text {SL}}(d,\mathbb {R})\); let \({\mathcal U}\) be an open subset of \(\mathbb {R}^{d-1}\); let \(\varvec{\phi }:{\mathcal U}\rightarrow (\mathbb {R}^d)^k\) be a Lipschitz map, and let \(\lambda \) be a Borel probability measure on \({\mathcal U}\) which is absolutely continuous with respect to Lebesgue measure. Writing \(\varvec{\phi }(\varvec{v})=(\varvec{\phi }_1(\varvec{v}),\ldots ,\varvec{\phi }_k(\varvec{v}))\), we assume that for every \(\varvec{w}=(w_1,\ldots ,w_k)\in \mathbb {Z}^k\setminus \{\mathbf {0}\}\),

$$\begin{aligned} \lambda \biggl (\biggl \{\varvec{v}\in {\mathcal U}\, : \,\sum _{j=1}^k w_j\cdot \varvec{\phi }_j(\varvec{v})\in \mathbb {R}M^{-1}\left( \begin{matrix} 1 \\ -\varvec{v} \end{matrix} \right) +\mathbb {Q}^d\biggr \}\biggr )=0. \end{aligned}$$
(7.3)

Then for any bounded continuous function \(f:X\rightarrow \mathbb {R}\),

$$\begin{aligned} \lim _{\rho \rightarrow 0}\int _{{\mathcal U}} f\bigl (D(\rho ) n_-(\varvec{v}) M (1_d,\varvec{\phi }(\varvec{v}))\bigr )\,d\lambda (\varvec{v}) =\int _{X}f(g)\,d\mu _X(g). \end{aligned}$$
(7.4)

Remark 7.1

For related results on equidistribution of expanding translates of curves, cf. Shah [40, Theorem 1.2].

Remark 7.2

The proof of Theorem 3 extends trivially to the more general situation when \(\Gamma \) is a subgroup of \({\text {SL}}(d,\mathbb {Z})\ltimes (\mathbb {Z}^d)^k\) of finite index. In this form, Theorem 3 contains Elkies and McMullen [18, Theorem 2.2] as a special case. Indeed, applying Theorem 3 with \(d=2\), \(k=1\), \(M=\bigl ( {\begin{matrix} 0 &{} -1 \\ 1 &{} 0 \end{matrix}} \bigr ) \), \(\varphi (v)=\left( \begin{matrix} x(v)+vy(v) \\ y(v) \end{matrix} \right) \) and \(f(g):=f_0(M^{-1}g)\), where \(f_0:X\rightarrow \mathbb {R}\) is an arbitrary bounded continuous function, and noting that \(D(\rho ) n_-(v)M(1_2,\varphi (v))=MD(\rho ^{-1})\biggl (\left( \begin{matrix} 1 &{} -v \\ 0 &{} 1 \end{matrix} \right) ,\left( \begin{matrix} x(v) \\ y(v) \end{matrix} \right) \biggr )\), we obtain

$$\begin{aligned} \lim _{s\rightarrow \infty }\int _{{\mathcal U}} f_0\left( D(s) \left( \left( \begin{matrix} 1 &{} -v \\ 0 &{} 1 \end{matrix} \right) ,\left( \begin{matrix} x(v) \\ y(v) \end{matrix} \right) \right) \right) \,d\lambda (v) =\int _{X}f_0(g)\,d\mu _X(g), \end{aligned}$$

provided that

$$\begin{aligned} \lambda \bigl (\bigl \{v\in {\mathcal U}\, : \,x(v)\in \mathbb {Q}+\mathbb {Q}v\bigr \}\bigr )=0. \end{aligned}$$

Our proof of Theorem 3 follows the same basic strategy as the proof of Theorem 2.2 in [18], although several new complications arise.
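The matrix identity invoked in Remark 7.2 is elementary but easy to get wrong by hand; the following sketch checks it numerically for random data, representing an element \((M,\varvec{\xi })\) of G as a (matrix, translation) pair.

```python
import numpy as np

rng = np.random.default_rng(3)
rho, v = 0.2, rng.normal()
x, y = rng.normal(), rng.normal()

D = lambda r: np.diag([r, 1.0 / r])                    # D(r) for d = 2
n_minus = lambda t: np.array([[1.0, 0.0], [t, 1.0]])
M = np.array([[0.0, -1.0], [1.0, 0.0]])
phi = np.array([x + v * y, y])                         # phi(v) from Remark 7.2

# left-hand side: D(rho) n_-(v) M (1_2, phi(v)), as a (matrix, translation) pair
A = D(rho) @ n_minus(v) @ M
lhs = (A, A @ phi)
# right-hand side: M D(rho^{-1}) ([[1, -v], [0, 1]], (x, y)^t)
B0 = M @ D(1.0 / rho)
rhs = (B0 @ np.array([[1.0, -v], [0.0, 1.0]]), B0 @ np.array([x, y]))

identity_ok = np.allclose(lhs[0], rhs[0]) and np.allclose(lhs[1], rhs[1])
```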

Remark 7.3

Theorem 3 also generalizes [34, Theorem 5.2], which is obtained by taking \(k=1\) and \(\varvec{\phi }(\varvec{v})=\varvec{\phi }\) a constant vector independent of \(\varvec{v}\). Indeed, note that the condition (7.3) in this case is equivalent to \(\varvec{\phi }\notin \mathbb {Q}^d\). (To translate into the setting of [34], where vectors are represented as row matrices and one considers \(\Gamma \backslash G\) in place of \(G/\Gamma \), apply the map \((M,\varvec{\xi })\mapsto (M^\mathrm {t},\varvec{\xi }^\mathrm {t})\).)

We now give the proof of Theorem 3. Let \(M,{\mathcal U},\varvec{\phi },\lambda \) satisfy all the assumptions of Theorem 3. As an initial reduction, let us note that by a standard approximation argument where one removes from \({\mathcal U}\) a subset of small \(\lambda \)-measure, we may in fact assume that \({\mathcal U}\) is bounded, and furthermore that there is a constant \(B>0\) such that \(\lambda (A)\le B{\text {Leb}}(A)\) for every Borel set \(A\subset {\mathcal U}\). (We will only use these properties in the proof of Lemma 9 below.)

For each \(\rho >0\), let \(\mu _\rho \) be the probability measure on X defined by

$$\begin{aligned} \mu _\rho (f)=\int _{{\mathcal U}} f\bigl (D(\rho )n_-(\varvec{v})M(1_d,\varvec{\phi }(\varvec{v}))\bigr )\,d\lambda (\varvec{v}), \qquad f\in {\text {C}}_c(X). \end{aligned}$$
(7.5)

Our task is to prove that \(\mu _\rho \) converges weakly to \(\mu _X\) as \(\rho \rightarrow 0\). In fact it suffices to prove that \(\mu _\rho (f)\rightarrow \mu _X(f)\) holds for every f in \({\text {C}}_c(X)\), the space of compactly supported continuous functions on X. Recall that the unit ball in the dual space of \({\text {C}}_c(X)\) is compact in the weak* topology (Alaoglu’s Theorem). Hence by a standard subsequence argument, it suffices to prove that every weak* limit of \((\mu _\rho )\) as \(\rho \rightarrow 0\) must equal \(\mu _X\). Thus from now on, we let \(\mu \) be a weak* limit of \((\mu _\rho )\), i.e. \(\mu \) is a Borel measure (a priori not necessarily a probability measure) on X, and we have \(\mu _{\rho _j}(f)\rightarrow \mu (f)\) for every \(f\in {\text {C}}_c(X)\), where \((\rho _j)\) is a fixed sequence of positive numbers tending to 0. Our task is to prove \(\mu =\mu _X\).

Let \(\pi :G\rightarrow {\text {SL}}(d,\mathbb {R})\) be the projection \((M,\varvec{\xi })\mapsto M\); this map induces a projection \(X\rightarrow X':={\text {SL}}(d,\mathbb {R})/{\text {SL}}(d,\mathbb {Z})\) which we also call \(\pi \). Let \(\mu _{X'}\) be the unique \({\text {SL}}(d,\mathbb {R})\) invariant probability measure on \(X'\).

Lemma 4

\(\pi _*\mu =\mu _{X'}\).

Proof

For any \(f\in {\text {C}}_c(X')\) we have

$$\begin{aligned} \pi _*\mu (f)=\lim _{j\rightarrow \infty }\mu _{\rho _j}(f\circ \pi )= \lim _{j\rightarrow \infty }\int _{\mathcal U}f\bigl (D(\rho _j)n_-(\varvec{v})M\bigr )\,d\lambda (\varvec{v}) =\mu _{X'}(f). \end{aligned}$$
(7.6)

For the last equality, cf., e.g., [28, Prop. 2.2.1]. (The point here is that f is averaged along expanding translates of a horospherical subgroup, and such translates can be proved to become asymptotically equidistributed using the so-called thickening method, originally introduced in the 1970 thesis of Margulis [31].) \(\square \)

Lemma 5

\(\mu \) is invariant under \(n_-(\varvec{x})\) for every \(\varvec{x}\in \mathbb {R}^{d-1}\).

Proof

(Cf. [18, Theorem 2.5].) Let \(\lambda '\in \mathrm{L}^1(\mathbb {R}^{d-1})\) be the Radon-Nikodym derivative of \(\lambda \) with respect to Lebesgue measure (thus \(\lambda '(\varvec{v})=0\) for \(\varvec{v}\notin {\mathcal U}\)). Let \(f\in {\text {C}}_c(X)\) and \(\varvec{x}\in \mathbb {R}^{d-1}\) be given, and define \(f_1\in {\text {C}}_c(X)\) through \(f_1(p)=f(n_-(\varvec{x})p)\). Then our task is to prove that \(\mu (f_1)=\mu (f)\), viz., to prove that the difference

$$\begin{aligned}&\int _{{\mathcal U}} f\bigl (n_-(\varvec{x})D(\rho _j)n_-(\varvec{v})M(1_d,\varvec{\phi }(\varvec{v}))\bigr )\lambda '(\varvec{v})\,d\varvec{v}\\&\quad -\int _{{\mathcal U}} f\Bigl (D(\rho _j)n_-(\varvec{w})M(1_d,\varvec{\phi }(\varvec{w}))\Bigr )\lambda '(\varvec{w})\,d\varvec{w}\end{aligned}$$

tends to 0 as \(j\rightarrow \infty \). Using \(n_-(\varvec{x})D(\rho _j)=D(\rho _j)n_-(\rho _j^d\varvec{x})\) and substituting \(\varvec{v}=\varvec{w}-\rho _j^d\varvec{x}\) in the first integral, the difference can be rewritten as

$$\begin{aligned}&\int _{({\mathcal U}+\rho _j^d\varvec{x})\cap {\mathcal U}} \Bigl (f\bigl (D(\rho _j)n_-(\varvec{w})M(1_d,\varvec{\phi }(\varvec{w}-\rho _j^d\varvec{x}))\bigr )\nonumber \\&\quad -f\bigl (D(\rho _j)n_-(\varvec{w})M(1_d,\varvec{\phi }(\varvec{w}))\bigr )\Bigr )\lambda '(\varvec{w})\,d\varvec{w}\nonumber \\&\quad +\int _{{\mathcal U}+\rho _j^d\varvec{x}} f\bigl (D(\rho _j)n_-(\varvec{w})M(1_d,\varvec{\phi }(\varvec{w}-\rho _j^d\varvec{x}))\bigr ) \bigl (\lambda '(\varvec{w}-\rho _j^d\varvec{x})-\lambda '(\varvec{w})\bigr )\,d\varvec{w}\nonumber \\&\quad -\int _{{\mathcal U}\setminus ({\mathcal U}+\rho _j^d\varvec{x})} f\bigl (D(\rho _j)n_-(\varvec{w})M(1_d,\varvec{\phi }(\varvec{w}))\bigr )\lambda '(\varvec{w})\,d\varvec{w}. \end{aligned}$$
(7.7)

The absolute value of this expression is bounded above by

$$\begin{aligned}&\sup _{\varvec{w}\in ({\mathcal U}+\rho _j^d\varvec{x})\cap {\mathcal U}}\, \Bigl |f\bigl (D(\rho _j)n_-(\varvec{w})M(1_d,\varvec{\phi }(\varvec{w}-\rho _j^d\varvec{x}))\bigr ) -f\bigl (D(\rho _j)n_-(\varvec{w})M(1_d,\varvec{\phi }(\varvec{w}))\bigr )\Bigr | \nonumber \\&\qquad +\Bigl (\sup _{X}|f|\Bigr )\int _{\mathbb {R}^{d-1}}\bigl |\lambda '(\varvec{w}-\rho _j^d\varvec{x}) -\lambda '(\varvec{w})\bigr |\,d\varvec{w}. \end{aligned}$$
(7.8)

By assumption, there exists \(C>0\) such that \(\Vert \varvec{\phi }(\varvec{w}')-\varvec{\phi }(\varvec{w})\Vert \le C\Vert \varvec{w}'-\varvec{w}\Vert \) for all \(\varvec{w},\varvec{w}'\in {\mathcal U}\), where in the left hand side \(\Vert \cdot \Vert \) is the standard Euclidean norm on \((\mathbb {R}^d)^k\). In particular for any \(\varvec{w}\in ({\mathcal U}+\rho _j^d\varvec{x})\cap {\mathcal U}\) we have \(\varvec{\phi }(\varvec{w}-\rho _j^d\varvec{x})=\varvec{\phi }(\varvec{w})+\varvec{\eta }\) for some \(\varvec{\eta }=\varvec{\eta }(\varvec{w},j)\in (\mathbb {R}^d)^k\) satisfying \(\Vert \varvec{\eta }\Vert \le C\rho _j^{d}\Vert \varvec{x}\Vert \), and thus

$$\begin{aligned} D(\rho _j)n_-(\varvec{w})M(1_d,\varvec{\phi }(\varvec{w}-\rho _j^d\varvec{x}))&=D(\rho _j)n_-(\varvec{w})M(1_d,\varvec{\phi }(\varvec{w})+\varvec{\eta }) \\&=\bigl (1_d,D(\rho _j)n_-(\varvec{w})M\varvec{\eta }\bigr )\,D(\rho _j)\, n_-(\varvec{w})\,M\,\bigl (1_d,\varvec{\phi }(\varvec{w})\bigr ). \end{aligned}$$

Now if \(M\varvec{\eta }=(\varvec{\eta }_1',\ldots ,\varvec{\eta }_k')\) and \(\varvec{\eta }_\ell '=(\eta _{\ell ,1}',\ldots ,\eta _{\ell ,d}')^\mathrm {t}\) for each \(\ell \), then the \(\ell \)th component of \(D(\rho _j)n_-(\varvec{w})M\varvec{\eta }\) equals \(\rho _j^{-1}\eta '_{\ell ,1}\left( \begin{matrix} \rho _j^{d} \\ \varvec{w} \end{matrix} \right) +\rho _j^{-1} (0,\eta '_{\ell ,2},\cdots ,\eta '_{\ell ,d})^\mathrm {t}\). Since \(\Vert M\varvec{\eta }\Vert \ll _{C,M}\rho _j^d\Vert \varvec{x}\Vert \), the element \((1_d,D(\rho _j)n_-(\varvec{w})M\varvec{\eta })\) tends to the identity in G as \(j\rightarrow \infty \), uniformly over all \(\varvec{w}\in ({\mathcal U}+\rho _j^d\varvec{x})\cap {\mathcal U}\). But f is uniformly continuous since \(f\in {\text {C}}_c(X)\); hence it follows that the first term in the right hand side of (7.8) tends to zero as \(j\rightarrow \infty \). Also the second term tends to zero; cf., e.g., [19, Prop. 8.5]. This completes the proof of the lemma. \(\square \)
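The componentwise formula for \(D(\rho _j)n_-(\varvec{w})M\varvec{\eta }\) used in the proof above can be double-checked numerically; the sketch below verifies it for a single component \(\varvec{\eta }'_\ell \) of \(M\varvec{\eta }\), with arbitrary data.

```python
import numpy as np

rng = np.random.default_rng(5)
d, rho = 4, 0.25
w = rng.normal(size=d - 1)
eta_p = rng.normal(size=d)      # one component eta'_l of M eta

D = np.diag([rho ** (d - 1)] + [1.0 / rho] * (d - 1))
n_minus = np.eye(d)
n_minus[1:, 0] = w

lhs = D @ n_minus @ eta_p
# rho^{-1} eta'_{l,1} (rho^d, w)^t + rho^{-1} (0, eta'_{l,2}, ..., eta'_{l,d})^t
rhs = (eta_p[0] / rho) * np.concatenate(([rho ** d], w)) \
      + (1.0 / rho) * np.concatenate(([0.0], eta_p[1:]))
component_ok = np.allclose(lhs, rhs)
```

Since \(\Vert M\varvec{\eta }\Vert \ll \rho _j^d\), every term on the right carries at least a factor \(\rho _j^{d-1}\), consistent with the convergence to the identity claimed in the proof.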

Since \(\mu \) is \(n_-(\mathbb {R}^{d-1})\)-invariant, we can apply ergodic decomposition to \(\mu \): Let \({\mathcal E}\) be the set of ergodic \(n_-(\mathbb {R}^{d-1})\)-invariant probability measures on X, provided with its usual Borel \(\sigma \)-algebra; then there exists a unique Borel probability measure P on \({\mathcal E}\) such that

$$\begin{aligned} \mu =\int _{{\mathcal E}}\nu \,dP(\nu ). \end{aligned}$$
(7.9)

Cf., e.g., [49, Theorem 4.4]. Note that (7.9) together with Lemma 4 implies \(\mu _{X'}=\pi _*\mu =\int _{{\mathcal E}}\pi _*\nu \,dP(\nu )\), and for each \(\nu \in {\mathcal E}\), \(\pi _*\nu \) is an ergodic \(n_-(\mathbb {R}^{d-1})\)-invariant measure on \(X'\). Hence in fact \(\pi _*\nu =\mu _{X'}\) for P-almost all \(\nu \in {\mathcal E}\), by uniqueness of the ergodic decomposition of \(\mu _{X'}\).

Now fix an arbitrary \(\nu \in {\mathcal E}\) satisfying \(\pi _*\nu =\mu _{X'}\). We now apply Ratner’s classification of unipotent-flow invariant measures, [38, Thm 3], to \(\nu \). Let H be the closed (Lie) subgroup of G given by

$$\begin{aligned} H=\{g\in G\, : \,g_*\nu =\nu \}, \end{aligned}$$

where \(g_*\nu \) denotes the push-forward of \(\nu \) by the map \(x\mapsto gx\) on X (viz., \((g_*\nu )(B):=\nu (g^{-1}B)\) for any Borel set \(B\subset X\)). Note that

$$\begin{aligned} n_-(\mathbb {R}^{d-1})\subset H, \end{aligned}$$
(7.10)

by definition. The conclusion from [38, Thm 3] is that there is some \(g_0\in G\) such that \(\nu (Hg_0\Gamma /\Gamma )=1\). Note that in this situation the measure \(\nu _0:=g_{0*}^{-1}\nu \) is \(g_0^{-1}Hg_0\) invariant and \(\nu _0(g_0^{-1}Hg_0\Gamma /\Gamma )=1\). Hence under the standard identification of \(g_0^{-1}Hg_0\Gamma /\Gamma \) with the homogeneous space \(g_0^{-1}Hg_0/(\Gamma \cap g_0^{-1}Hg_0)\) (viz., \(h\Gamma \mapsto h(\Gamma \cap g_0^{-1}Hg_0)\) for \(h\in g_0^{-1}Hg_0\)), \(\nu _0\) is the unique invariant probability measure on \(g_0^{-1}Hg_0/(\Gamma \cap g_0^{-1}Hg_0)\), induced from a Haar measure on \(g_0^{-1}Hg_0\). In particular \(\Gamma \cap g_0^{-1}Hg_0\) is a lattice in \(g_0^{-1}Hg_0\), and both \(g_0^{-1}Hg_0\Gamma /\Gamma \) and \(Hg_0\Gamma /\Gamma \) are closed subsets of X (cf. also [37, Theorem 1.13]); furthermore \({\text {supp}}(\nu )=Hg_0\Gamma /\Gamma \).

Lemma 6

In this situation, \(\pi (H)={\text {SL}}(d,\mathbb {R})\).

Proof

(Cf. [18, Theorem 2.8].) We have \(\pi ({\text {supp}}\nu )={\text {supp}}\pi _*\nu \), since \(\pi :X\rightarrow X'\) has compact fibers, and \({\text {supp}}\pi _*\nu =X'\), since we are assuming \(\pi _*\nu =\mu _{X'}\). Also \({\text {supp}}\nu =Hg_0\Gamma /\Gamma \). Hence \(\pi (H)\pi (g_0){\text {SL}}(d,\mathbb {Z})={\text {SL}}(d,\mathbb {R})\), and thus \(\pi (H)={\text {SL}}(d,\mathbb {R})\). \(\square \)

In the next lemma we deduce from (7.10) and Lemma 6 an explicit presentation of H. For \(\varvec{\xi }=(\varvec{\xi }_1,\ldots ,\varvec{\xi }_k)\in (\mathbb {R}^d)^k\) and \(\varvec{u}=(u_1,\ldots ,u_k)\in \mathbb {R}^k\), let us introduce the notation

$$\begin{aligned} \varvec{\xi }\cdot \varvec{u}:=\sum _{j=1}^k u_j\varvec{\xi }_j\in \mathbb {R}^d. \end{aligned}$$

Given any linear subspace \(U\subset \mathbb {R}^k\), we let L(U) be the linear subspace consisting of all \(\varvec{\xi }\in (\mathbb {R}^d)^k\) satisfying \(\varvec{\xi }\cdot \varvec{u}=\mathbf {0}\) for all \(\varvec{u}\in U^\perp \), where \(U^\perp \) is the orthogonal complement of U in \(\mathbb {R}^k\) with respect to the standard inner product. (It is natural to identify \(\varvec{\xi }=(\varvec{\xi }_1,\ldots ,\varvec{\xi }_k)\) with the \(d\times k\)-matrix with columns \(\varvec{\xi }_1,\ldots ,\varvec{\xi }_k\); then \(\varvec{\xi }\cdot \varvec{u}\) is simply matrix multiplication, and L(U) is the space of all \(d\times k\)-matrices such that every row vector is in U.) Note that L(U) is closed under multiplication from the left by any \({\text {SL}}(d,\mathbb {R})\)-matrix. Hence the following is a closed Lie subgroup of G:

$$\begin{aligned} H_U:={\text {SL}}(d,\mathbb {R})\ltimes L(U). \end{aligned}$$

Let \(\varvec{e}_1=(1,0,\ldots ,0)^\mathrm {t}\in \mathbb {R}^d\). Then \(\varvec{e}_1^\perp =\{(0,\xi _2,\ldots ,\xi _d)^\mathrm {t}\, : \,\xi _j\in \mathbb {R}\}\), and \((\varvec{e}_1^\perp )^k\) is a linear subspace of \((\mathbb {R}^d)^k\).
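The two descriptions of L(U) above (vanishing of \(\varvec{\xi }\cdot \varvec{u}\) for all \(\varvec{u}\in U^\perp \), versus every row of the \(d\times k\) matrix \(\varvec{\xi }\) lying in U) can be checked against each other numerically; the dimensions below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
d, k, r = 3, 5, 2
U_basis = rng.normal(size=(r, k))            # rows span a subspace U of R^k
xi = rng.normal(size=(d, r)) @ U_basis       # every row of xi lies in U

# rows of Vt beyond the rank span the orthogonal complement U^perp
_, _, Vt = np.linalg.svd(U_basis)
U_perp = Vt[r:]

# xi . u = sum_j u_j xi_j is the matrix product xi @ u; it vanishes on U^perp
in_LU = np.allclose(xi @ U_perp.T, 0.0)
```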

Lemma 7

There exist \(U\subset \mathbb {R}^k\) and \(\varvec{\xi }\in (\varvec{e}_1^\perp )^k\) such that \(H=(1_d,\varvec{\xi })H_U(1_d,\varvec{\xi })^{-1}\).

Proof

Set \(V=\{\varvec{\xi }\in (\mathbb {R}^d)^k\, : \,(1_d,\varvec{\xi })\in H\}\); this is a closed subgroup of \(\langle (\mathbb {R}^d)^k,+\rangle \), and it follows using Lemma 6 that V is \({\text {SL}}(d,\mathbb {R})\)-invariant, i.e. \(M\varvec{\xi }\in V\) whenever \(M\in {\text {SL}}(d,\mathbb {R})\) and \(\varvec{\xi }\in V\). Let \(\mathfrak {sl}(d,\mathbb {R})\) be the Lie algebra of \({\text {SL}}(d,\mathbb {R})\), i.e. the Lie algebra of \(d\times d\) matrices with trace 0. Then for every \({\varvec{\xi }}\in V\), \(A\in \mathfrak {sl}(d,\mathbb {R})\) and \(n\in \mathbb {Z}^+\) we have \(n(\exp (n^{-1}A){\varvec{\xi }}-{\varvec{\xi }})\in V\), and since V is closed, letting \(n\rightarrow \infty \) we obtain \(A{\varvec{\xi }}\in V\). Using the formula \(E_{ij}E_{ji}=E_{ii}\), where \(E_{ij}\) denotes the \(d\times d\) matrix which has (ij)th entry 1 and all other entries 0, the last invariance is upgraded to: \(A{\varvec{\xi }}\in V\) for any real \(d\times d\)-matrix A and \(\varvec{\xi }\in V\). This is easily seen to imply \(V=L(U)\) for some subspace \(U\subset \mathbb {R}^k\). Thus

$$\begin{aligned} N=H\cap \pi ^{-1}(\{1_d\})=\{1_d\}\ltimes L(U). \end{aligned}$$

This is a normal subgroup of G. Given any \(M\in {\text {SL}}(d,\mathbb {R})\), by Lemma 6 there exists some \(\varvec{\xi }\in (\mathbb {R}^d)^k\) such that \(h:=(M,\varvec{\xi })\in H\), and then \(H\cap \pi ^{-1}(\{M\})=Nh\). Using also the fact that \((\mathbb {R}^d)^k=L(U)\oplus L(U^\perp )\) it follows that for each \(M\in {\text {SL}}(d,\mathbb {R})\) there is a unique \(\varvec{\eta }\in L(U^\perp )\) such that \((M,\varvec{\eta })\in H\). Hence if we let \(H'\) be the closed Lie subgroup of \(H_{U^\perp }\) given by

$$\begin{aligned} H':=H\cap H_{U^\perp }, \end{aligned}$$

then \(H'\) contains exactly one element above each \(M\in {\text {SL}}(d,\mathbb {R})\), and \(H=NH'=H'N\). Note that the unipotent radical of \(H_{U^\perp }={\text {SL}}(d,\mathbb {R})\ltimes L(U^\perp )\) equals \(\{1_d\}\ltimes L(U^\perp )\), and thus \(H'\) is a Levi subgroup of \(H_{U^\perp }\). Hence by Malcev’s Theorem ([30]; [26, Ch. III.9]) there exists some \(\varvec{\xi }\in L(U^\perp )\) such that \(H'=(1_d,\varvec{\xi }){\text {SL}}(d,\mathbb {R})(1_d,\varvec{\xi })^{-1}\). (Recall that we view \({\text {SL}}(d,\mathbb {R})\) as embedded in G through \(M\mapsto (M,\mathbf {0})\).) Hence

$$\begin{aligned} H=NH'= (1_d,\varvec{\xi })H_U(1_d,\varvec{\xi })^{-1}. \end{aligned}$$

Finally using (7.10) we see that \(\varvec{\xi }\) must lie in \((\varvec{e}_1^\perp )^k\). \(\square \)

Next, for any linear subspace \(U\subset \mathbb {R}^k\), \(q\in \mathbb {Z}^+\) and \(\varvec{\xi }\in (\varvec{e}_1^\perp )^k\), we set

$$\begin{aligned} {\mathcal X}_{U,q,\varvec{\xi }}=\{g\Gamma \, : \,g\in G,\, g^{-1}\varvec{\xi }\in L(U)+q^{-1}(\mathbb {Z}^d)^k\}\subset X. \end{aligned}$$
(7.11)

Note here that the set \(L(U)+q^{-1}(\mathbb {Z}^d)^k\) is invariant under the action of \(\Gamma \); hence if \(g^{-1}\varvec{\xi }\in L(U)+q^{-1}(\mathbb {Z}^d)^k\) then also \((g\gamma )^{-1}\varvec{\xi }\in L(U)+q^{-1}(\mathbb {Z}^d)^k\) for every \(\gamma \in \Gamma \). Note also that if U intersects \(\mathbb {Z}^k\) in a lattice (viz., \(\mathbb {Z}^k\cap U\) contains an \(\mathbb {R}\)-linear basis for U), then \(L(U)+q^{-1}(\mathbb {Z}^d)^k\) is a closed subset of \((\mathbb {R}^d)^k\), and it follows that \({\mathcal X}_{U,q,\varvec{\xi }}\) is a closed subset of X.

Lemma 8

There exist \(q\in \mathbb {Z}^+\) and \(\varvec{\xi }\in (\varvec{e}_1^\perp )^k\), and a linear subspace \(U\subset \mathbb {R}^k\) which intersects \(\mathbb {Z}^k\) in a lattice, such that \({\text {supp}}(\nu )=Hg_0\Gamma /\Gamma \subset {\mathcal X}_{U,q,\varvec{\xi }}\).

Proof

Take \(U\subset \mathbb {R}^k\) and \(\varvec{\xi }\in (\varvec{e}_1^\perp )^k\) as in Lemma 7; then \(H=(1_d,\varvec{\xi })H_U(1_d,-\varvec{\xi })\). Now \(\Gamma \) intersects \(g_0^{-1}Hg_0\) in a lattice; hence if \(g=g_0^{-1}(1_d,\varvec{\xi })\) then \(g^{-1}\Gamma g\) intersects \(H_U\) in a lattice. Set \(\varvec{\xi }'=g_0^{-1}\varvec{\xi }\); then \(g=(M,\varvec{\xi }')=(1_d,\varvec{\xi }')M\) for some \(M\in {\text {SL}}(d,\mathbb {R})\), and since M normalizes \(H_U\), it follows that \(\widetilde{\Gamma }:=(1_d,\varvec{\xi }')^{-1}\Gamma (1_d,\varvec{\xi }')\cap H_U\) is a lattice in \(H_U\). By [37, Cor. 8.28], this implies that \(\widetilde{\Gamma }_r:=\{\varvec{v}\in L(U)\, : \,(1_d,\varvec{v})\in \widetilde{\Gamma }\}=(\mathbb {Z}^d)^k\cap L(U)\) is a lattice in L(U), and \(\pi (\widetilde{\Gamma })\) is a lattice in \({\text {SL}}(d,\mathbb {R})\). The first condition implies that \(\mathbb {Z}^k\cap U\) contains an \(\mathbb {R}\)-linear basis for U, i.e. U intersects \(\mathbb {Z}^k\) in a lattice. Next we compute

$$\begin{aligned} \pi (\widetilde{\Gamma }) =\{\gamma \in {\text {SL}}(d,\mathbb {Z})\, : \,(1_d-\gamma )\varvec{\xi }'\in L(U)+(\mathbb {Z}^d)^k\}. \end{aligned}$$

This is a subgroup of \({\text {SL}}(d,\mathbb {Z})\) and a lattice in \({\text {SL}}(d,\mathbb {R})\); hence \(\pi (\widetilde{\Gamma })\) must be a subgroup of finite index in \({\text {SL}}(d,\mathbb {Z})\). Now fix any \(\gamma \in \pi (\widetilde{\Gamma })\) for which \(1_d-\gamma \) is invertible (for example we can take \(\gamma \) as an appropriate integer power of any given hyperbolic element in \({\text {SL}}(d,\mathbb {Z})\)). Then \(1_d-\gamma \in {\text {GL}}(d,\mathbb {Q})\), and we conclude \(\varvec{\xi }'\in (1_d-\gamma )^{-1}(L(U)+(\mathbb {Z}^d)^k)\subset L(U)+(\mathbb {Q}^d)^k\), i.e. \(\varvec{\xi }'=\varvec{u}+q^{-1}\varvec{m}\) for some \(\varvec{u}\in L(U)\), \(q\in \mathbb {Z}_{>0}\) and \(\varvec{m}\in (\mathbb {Z}^d)^k\).

Now for every \(g\in Hg_0\Gamma \) we have \((1_d,-\varvec{\xi })g_0\Gamma g^{-1}(1_d,\varvec{\xi })\cap H_U\ne \emptyset \), i.e. there is some \(\gamma \in \Gamma \) such that \((1_d,-\varvec{\xi })g_0\gamma g^{-1}(1_d,\varvec{\xi })\mathbf {0}\in L(U)\), or equivalently \(g^{-1}\varvec{\xi }\in \gamma ^{-1} g_0^{-1}(1_d,\varvec{\xi })L(U)\). But we have \(g_0^{-1}(1_d,\varvec{\xi })=(M,\varvec{\xi }')=(M,\varvec{u}+q^{-1}\varvec{m})\) and hence \(\gamma ^{-1} g_0^{-1}(1_d,\varvec{\xi })L(U)=\gamma ^{-1}(L(U)+q^{-1}\varvec{m})\subset L(U)+q^{-1}(\mathbb {Z}^d)^k\). Hence every \(g\in Hg_0\Gamma \) satisfies \(g^{-1}\varvec{\xi }\in L(U)+q^{-1}(\mathbb {Z}^d)^k\), i.e. we have \(Hg_0\Gamma /\Gamma \subset {\mathcal X}_{U,q,\varvec{\xi }}\). \(\square \)

Recall that we have fixed \(\mu \) as an arbitrary weak* limit of \((\mu _\rho )\) as \(\rho \rightarrow 0\). The proof of the following Lemma 9 makes crucial use of the genericity assumption (7.3) in Theorem 3; later Lemma 9 combined with Lemma 8 will allow us to conclude that in the ergodic decomposition (7.9), we must have \(\nu =\mu _X\) for P-almost all \(\nu \).

Lemma 9

Let \(q\in \mathbb {Z}^+\) and let U be a linear subspace of \(\mathbb {R}^k\) of dimension \(<k\) which intersects \(\mathbb {Z}^k\) in a lattice. Then \(\mu \bigl (\cup _{\varvec{\xi }\in (\varvec{e}_1^\perp )^k}\,{\mathcal X}_{U,q,\varvec{\xi }}\bigr )=0\).

Proof

Let \({\mathcal B}_C^d\) be the closed ball of radius C in \(\mathbb {R}^d\) centered at the origin. It suffices to prove that for each \(C>0\), the set

$$\begin{aligned} {\mathcal X}_{U,q,C}:=\bigcup _{\varvec{\xi }\in ({\mathcal B}_C^d\cap \varvec{e}_1^\perp )^k} {\mathcal X}_{U,q,\varvec{\xi }}\subset X \end{aligned}$$
(7.12)

satisfies \(\mu \bigl ({\mathcal X}_{U,q,C}\bigr )=0\). Let \({\mathcal N}\) be the family of open subsets of G containing the identity element. Then for any \(\Omega \in {\mathcal N}\), \(\Omega {\mathcal X}_{U,q,C}\) is an open set in X containing \({\mathcal X}_{U,q,C}\). Hence, since \(\mu \) is a weak* limit of \((\mu _\rho )\) as \(\rho \rightarrow 0\) along some subsequence, it now suffices to prove that for every \(\varepsilon >0\) there exists some \(\Omega \in {\mathcal N}\) such that \(\limsup _{\rho \rightarrow 0}\mu _\rho (\Omega {\mathcal X}_{U,q,C})<\varepsilon \). We have \(g\Gamma \in {\mathcal X}_{U,q,C}\) if and only if the set \(g(L(U)+q^{-1}(\mathbb {Z}^d)^k)\) in \((\mathbb {R}^d)^k\) has some point in common with \(({\mathcal B}_C^d\cap \varvec{e}_1^\perp )^k\). The latter is a compact set, which for any \(\eta >0\) is contained in the open set \(V_\eta ^k\), where (after increasing C by 1)

$$\begin{aligned} V_\eta :=\bigl \{(\xi _1,\ldots ,\xi _d)^\mathrm {t}\, : \,|\xi _1|<\eta ,\, \Vert (\xi _2,\ldots ,\xi _d)\Vert <C\bigr \}\subset \mathbb {R}^d. \end{aligned}$$
(7.13)

Hence for every \(\eta >0\), there exists some \(\Omega \in {\mathcal N}\) such that

$$\begin{aligned} \Omega {\mathcal X}_{U,q,C}\subset {\mathcal X}_{U,q,C,\eta }:=\bigl \{g\Gamma \, : \,g(L(U)+q^{-1}(\mathbb {Z}^d)^k)\cap V_\eta ^k\ne \emptyset \bigr \}. \end{aligned}$$
(7.14)

Hence it now suffices to prove

$$\begin{aligned} \lim _{\eta \rightarrow 0}\limsup _{\rho \rightarrow 0}\mu _\rho ({\mathcal X}_{U,q,C,\eta })=0. \end{aligned}$$
(7.15)

By the definition of \(\mu _\rho \) we have \(\mu _\rho ({\mathcal X}_{U,q,C,\eta })=\lambda (T_\rho )\), where

$$\begin{aligned} T_\rho&=\bigl \{\varvec{v}\in {\mathcal U}\, : \,D(\rho )n_-(\varvec{v})M(1_d,\varvec{\phi }(\varvec{v}))\in {\mathcal X}_{U,q,C,\eta }\bigr \}\\&=\bigl \{\varvec{v}\in {\mathcal U}\, : \,D(\rho )n_-(\varvec{v})M(L(U)+q^{-1}(\mathbb {Z}^d)^k+\varvec{\phi }(\varvec{v}))\cap V_\eta ^k\ne \emptyset \bigr \}. \end{aligned}$$

It follows from our assumptions on U that there exists some \(\varvec{w}\in \mathbb {Z}^k\setminus \{\mathbf {0}\}\) such that U is contained in \(\varvec{w}^\perp \), the orthogonal complement of \(\varvec{w}\) in \(\mathbb {R}^k\). Now every \(\varvec{\xi }\in L(U)+q^{-1}(\mathbb {Z}^d)^k\) satisfies \(\varvec{\xi }\cdot \varvec{w}\in q^{-1}\mathbb {Z}^d\), and hence for any \(\varvec{v}\in {\mathcal U}\), every \(\varvec{\xi }\) in the set \(D(\rho )n_-(\varvec{v})M(L(U)+q^{-1}(\mathbb {Z}^d)^k+\varvec{\phi }(\varvec{v}))\) satisfies

$$\begin{aligned} \varvec{\xi }\cdot \varvec{w}\in D(\rho )n_-(\varvec{v})M(q^{-1}\mathbb {Z}^d+\varvec{\phi }(\varvec{v})\cdot \varvec{w}). \end{aligned}$$
(7.16)

But on the other hand, for every \(\varvec{\xi }\in V_\eta ^k\) we have

$$\begin{aligned} \varvec{\xi }\cdot \varvec{w}\in \Vert \varvec{w}\Vert V_\eta = \bigl \{(\xi _1,\ldots ,\xi _d)^\mathrm {t}\, : \,|\xi _1|<\Vert \varvec{w}\Vert \eta ,\, \Vert (\xi _2,\ldots ,\xi _d)\Vert <\Vert \varvec{w}\Vert C\bigr \}. \end{aligned}$$
(7.17)

Hence

$$\begin{aligned} T_\rho \subset \bigl \{\varvec{v}\in {\mathcal U}\, : \,D(\rho )n_-(\varvec{v})M(q^{-1}\mathbb {Z}^d+\varvec{\phi }(\varvec{v})\cdot \varvec{w})\cap \Vert \varvec{w}\Vert V_\eta \ne \emptyset \bigr \}. \end{aligned}$$
(7.18)

Therefore, if we alter the constant “C” appropriately in the definition of \(V_\eta \), we see that it now suffices to prove that

$$\begin{aligned} \lim _{\eta \rightarrow 0}\limsup _{\rho \rightarrow 0}\lambda \biggl (\bigcup _{\varvec{m}\in q^{-1}\mathbb {Z}^d}\widetilde{T}_{\rho }^{\varvec{m}}\biggr )=0, \end{aligned}$$
(7.19)

where

$$\begin{aligned} \widetilde{T}_{\rho }^{\varvec{m}} :&=\bigl \{\varvec{v}\in {\mathcal U}\, : \,D(\rho ) n_-(\varvec{v})M(\varvec{m}+\varvec{\phi }(\varvec{v})\cdot \varvec{w})\in V_\eta \bigr \}. \end{aligned}$$
(7.20)

For \(\varvec{v}\in \mathbb {R}^{d-1}\) and \(\varvec{u}=(u_1,\ldots ,u_d)^\mathrm {t}\in \mathbb {R}^d\) let us write \(\varvec{u}_\perp :=(u_2,\ldots ,u_d)^\mathrm {t}\in \mathbb {R}^{d-1}\) and \(\varvec{\ell }_{\varvec{v}}(\varvec{u})=u_1\varvec{v}+\varvec{u}_\perp \in \mathbb {R}^{d-1}\), so that \(n_-(\varvec{v})\varvec{u}=\left( \begin{matrix} {{\varvec{e}}_1} \cdot {\varvec{u}} \\ {\varvec{\ell }_{\varvec{v}}}(\varvec{u}) \end{matrix} \right) \). Then the set \(\widetilde{T}_{\rho }^{\varvec{m}}\) can be expressed as

$$\begin{aligned} \widetilde{T}_{\rho }^{\varvec{m}}=X_{\rho }^{\varvec{m}}\cap Y_{\rho }^{\varvec{m}}, \end{aligned}$$
(7.21)

where

$$\begin{aligned} X_{\rho }^{\varvec{m}}=\Bigl \{\varvec{v}\in {\mathcal U}\, : \,\varvec{\ell }_{\varvec{v}}(M(\varvec{m}+\varvec{\phi }(\varvec{v})\cdot \varvec{w}))\in {\mathcal B}_{C\rho }^{d-1}\Bigr \} \end{aligned}$$

and

$$\begin{aligned} Y_{\rho }^{\varvec{m}}=\Bigl \{\varvec{v}\in {\mathcal U}\, : \,\varvec{e}_1\cdot M(\varvec{m}+\varvec{\phi }(\varvec{v})\cdot \varvec{w}) \in (-\eta \rho ^{1-d},\eta \rho ^{1-d})\Bigr \}. \end{aligned}$$

Let us note that the genericity assumption (7.3) in Theorem 3 immediately implies that

$$\begin{aligned} \lim _{\rho \rightarrow 0}\lambda (X_{\rho }^{\varvec{m}}) =0\qquad \text {for each fixed}\, \varvec{m}\in q^{-1}\mathbb {Z}^d. \end{aligned}$$
(7.22)

Next, since \(\varvec{\phi }\) is Lipschitz and \({\mathcal U}\) is bounded (after the initial reduction at the start of the proof of Theorem 3), there exists a constant \(C_1>0\) such that for any \(\rho >0\) and \(\varvec{m}\in q^{-1}\mathbb {Z}^d\),

$$\begin{aligned} |\varvec{e}_1\cdot M\varvec{m}|>C_1\,\Rightarrow \, {\text {Leb}}\bigl (X_{\rho }^{\varvec{m}}\bigr )\ll \rho ^{d-1}|\varvec{e}_1\cdot M\varvec{m}|^{1-d}. \end{aligned}$$
(7.23)

(Here and in the rest of the proof, the implied constant in any \(\ll \) bound is allowed to depend on \(C,q,M,\varvec{w},\varvec{\phi }\), but not on \(\varvec{m},\eta ,\rho \).) Furthermore, increasing \(C_1\) if necessary, and assuming that \(\rho \) is so small that \(\eta \rho ^{1-d}\ge 1\) and \(C\rho <1\), we see that

$$\begin{aligned} |\varvec{e}_1\cdot M\varvec{m}|\ge C_1\eta \rho ^{1-d}\,\Rightarrow \, Y_{\rho }^{\varvec{m}}=\emptyset \end{aligned}$$
(7.24)

and

$$\begin{aligned} \Vert (M\varvec{m})_\perp \Vert \ge C_1\bigl (1+|\varvec{e}_1\cdot M\varvec{m}|\bigr )\,\Rightarrow \, X_{\rho }^{\varvec{m}}=\emptyset . \end{aligned}$$

Hence if we set

$$\begin{aligned}&A_1=\{\varvec{m}\in q^{-1}\mathbb {Z}^d\, : \,|\varvec{e}_1\cdot M\varvec{m}|<C_1\eta \rho ^{1-d}\}; \\&A_2=\{\varvec{m}\in q^{-1}\mathbb {Z}^d\, : \,|\varvec{e}_1\cdot M\varvec{m}|>C_1\}; \\&A_3=\{\varvec{m}\in q^{-1}\mathbb {Z}^d\, : \,\Vert (M\varvec{m})_\perp \Vert < C_1(1+|\varvec{e}_1\cdot M\varvec{m}|)\}, \end{aligned}$$

then for any \(0<\eta <1\) and \(0<\rho <\min (C^{-1},\eta ^{1/(d-1)})\), we have

$$\begin{aligned} \lambda \biggl (\bigcup _{\varvec{m}\in q^{-1}\mathbb {Z}^d}\widetilde{T}_{\rho }^{\varvec{m}}\biggr )\le \sum _{\varvec{m}\in A_1\cap A_3}\lambda \bigl (X_{\rho }^{\varvec{m}}\bigr ) \ll \sum _{\varvec{m}\in A_1\cap A_2\cap A_3}\rho ^{d-1}|\varvec{e}_1\cdot M\varvec{m}|^{1-d} +\sum _{\varvec{m}\in A_3\setminus A_2}\lambda (X_{\rho }^{\varvec{m}}). \end{aligned}$$

(In the last bound we used the fact that \(\lambda (A)\ll {\text {Leb}}(A)\) uniformly over all Borel sets \(A\subset {\mathcal U}\), because of our initial reduction at the start of the proof of Theorem 3.) Here \(A_3\setminus A_2\) is a finite set, and hence the last sum above tends to zero as \(\rho \rightarrow 0\), by (7.22). Finally, the set \(A_1\cap A_2\cap A_3\) can be covered by the dyadic pieces \(D_s=A_3\cap \{ 2^s C_1<|\varvec{e}_1\cdot M\varvec{m}|\le 2^{s+1}C_1\}\) with s running through \(0,1,\ldots ,S:=\lceil \log _2(\eta \rho ^{1-d})\rceil \). Here \(\# D_s\ll 2^{sd}\), and so

$$\begin{aligned} \sum _{\varvec{m}\in A_1\cap A_2\cap A_3}\rho ^{d-1}|\varvec{e}_1\cdot M\varvec{m}|^{1-d} \ll \rho ^{d-1}\sum _{s=0}^S 2^{sd}\cdot 2^{s(1-d)} \ll \rho ^{d-1}2^{S}\ll \eta . \end{aligned}$$

Taken together, these bounds prove that (7.19) holds, and the lemma is proved. \(\square \)
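The geometric-series computation in the last display is easy to check numerically. The following sketch (an illustration only, not part of the proof) evaluates the dyadic sum for a few values of d and S and confirms that it collapses to \(2^{S+1}-1\le 2\cdot 2^{S}\):

```python
# Sanity check of the dyadic summation: since 2^{sd} * 2^{s(1-d)} = 2^s,
# the sum over s = 0, ..., S is the geometric series 2^{S+1} - 1 <= 2 * 2^S,
# which yields rho^{d-1} * (sum) << rho^{d-1} * 2^S << eta.
def dyadic_sum(d, S):
    """Evaluate sum_{s=0}^{S} 2^(s*d) * 2^(s*(1-d)); exact in floating point,
    since every factor is a representable power of two."""
    return sum(2.0 ** (s * d) * 2.0 ** (s * (1 - d)) for s in range(S + 1))

for d in (2, 3, 5):
    for S in (0, 1, 10):
        assert dyadic_sum(d, S) == 2 ** (S + 1) - 1
```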

We are now in a position to complete the proof of Theorem 3.

Conclusion of the proof of Theorem 3

We wish to prove that our arbitrary weak* limit \(\mu \) necessarily equals \(\mu _X\). Assume the contrary, i.e. \(\mu \ne \mu _X\); then in the ergodic decomposition (7.9) we have \(P({\mathcal E}\setminus \{\mu _{X}\})>0\). Then, using Lemma 8 together with the fact that there are only countably many \(q\in \mathbb {Z}^+\) and countably many subspaces \(U\subset \mathbb {R}^k\) intersecting \(\mathbb {Z}^k\) in a lattice, it follows that there exists some such subspace U of dimension \(<k\), and some \(q\in \mathbb {Z}^+\), such that \(\mu \bigl (\,\bigcup \,\bigl \{{\mathcal X}_{U,q,\varvec{\xi }}\, : \,\varvec{\xi }\in (\varvec{e}_1^\perp )^k\bigr \}\bigr )>0\). This contradicts Lemma 9. Hence Theorem 3 is proved. \(\square \)

Next we note the following consequence of Theorem 3.

Corollary 10

Let \(M\in {\text {SL}}(d,\mathbb {R})\), let \({\mathcal U}\subset \mathbb {R}^{d-1}\) be an open subset and let \(E_1:{\mathcal U}\rightarrow {\text {SO}}(d)\) be a smooth map such that the map \(\varvec{x}\mapsto E_1(\varvec{x})^{-1}\varvec{e}_1\) from \({\mathcal U}\) to \({\text {S}}_1^{d-1}\) has a nonsingular differential at (Lebesgue-)almost all \(\varvec{x}\in {\mathcal U}\). Let \(\varvec{\phi }:{\mathcal U}\rightarrow (\mathbb {R}^d)^k\) be a Lipschitz map, and let \(\lambda \) be a Borel probability measure on \({\mathcal U}\), absolutely continuous with respect to Lebesgue measure. Assume that for every \(\varvec{w}=(w_1,\ldots ,w_k)\in \mathbb {Z}^k\setminus \{\mathbf {0}\}\),

$$\begin{aligned} \lambda \biggl (\biggl \{\varvec{x}\in {\mathcal U}\, : \,\sum _{j=1}^k w_j\cdot \varvec{\phi }_j(\varvec{x})\in \mathbb {R}M^{-1}E_1(\varvec{x})^{-1}\varvec{e}_1+\mathbb {Q}^d\biggr \}\biggr )=0. \end{aligned}$$
(7.25)

Then for any bounded continuous function \(f:X\times {\mathcal U}\rightarrow \mathbb {R}\),

$$\begin{aligned} \lim _{\rho \rightarrow 0}\int _{{\mathcal U}} f\bigl (D(\rho ) E_1(\varvec{x})M(1_d,\varvec{\phi }(\varvec{x})),\varvec{x}\bigr )\,d\lambda (\varvec{x}) =\int _{X\times {\mathcal U}}f(g,\varvec{x})\,d\mu _X(g)\,d\lambda (\varvec{x}). \end{aligned}$$
(7.26)

Proof

Let us first note that if (7.4) holds for every bounded continuous function \(f:X\rightarrow \mathbb {R}\), then by a standard approximation argument (cf. [34, proof of Theorem 5.3]), also the following more general limit statement holds: For each small \(\rho >0\), let \(f_\rho :X\times {\mathcal U}\rightarrow \mathbb {R}\) be a continuous function satisfying \(|f_\rho |<B\) where B is a fixed constant, and assume that \(f_\rho \rightarrow f\) as \(\rho \rightarrow 0\), uniformly on compacta, for some continuous function \(f:X\times {\mathcal U}\rightarrow \mathbb {R}\). Then

$$\begin{aligned} \lim _{\rho \rightarrow 0}\int _{{\mathcal U}} f_\rho \bigl (D(\rho ) n_-(\varvec{v}) M (1_d,\varvec{\phi }(\varvec{v})),\varvec{v}\bigr )\,d\lambda (\varvec{v}) =\int _{X\times {\mathcal U}}f(g,\varvec{v})\,d\mu _X(g)\,d\lambda (\varvec{v}). \end{aligned}$$
(7.27)

Now Corollary 10 follows by mimicking the proof of [34, Cor. 5.4], using (7.27) in place of [34, Theorem 5.3]. (Recall that we translate from the setting in [34] by applying the transpose map, which also reverses the order of multiplication. Following the proof of [34, Cor. 5.4], the task becomes to prove that \(D(\rho ) n_-({\widetilde{\varvec{x}}})E_1(\varvec{x}_0)M(1_d,\varvec{\phi }(\varvec{x}))\), for \(\varvec{x}\) in a fixed small neighborhood of an arbitrary point \(\varvec{x}_0\in {\mathcal U}\), becomes asymptotically equidistributed in X as \(\rho \rightarrow 0\). Here \({\widetilde{\varvec{x}}}=-c(\varvec{x})^{-1}\varvec{v}(\varvec{x})\) with \(c(\varvec{x})\) and \(\varvec{v}(\varvec{x})\) given by \(\begin{pmatrix}c(\varvec{x})\\ \varvec{v}(\varvec{x})\end{pmatrix}=E_1(\varvec{x}_0) E_1(\varvec{x})^{-1}\varvec{e}_1\). The condition for equidistribution, (7.3), then becomes

$$\begin{aligned} \lambda \biggl (\biggl \{\varvec{x}\, : \,\sum _{j=1}^k w_j\cdot \varvec{\phi }_j(\varvec{x})\in \mathbb {R}M^{-1}E_1(\varvec{x}_0)^{-1}\begin{pmatrix}1\\ \widetilde{\varvec{x}}\end{pmatrix} +\mathbb {Q}^d\biggr \}\biggr )=0, \end{aligned}$$

or equivalently, (7.25).) \(\square \)

Finally, from Corollary 10 we derive the following equidistribution result, which is more directly adapted to the proof of Theorem 2. Recall from Sect. 6 that we have fixed the map \(\varvec{v}\mapsto R_{\varvec{v}}\), \({{\text {S}}_1^{d-1}}\rightarrow {\text {SO}}(d)\), such that \(R_{\varvec{v}} \varvec{v}=\varvec{e}_1\) for all \(\varvec{v}\in {{\text {S}}_1^{d-1}}\), and such that \(\varvec{v}\mapsto R_{\varvec{v}}\) is smooth throughout \({{\text {S}}_1^{d-1}}\setminus \{\varvec{v}_0\}\). Note that since the proof below relies on Sard’s Theorem, it does not apply to arbitrary Lipschitz maps.

Theorem 11

Let \({\mathcal U}\) be an open subset of \(\mathbb {R}^m\) (\(m\ge 1\)), let \(\lambda \) be a Borel probability measure on \({\mathcal U}\) which is absolutely continuous with respect to Lebesgue measure, and let \(\varvec{f}:{\mathcal U}\rightarrow {\mathbb R}^d\) be a smooth map. Assume that \(\varvec{f}({\varvec{J}})\ne \mathbf {0}\) for all \({\varvec{J}}\in {\mathcal U}\) and \(\lambda \) is \(\varvec{f}\)-regular. Also let \(\varvec{\phi }:{\mathcal U}\rightarrow (\mathbb {R}^d)^k\) be a smooth map such that for every \(\varvec{m}=(m_1,\ldots ,m_k)\in \mathbb {Z}^k\setminus \{\mathbf {0}\}\),

$$\begin{aligned} \lambda \bigg (\bigg \{ {\varvec{J}}\in {\mathcal U}: \sum _{j=1}^k m_j \, \varvec{\phi }_j({\varvec{J}}) \in {\mathbb R}\varvec{f}({\varvec{J}}) + {\mathbb Q}^d \bigg \}\bigg ) = 0. \end{aligned}$$
(7.28)

Then for any \(h\in {\text {C}}_b(X\times {\mathcal U})\), writing \(\varvec{v}({\varvec{J}}):=\Vert \varvec{f}({\varvec{J}})\Vert ^{-1}\varvec{f}({\varvec{J}})\),

$$\begin{aligned} \lim _{\rho \rightarrow 0} \int _{{\mathcal U}}h\big (D(\rho ) R_{\varvec{v}({\varvec{J}})}\big (1_d,\varvec{\phi }({\varvec{J}}) \big ),\,{\varvec{J}}\bigr )\,d\lambda ({\varvec{J}}) =\int _{{\mathcal U}}\int _{X}h(p,{\varvec{J}})\,d\mu _X(p)\,d\lambda ({\varvec{J}}). \end{aligned}$$
(7.29)

Proof

Note that \(\varvec{v}\) is a smooth map from \({\mathcal U}\) to \({{\text {S}}_1^{d-1}}\), and the fact that \(\lambda \) is \(\varvec{f}\)-regular means exactly that \(\varvec{v}_*(\lambda )\) is absolutely continuous with respect to the Lebesgue measure on \({{\text {S}}_1^{d-1}}\). Hence \(m\ge d-1\), and by Sard’s Theorem the set of critical values of \(\varvec{v}\) has measure zero with respect to \(\varvec{v}_*(\lambda )\), and so the set of critical points of \(\varvec{v}\) has measure zero with respect to \(\lambda \). For each point \({\varvec{J}}\in {\mathcal U}\) which is not a critical point of \(\varvec{v}\), there exists a diffeomorphism \(\iota \) from the unit box \((0,1)^m\) onto an open neighborhood of \({\varvec{J}}\) in \({\mathcal U}\) such that \(\varvec{v}(\iota (\varvec{x}))\) depends only on \((x_1,\ldots ,x_{d-1})\), and this function gives a diffeomorphism of \((0,1)^{d-1}\) onto an open subset of \({{\text {S}}_1^{d-1}}\). Hence by decomposition and approximation of \(\lambda \), it suffices to prove Theorem 11 in the case when \(\lambda \) is supported in a fixed such coordinate neighborhood. Changing coordinates via the diffeomorphism \(\iota \), we may assume from now on that \({\mathcal U}=(0,1)^m\) and that \(\varvec{v}(\varvec{x})\) depends only on \((x_1,\ldots ,x_{d-1})\) and gives a diffeomorphism of \((0,1)^{d-1}\) onto an open subset of \({{\text {S}}_1^{d-1}}\).

Let us first assume \(m=d-1\). Then \(\varvec{v}\) is a diffeomorphism of \({\mathcal U}=(0,1)^{d-1}\) onto an open subset of \({{\text {S}}_1^{d-1}}\). Recall that \(\varvec{v}\mapsto R_{\varvec{v}}\) is smooth throughout \({{\text {S}}_1^{d-1}}\setminus \{\varvec{v}_0\}\). If \(\varvec{v}_0\) is in the image of \(\varvec{v}\), then we replace \({\mathcal U}\) by \({\mathcal U}\setminus \varvec{v}^{-1}(\varvec{v}_0)\). Now the map \(\varvec{x}\mapsto R_{\varvec{v}(\varvec{x})}\) is smooth throughout \({\mathcal U}\), and \(\varvec{x}\mapsto R_{\varvec{v}(\varvec{x})}^{-1}\varvec{e}_1=\varvec{v}(\varvec{x})\) has everywhere nonsingular differential. Now (7.29) follows from Corollary 10 applied with \(M=1_d\) and \(E_1(\varvec{x})=R_{\varvec{v}(\varvec{x})}\).

It remains to consider the case \(m>d-1\). We are assuming that \(\lambda \) is absolutely continuous; hence \(\lambda \) has a density \(\lambda '\in \mathrm{L}^1((0,1)^m,d\varvec{x})\). Now (7.28) says that

$$\begin{aligned} \int _{(0,1)^m}I\biggl (\sum _{j=1}^k m_j\varvec{\phi }_j(\varvec{x})\in \mathbb {R}\varvec{v}(\varvec{x})+\mathbb {Q}^d\biggr ) \,\lambda '(\varvec{x})\,d\varvec{x}=0. \end{aligned}$$

Decompose \(\varvec{x}\) as \((\varvec{x}_1,\varvec{x}_2)\in \mathbb {R}^{d-1}\times \mathbb {R}^{m-d+1}\), and recall that \(\varvec{v}(\varvec{x})\) only depends on \(\varvec{x}_1\), i.e. we may write \(\varvec{v}(\varvec{x})=\varvec{v}(\varvec{x}_1)\). It follows that for (Lebesgue) a.e. \(\varvec{x}_2\in (0,1)^{m-d+1}\),

$$\begin{aligned} \int _{(0,1)^{d-1}}I\biggl (\sum _{j=1}^k m_j\varvec{\phi }_j(\varvec{x}_1,\varvec{x}_2) \in \mathbb {R}\varvec{v}(\varvec{x}_1)+\mathbb {Q}^d\biggr ) \,\lambda '(\varvec{x}_1,\varvec{x}_2)\,d\varvec{x}_1=0. \end{aligned}$$

Furthermore \(\int _{(0,1)^m}\lambda '(\varvec{x}_1,\varvec{x}_2)\,d\varvec{x}_1\,d\varvec{x}_2=1\); hence for a.e. \(\varvec{x}_2\) we have \(\int _{(0,1)^{d-1}}\lambda '(\varvec{x}_1,\varvec{x}_2)\,d\varvec{x}_1<\infty \). For each fixed \(\varvec{x}_2\in (0,1)^{m-d+1}\) which satisfies both of the last two conditions, our result for the case \(m=d-1\) applies, showing that

$$\begin{aligned} \lim _{\rho \rightarrow 0}\int _{(0,1)^{d-1}}h_1\big (D(\rho ) R_{\varvec{v}(\varvec{x}_1)}(1_d,\varvec{\phi }(\varvec{x}_1,\varvec{x}_2)),(\varvec{x}_1,\varvec{x}_2)\big ) \,\lambda '(\varvec{x}_1,\varvec{x}_2)\,d\varvec{x}_1 \\ =\int _{(0,1)^{d-1}\times X} h_1\big (p,(\varvec{x}_1,\varvec{x}_2))\,\lambda '(\varvec{x}_1,\varvec{x}_2)\,d\varvec{x}_1\,d\mu _X(p). \end{aligned}$$

Now (7.29) follows by integrating the last relation over \(\varvec{x}_2\in (0,1)^{m-d+1}\), applying Lebesgue’s Dominated Convergence Theorem to interchange limit and integration. \(\square \)

8 Proof of Theorem 2

We now give the proof of Theorem 2. We will only discuss the proof of (6.13) in detail. The proof of (6.12) is entirely similar: one simply replaces \(\overline{\sigma }^{(k)}({\varvec{J}})\) by the constant \(\overline{\sigma }_\lambda ^{(k)}\) throughout the discussion; cf. Remark 8.1 below.

Recall that

$$\begin{aligned} \varvec{v}({\varvec{J}})=\frac{\varvec{f}({\varvec{J}})}{\Vert \varvec{f}({\varvec{J}})\Vert }\in {{\text {S}}_1^{d-1}}\qquad ({\varvec{J}}\in {\mathcal U}). \end{aligned}$$
(8.1)

We start by making some initial reductions. First, the assumptions of Theorem 2 imply that the open subset

$$\begin{aligned} \{{\varvec{J}}\in {\mathcal U}\, : \,\varvec{v}({\varvec{J}})\ne \varvec{v}_0,\,\varvec{u}_j({\varvec{J}})\ne \varvec{v}_0\,\forall j \} \end{aligned}$$
(8.2)

has full measure in \({\mathcal U}\) with respect to \(\lambda \), and so we may just as well replace \({\mathcal U}\) by that set. Hence from now on \(R_{\varvec{v}({\varvec{J}})}\) is a smooth function on all of \({\mathcal U}\), and the same holds for \(R_{\varvec{u}_j({\varvec{J}})}\) for each \(j\in \{1,\ldots ,k\}\). Next let us set, for \(\eta >0\),

$$\begin{aligned} {\mathcal U}_\eta :=\{{\varvec{J}}\in {\mathcal U}\, : \,\Vert \varvec{\phi }_j({\varvec{J}})- \varvec{\phi }_\ell ({\varvec{J}})\Vert >\eta \,\forall j\ne \ell \}, \end{aligned}$$
(8.3)

where \(\Vert \cdot \Vert \) denotes distance to the origin in \({\mathbb T}^d\) (viz., \(\Vert \varvec{x}\Vert =\inf _{\varvec{m}\in \mathbb {Z}^d}\Vert \widetilde{\varvec{x}}-\varvec{m}\Vert \) for any \(\varvec{x}\in {\mathbb T}^d\), where \(\widetilde{\varvec{x}}\) is any lift of \(\varvec{x}\) to \(\mathbb {R}^d\)). Note that the fact that \((\varvec{\phi }_1,\ldots ,\varvec{\phi }_k)\) is \((\varvec{\theta },\lambda )\)-generic implies that for any \(j\ne \ell \), \(\varvec{\phi }_j({\varvec{J}})\ne \varvec{\phi }_\ell ({\varvec{J}})\) holds for \(\lambda \)-a.e. \({\varvec{J}}\in {\mathcal U}\). Hence \(\lambda ({\mathcal U}_\eta )\rightarrow 1\) as \(\eta \rightarrow 0\), and thus by a standard approximation argument (cf., e.g., [27, Theorem 4.28]), it suffices to prove that for all sufficiently small \(\eta >0\), the convergence (6.13) holds when \({\mathcal U}\) is replaced by \({\mathcal U}_\eta \) and \(\lambda \) is replaced by \(\lambda ({\mathcal U}_\eta )^{-1}\lambda _{|{\mathcal U}_\eta }\). In other words, from now on we may assume that there exists a constant \(0<\eta <1\) such that \(\Vert \varvec{\phi }_j({\varvec{J}})-\varvec{\phi }_\ell ({\varvec{J}})\Vert >\eta \) for all \(j\ne \ell \) and \({\varvec{J}}\in {\mathcal U}\).

For any \(j\in \{1,\ldots ,k\}\), \(\rho >0\), \(T>0\), we introduce the following “cylinder” subset of \(\mathbb {R}^d\times {\mathcal U}\):

$$\begin{aligned}&A_{j,\rho ,T}:=\biggl \{\biggl (t\varvec{f}({\varvec{J}})-\rho R_{\varvec{u}_j({\varvec{J}})}^{-1}\left( \begin{matrix} 0 \\ \varvec{x} \end{matrix} \right) ,{\varvec{J}}\biggr )\,\bigg |\, (\varvec{x},{\varvec{J}})\in \Omega _j,\,0<t\le T\overline{\sigma }^{(k)}({\varvec{J}})\rho ^{1-d}\biggr \}. \end{aligned}$$
(8.4)

For any subset \(A\subset \mathbb {R}^d\times {\mathcal U}\) and \({\varvec{J}}\in {\mathcal U}\), we write \(A({\varvec{J}}):=\{\varvec{x}\in \mathbb {R}^d\, : \,(\varvec{x},{\varvec{J}})\in A\}\). Let us set

$$\begin{aligned} C:=\sup \bigl \{\Vert \varvec{x}\Vert \, : \,j\in \{1,\ldots ,k\},\,(\varvec{x},{\varvec{J}})\in \Omega _j\bigr \}; \end{aligned}$$
(8.5)

this is a finite positive real constant, since each \(\Omega _j\) is a non-empty bounded open set.

Lemma 12

For any \(0<\rho <\eta /(10C)\), \((\varvec{\theta },{\varvec{J}})\in {\mathbb T}^d\times {\mathcal U}\), \(n\in \mathbb {Z}^+\) and \(T>0\), the following equivalence holds:

$$\begin{aligned}&\frac{\rho ^{d-1}t_n(\varvec{\theta },{\varvec{J}},{\mathcal D}_\rho ^{(k)})}{\overline{\sigma }^{(k)}({\varvec{J}})}\le T \quad \Leftrightarrow \quad \sum _{j=1}^k\#\bigl (A_{j,\rho ,T}({\varvec{J}}) \cap (\varvec{\phi }_j({\varvec{J}})-\varvec{\theta }+\mathbb {Z}^d)\bigr )\ge n. \end{aligned}$$
(8.6)

(In (8.6), \(\varvec{\phi }_j({\varvec{J}})-\varvec{\theta }+\mathbb {Z}^d\) denotes a translate of the lattice \(\mathbb {Z}^d\), i.e. a subset of \(\mathbb {R}^d\). Note that this set is well-defined, i.e. independent of the choice of lifts of \(\varvec{\phi }_j({\varvec{J}})\) and \(\varvec{\theta }\) to \(\mathbb {R}^d\).)

Proof

Let \(\rho \), \((\varvec{\theta },{\varvec{J}})\), n and T be given as in the statement of the lemma. Note that the given restriction on \(\rho \) implies that each target set,

$$\begin{aligned} {\mathcal D}_\rho (\varvec{u}_j,\varvec{\phi }_j,\Omega _j)({\varvec{J}})=\biggl \{ \varvec{\phi }_j({\varvec{J}})+\rho R_{\varvec{u}_j({\varvec{J}})}^{-1}\left( \begin{matrix} 0 \\ \varvec{x} \end{matrix} \right) \,\bigg |\,\varvec{x}\in \Omega _j({\varvec{J}})\biggr \} \subset {\mathbb T}^d \end{aligned}$$
(8.7)

is contained within a ball of radius \(<\eta /10<1/10\), centered at \(\varvec{\phi }_j({\varvec{J}})\). In particular each target is injectively embedded in \({\mathbb T}^d\), and the targets for \(j=1,\ldots ,k\) are pairwise disjoint, since \(\Vert \varvec{\phi }_j({\varvec{J}})-\varvec{\phi }_\ell ({\varvec{J}})\Vert >\eta \) for all \(j\ne \ell \). Hence the left inequality in (8.6) holds if and only if

$$\begin{aligned} \sum _{j=1}^k\#\biggl \{t\in \bigl (0,T\overline{\sigma }^{(k)}({\varvec{J}}) \rho ^{1-d}\bigr ]\, : \,\varvec{\theta }+t\varvec{f}({\varvec{J}})\in {\mathcal D}_\rho (\varvec{u}_j, \varvec{\phi }_j,\Omega _j)({\varvec{J}})\biggr \}\ge n. \end{aligned}$$
(8.8)

Note that each set in the left hand side is a discrete set of t-values, since the target set \({\mathcal D}_\rho (\varvec{u}_j,\varvec{\phi }_j,\Omega _j)({\varvec{J}})\) is contained in a hyperplane orthogonal to \(\varvec{u}_j({\varvec{J}})\), and \(\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})>0\) by assumption. Lifting the situation from \({\mathbb T}^d\) to \(\mathbb {R}^d\) we now see, via (8.7) and (8.4), that for each j the corresponding term in the left hand side of (8.8) equals \(\#\bigl (A_{j,\rho ,T}({\varvec{J}})\cap (\varvec{\phi }_j({\varvec{J}})-\varvec{\theta }+\mathbb {Z}^d)\bigr )\). Hence the lemma follows. \(\square \)

Next we prove that the linear map \(D(\rho )R_{\varvec{v}({\varvec{J}})}\) takes the cylinder \(A_{j,\rho ,T}({\varvec{J}})\) into a cylinder which is approximately normalized, in an appropriate sense. Indeed, for any real numbers \(Y<Z\), define \(\widetilde{A}_{j,Y,Z}\subset \mathbb {R}^d\times {\mathcal U}\) through

$$\begin{aligned} \widetilde{A}_{j,Y,Z}:=\biggl \{\biggl (\left( \begin{matrix} t \\ -\widetilde{{\mathfrak R}}_j({\varvec{J}})\varvec{x} \end{matrix} \right) ,{\varvec{J}}\biggr )\,\bigg |\, (\varvec{x},{\varvec{J}})\in \Omega _j,\,\overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert Y<t\le \overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert Z\biggr \}, \end{aligned}$$
(8.9)

where \(\widetilde{{\mathfrak R}}_j({\varvec{J}})\) is as in Sect. 6. We then have the following lemma.

Lemma 13

Given \(\varepsilon >0\) and \(T>0\), there exists \(\rho _0>0\) such that for all \(\rho \in (0,\rho _0)\), \(j\in \{1,\ldots ,k\}\) and \({\varvec{J}}\in {\mathcal U}\),

$$\begin{aligned} \widetilde{A}_{j,\varepsilon ,T-\varepsilon }({\varvec{J}})\subset D(\rho )R_{\varvec{v}({\varvec{J}})}A_{j,\rho ,T}({\varvec{J}})\subset \widetilde{A}_{j,-\varepsilon ,T+\varepsilon }({\varvec{J}}). \end{aligned}$$

Proof

By direct computation,

$$\begin{aligned}&D(\rho )R_{\varvec{v}({\varvec{J}})}A_{j,\rho ,T}({\varvec{J}}) \\&\quad =\biggl \{t\varvec{e}_1-\rho D(\rho ){\mathfrak R}_j({\varvec{J}})\left( \begin{matrix} 0 \\ \varvec{x} \end{matrix} \right) \,\bigg |\, \varvec{x}\in \Omega _j({\varvec{J}}),\,0<t\le \overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert T\biggr \}. \end{aligned}$$

Using \(\rho D(\rho )={\text {diag}}(\rho ^d,1,\ldots ,1)\) and (8.5), it follows that for every \(\varvec{x}\in \Omega _j({\varvec{J}})\),

$$\begin{aligned} \rho D(\rho ){\mathfrak R}_j({\varvec{J}})\left( \begin{matrix} 0 \\ \varvec{x} \end{matrix} \right) =\left( \begin{matrix} r \\ \widetilde{{\mathfrak R}}_j({\varvec{J}})\varvec{x} \end{matrix} \right) \end{aligned}$$

where \(|r|\le C\rho ^d\). Note also that, by (6.8),

$$\begin{aligned} \bigl (\overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert \bigr )^{-1}= \sum _{j=1}^k{\text {Leb}}(\Omega _j({\varvec{J}}))\,\varvec{u}_j({\varvec{J}})\cdot \varvec{v}({\varvec{J}}) \end{aligned}$$

and this sum is bounded from above by a constant independent of \({\varvec{J}}\), since each set \(\Omega _j\) is bounded. The lemma follows from these observations. \(\square \)
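The scaling mechanism in this proof can be checked directly in coordinates. The following sketch (for \(d=2\), with an illustrative rotation angle and transverse coordinate) verifies that \(\rho D(\rho )={\text {diag}}(\rho ^d,1,\ldots ,1)\) and that the first component of the rotated transverse vector is of size \(O(\rho ^d)\), which is exactly why the rescaled cylinder squeezes onto the normalized one:

```python
import math

# Coordinate check of Lemma 13's mechanism for d = 2, where
# D(rho) = diag(rho^{d-1}, rho^{-1}) and hence rho*D(rho) = diag(rho^d, 1).
# The rotation angle alpha and transverse coordinate x are illustrative.
d = 2

def D(rho):
    return [[rho ** (d - 1), 0.0], [0.0, 1.0 / rho]]

def apply(mat, vec):
    return [sum(mat[i][j] * vec[j] for j in range(2)) for i in range(2)]

alpha, x = 0.7, 0.9
rot = [[math.cos(alpha), -math.sin(alpha)],
       [math.sin(alpha), math.cos(alpha)]]

for rho in (0.1, 0.01, 0.001):
    # rho * D(rho) = diag(rho^d, 1):
    assert math.isclose(rho * D(rho)[0][0], rho ** d)
    assert math.isclose(rho * D(rho)[1][1], 1.0)
    # Image of rho * Rot(alpha) * (0, x)^t under D(rho):
    first, second = apply(D(rho), [rho * w for w in apply(rot, [0.0, x])])
    assert abs(first) <= abs(x) * rho ** d + 1e-12    # O(rho^d), vanishes as rho -> 0
    assert math.isclose(second, x * math.cos(alpha))  # transverse part unchanged
```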

Let \(G_1={\text {SL}}(d,\mathbb {R})\ltimes \mathbb {R}^d\). This is the group “G for \(k=1\)”; in particular \(G_1\) acts on \(\mathbb {R}^d\) (cf. (7.2)). For \(g=(M,(\varvec{\xi }_1,\ldots ,\varvec{\xi }_k))\in G\) and \(j\in \{1,\ldots ,k\}\) we write \(g^{[j]}:=(M,\varvec{\xi }_j)\in G_1\). We also introduce the short-hand notation \(\overline{N}:=\{1,\ldots ,N\}\). Given real numbers \(Y_n<Z_n\) for \(n\in \overline{N}\), we define \(B[(Y_n),(Z_n)]\) to be the following subset of \(X\times {\mathcal U}\):

$$\begin{aligned}&B[(Y_n),(Z_n)]:=\biggl \{(g\Gamma ,{\varvec{J}})\in X\times {\mathcal U}\, : \,\sum _{j=1}^k\#\bigl (\widetilde{A}_{j,Y_n,Z_n}({\varvec{J}})\cap g^{[j]}(\mathbb {Z}^d)\bigr )\ge n\;\;\forall n\in \overline{N}\biggr \}. \end{aligned}$$
(8.10)

In the following the Lebesgue measure in various dimensions will appear within the same discussion; for clarity we will therefore write \({\text {Leb}}_m\) for the Lebesgue measure in \(\mathbb {R}^m\).

The following is a “trivial” variant of Siegel’s mean value theorem [41]:

Lemma 14

For any \(j\in \{1,\ldots ,k\}\) and \(f\in \mathrm{L}^1(\mathbb {R}^d)\),

$$\begin{aligned} \int _X\sum _{\varvec{m}\in \mathbb {Z}^d}f(g^{[j]}(\varvec{m}))\,d\mu _X(g) =\int _{\mathbb {R}^d}f(\varvec{x})\,d\varvec{x}. \end{aligned}$$
(8.11)

In particular for any Lebesgue measurable subset \(A\subset \mathbb {R}^d\),

$$\begin{aligned} \mu _X(\{g\Gamma \in X\, : \,g^{[j]}(\mathbb {Z}^d)\cap A\ne \emptyset \})\le {\text {Leb}}_d(A). \end{aligned}$$
(8.12)

Proof

(Cf., e.g., [46, proof of Lemma 10].) In the left hand side of (8.11) we write \(g=(M,(\varvec{\xi }_1,\ldots ,\varvec{\xi }_k))\), integrate out all variables \(\varvec{\xi }_\ell \), \(\ell \ne j\), and then substitute \(\varvec{\xi }_j=M\varvec{\eta }\); this gives

$$\begin{aligned} \int _X\sum _{\varvec{m}\in \mathbb {Z}^d}f(g^{[j]}(\varvec{m}))\,d\mu _X(g)= \int _F\int _{[0,1]^d}\sum _{\varvec{m}\in \mathbb {Z}^d}f(M(\varvec{m}+\varvec{\eta }))\,d\varvec{\eta }\,d\mu (M), \end{aligned}$$
(8.13)

where \(F\subset {\text {SL}}(d,\mathbb {R})\) is a fundamental domain for \({\text {SL}}(d,\mathbb {R})/{\text {SL}}(d,\mathbb {Z})\) and \(\mu \) is Haar measure on \({\text {SL}}(d,\mathbb {R})\) normalized so that \(\mu (F)=1\). Now (8.11) follows since the inner integral in (8.13) equals \(\int _{\mathbb {R}^d}f(\varvec{x})\,d\varvec{x}\) for every M. The last statement of the lemma then follows by noticing that the left hand side of (8.12) is bounded above by the left hand side of (8.11) with f equal to the characteristic function of A. \(\square \)

Lemma 15

The number \((\mu _X\times \lambda )\bigl (B[(Y_n),(Z_n)]\bigr )\) depends continuously on \(((Y_n),(Z_n))\).

(Here we keep \(((Y_n),(Z_n))\in \mathbb {R}^N\times \mathbb {R}^N\) subject to \(Y_n<Z_n\) for all \(n\in \overline{N}\), as before.)

Proof

Let \({\mathfrak D}({\varvec{J}})\in {\text {SL}}(d,\mathbb {R})\) be the diagonal matrix

$$\begin{aligned} {\mathfrak D}({\varvec{J}})={\text {diag}}\Bigl [\bigl (\overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}}) \Vert \bigr )^{-1}, \bigl (\overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert \bigr )^{1/(d-1)},\ldots , \bigl (\overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert \bigr )^{1/(d-1)} \Bigr ]. \end{aligned}$$

Using the fact that \(\mu _X\) is G-invariant (thus invariant under \(g\Gamma \mapsto {\mathfrak D}({\varvec{J}})g\Gamma \)) we see that

$$\begin{aligned} (\mu _X\times \lambda )\bigl (B[(Y_n),(Z_n)]\bigr )=(\mu _X\times \lambda ) \bigl (B'[(Y_n),(Z_n)]\bigr ), \end{aligned}$$
(8.14)

where \(B'[(Y_n),(Z_n)]\) is the set obtained by replacing \(\widetilde{A}_{j,Y,Z}({\varvec{J}})\) by \(\widetilde{A}_{j,Y,Z}'({\varvec{J}}):={\mathfrak D}({\varvec{J}})\widetilde{A}_{j,Y,Z}({\varvec{J}})\) in the definition (8.10). Hence it now suffices to prove that \((\mu _X\times \lambda )\bigl (B'[(Y_n),(Z_n)]\bigr )\) depends continuously on \(((Y_n),(Z_n))\). Note also that

$$\begin{aligned} \widetilde{A}'_{j,Y,Z}({\varvec{J}})=\biggl \{\left( \begin{matrix} t \\ -\varvec{x} \end{matrix} \right) \,\bigg |\, \varvec{x}\in \widetilde{\Omega }_j({\varvec{J}}),\,Y<t\le Z\biggr \}, \end{aligned}$$
(8.15)

where \(\widetilde{\Omega }_j({\varvec{J}})\) is as in (6.18).

To prove the continuity, consider any real numbers \(Y_n,Z_n,Y_n',Z_n'\) for \(n\in \overline{N}\), subject to \(Y_n<Z_n\) and \(Y_n'<Z_n'\). Writing \(\triangle \) for symmetric set difference, we have

$$\begin{aligned}&B'[(Y_n),(Z_n)]\,\triangle \, B'[(Y'_n),(Z'_n)] \\&\subset \bigcup _{n\in \overline{N}}\bigcup _{j=1}^k\biggl \{(g\Gamma ,{\varvec{J}})\in X\times {\mathcal U}\, : \,\bigl (\widetilde{A}'_{j,Y_n,Z_n}({\varvec{J}})\,\triangle \,\widetilde{A}'_{j,Y'_n,Z'_n}({\varvec{J}})\bigr )\cap g^{[j]}(\mathbb {Z}^d)\ne \emptyset \biggr \}, \end{aligned}$$

and hence by (8.12) and (8.15),

$$\begin{aligned}&(\mu _X\times \lambda )\bigl (B'[(Y_n),(Z_n)]\,\triangle \, B'[(Y'_n),(Z'_n)]\bigr )\\&\quad \le \sum _{n\in \overline{N}}\sum _{j=1}^k\int _{{\mathcal U}} {\text {Leb}}_d\bigl (\widetilde{A}_{j,Y_n,Z_n}'({\varvec{J}})\,\triangle \,\widetilde{A}'_{j,Y'_n,Z'_n}({\varvec{J}})\bigr ) \,d\lambda ({\varvec{J}}) \\&\quad \le \sum _{n\in \overline{N}}\sum _{j=1}^k\int _{{\mathcal U}}{\text {Leb}}_1\Bigl ((Y_n,Z_n]\, \triangle \,(Y_n',Z_n']\Bigr ) {\text {Leb}}_{d-1}(\widetilde{\Omega }_j({\varvec{J}}))\,d\lambda ({\varvec{J}}). \end{aligned}$$

However it follows from (6.16) and (6.18) that

$$\begin{aligned} {\text {Leb}}_{d-1}(\widetilde{\Omega }_j({\varvec{J}}))=\overline{\sigma }^{(k)}({\varvec{J}}){\text {Leb}}_{d-1} (\Omega _j({\varvec{J}}))\,\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}}), \end{aligned}$$

and using also (6.8) it follows that

$$\begin{aligned} \sum _{j=1}^k{\text {Leb}}_{d-1}(\widetilde{\Omega }_j({\varvec{J}}))=1 \end{aligned}$$
(8.16)

for all \({\varvec{J}}\in {\mathcal U}\). Hence we conclude

$$\begin{aligned}&\Bigl |(\mu _X\times \lambda )\bigl (B'[(Y_n),(Z_n)]\bigr )-(\mu _X\times \lambda ) \bigl (B'[(Y'_n),(Z'_n)]\bigr )\Bigr | \le \sum _{n\in \overline{N}}\bigl |Y_n-Y_n'\bigr |+\sum _{n\in \overline{N}}\bigl |Z_n-Z_n'\bigr |. \end{aligned}$$

This proves the desired continuity. \(\square \)
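The final estimate rests on the elementary bound \({\text {Leb}}_1\bigl ((Y,Z]\,\triangle \,(Y',Z']\bigr )\le |Y-Y'|+|Z-Z'|\) combined with (8.16). A quick randomized sketch of this interval estimate (bounds and sample count are illustrative):

```python
import random

# Check of the interval estimate Leb_1((Y, Z] symdiff (Y', Z']) <= |Y-Y'| + |Z-Z'|
# used in the continuity bound above.
random.seed(2)

def symdiff_length(y, z, yp, zp):
    """Length of (y, z] symdiff (yp, zp] = len + len' - 2*len(overlap)."""
    overlap = max(0.0, min(z, zp) - max(y, yp))
    return (z - y) + (zp - yp) - 2.0 * overlap

for _ in range(10_000):
    y, yp = random.uniform(-5, 5), random.uniform(-5, 5)
    z, zp = y + random.uniform(0, 5), yp + random.uniform(0, 5)
    assert symdiff_length(y, z, yp, zp) <= abs(y - yp) + abs(z - zp) + 1e-9
```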

We wish to prove that the limit relation (7.29) in Theorem 11 holds with h equal to the characteristic function of \(B=B[(Y_n),(Z_n)]\). For this we need to prove that the boundary, \(\partial B\), has measure zero with respect to \(\mu _X\times \lambda \). Here by \(\partial B\) we denote the boundary of B in \(X\times {\mathcal U}\), and similarly \(\partial \widetilde{A}_{j,Y_n,Z_n}\) denotes the boundary of \(\widetilde{A}_{j,Y_n,Z_n}\) in \(\mathbb {R}^d\times {\mathcal U}\). (The alternative would have been to consider the boundaries in \(X\times \mathbb {R}^m\) and \(\mathbb {R}^d\times \mathbb {R}^m\), respectively.)

Lemma 16

For any \(B=B[(Y_n),(Z_n)]\), if \((g\Gamma ,{\varvec{J}})\in \partial B\) then \(g^{[j]}(\mathbb {Z}^d)\cap (\partial \widetilde{A}_{j,Y_n,Z_n})({\varvec{J}})\ne \emptyset \) for some \(j\in \{1,\ldots ,k\}\) and \(n\in \overline{N}\).

Proof

Assume \((g\Gamma ,{\varvec{J}})\in \partial B\). Then there exist sequences \(\{(g_m\Gamma ,{\varvec{J}}_m)\}\) and \(\{(\widetilde{g}_m\Gamma ,\widetilde{{\varvec{J}}}_m)\}\) in \(X\times {\mathcal U}\) such that both \((g_m\Gamma ,{\varvec{J}}_m)\rightarrow (g\Gamma ,{\varvec{J}})\) and \((\widetilde{g}_m\Gamma ,\widetilde{{\varvec{J}}}_m)\rightarrow (g\Gamma ,{\varvec{J}})\) as \(m\rightarrow \infty \), and \((g_m\Gamma ,{\varvec{J}}_m)\in B\) and \((\widetilde{g}_m\Gamma ,\widetilde{{\varvec{J}}}_m)\notin B\) for all m. In particular for each m there is some \(n\in \overline{N}\) such that

$$\begin{aligned} \sum _{j=1}^k\#\bigl (\widetilde{A}_{j,Y_n,Z_n}(\widetilde{{\varvec{J}}}_m)\cap \widetilde{g}_m^{[j]}(\mathbb {Z}^d)\bigr )<n. \end{aligned}$$
(8.17)

By passing to an appropriate subsequence, we may in fact assume that n is fixed in (8.17), i.e. n does not depend on m. On the other hand \((g_m\Gamma ,{\varvec{J}}_m)\in B\) for each m, and thus

$$\begin{aligned} \sum _{j=1}^k\#\bigl (\widetilde{A}_{j,Y_n,Z_n}({\varvec{J}}_m)\cap g_m^{[j]}(\mathbb {Z}^d)\bigr )\ge n. \end{aligned}$$
(8.18)

Hence for each m there is some \(j\in \{1,\ldots ,k\}\) such that

$$\begin{aligned} \#\bigl (\widetilde{A}_{j,Y_n,Z_n}(\widetilde{{\varvec{J}}}_m)\cap \widetilde{g}_m^{[j]}(\mathbb {Z}^d)\bigr ) <\#\bigl (\widetilde{A}_{j,Y_n,Z_n}({\varvec{J}}_m)\cap g_m^{[j]}(\mathbb {Z}^d)\bigr ). \end{aligned}$$
(8.19)

By again passing to a subsequence we may assume that also j is independent of m. We have \(g_m\Gamma \rightarrow g\Gamma \) as \(m\rightarrow \infty \), and by choosing the \(g_m\)’s appropriately we may even assume \(g_m\rightarrow g\); similarly we may assume \(\widetilde{g}_m\rightarrow g\). Using now \(g_m\rightarrow g\) and \({\varvec{J}}_m\rightarrow {\varvec{J}}\) together with the fact that \(\Omega _j\) is bounded, it follows that there exists a compact set \(C\subset \mathbb {R}^d\) such that \((g_m^{[j]})^{-1}\widetilde{A}_{j,Y_n,Z_n}({\varvec{J}}_m)\subset C\) for all m, and in particular the cardinality of \(\widetilde{A}_{j,Y_n,Z_n}({\varvec{J}}_m)\cap g_m^{[j]}(\mathbb {Z}^d)\) remains the same if we replace \(\mathbb {Z}^d\) with the finite set \(C\cap \mathbb {Z}^d\). Now (8.19) implies that for each m there is some \(\varvec{q}\in C\cap \mathbb {Z}^d\) such that \(\widetilde{g}_m^{[j]}(\varvec{q})\notin \widetilde{A}_{j,Y_n,Z_n}(\widetilde{{\varvec{J}}}_m)\) but \(g_m^{[j]}(\varvec{q})\in \widetilde{A}_{j,Y_n,Z_n}({\varvec{J}}_m)\); and since \(C\cap \mathbb {Z}^d\) is finite we may assume, after passing to a subsequence, that \(\varvec{q}\) is independent of m. Taking now \(m\rightarrow \infty \) it follows that \((g^{[j]}(\varvec{q}),{\varvec{J}})\in \partial \widetilde{A}_{j,Y_n,Z_n}\), and the lemma is proved. \(\square \)

Lemma 17

Every set \(B=B[(Y_n),(Z_n)]\) satisfies \((\mu _X\times \lambda )(\partial B)=0\).

Proof

In view of Lemma 16 and (8.12) in Lemma 14, it suffices to prove that for every \(j\in \{1,\ldots ,k\}\) and \(n\in \overline{N}\), \(\partial \widetilde{A}_{j,Y_n,Z_n}\) has measure zero with respect to \({\text {Leb}}_d\times \lambda \). Recalling (8.9) we see that, for any \(Y<Z\),

$$\begin{aligned} \partial \widetilde{A}_{j,Y,Z}= \biggl \{\biggl (\left( \begin{matrix} t \\ -\widetilde{{\mathfrak R}}_j({\varvec{J}})\varvec{x} \end{matrix} \right) ,{\varvec{J}}\biggr )\,\bigg |\, (\varvec{x},{\varvec{J}})\in \partial \Omega _j,\, \overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert \,Y\le t\le \overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert \,Z\biggr \} \\ \bigcup \,\biggl \{\biggl (\left( \begin{matrix} t \\ -\widetilde{{\mathfrak R}}_j({\varvec{J}})\varvec{x} \end{matrix} \right) ,{\varvec{J}}\biggr )\,\bigg |\, (\varvec{x},{\varvec{J}})\in \overline{\Omega _j},\, t\in \bigl \{\overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert \,Y,\overline{\sigma }^{(k)} ({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert \,Z\bigr \}\biggr \}. \end{aligned}$$
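To spell out the slice computation behind the Fubini step (a routine verification, included for convenience; here \(a({\varvec{J}}):=\overline{\sigma }^{(k)}({\varvec{J}})\Vert \varvec{f}({\varvec{J}})\Vert \)): since \(\partial \Omega _j\) has measure zero with respect to \({\text {Leb}}_{d-1}\times \lambda \), the slice \((\partial \Omega _j)({\varvec{J}})\subset \mathbb {R}^{d-1}\) is \({\text {Leb}}_{d-1}\)-null for \(\lambda \)-a.e. \({\varvec{J}}\), and hence, using that \(\varvec{x}\mapsto -\widetilde{{\mathfrak R}}_j({\varvec{J}})\varvec{x}\) preserves Lebesgue measure,

$$\begin{aligned} {\text {Leb}}_d\biggl (\biggl \{\left( \begin{matrix} t \\ -\widetilde{{\mathfrak R}}_j({\varvec{J}})\varvec{x} \end{matrix} \right) \, : \,\varvec{x}\in (\partial \Omega _j)({\varvec{J}}),\, a({\varvec{J}})Y\le t\le a({\varvec{J}})Z\biggr \}\biggr ) \le (Z-Y)\,a({\varvec{J}})\,{\text {Leb}}_{d-1}\bigl ((\partial \Omega _j)({\varvec{J}})\bigr )=0. \end{aligned}$$

The \({\varvec{J}}\)-slice of the second set is contained in the pair of hyperplanes \(t\in \{a({\varvec{J}})Y,a({\varvec{J}})Z\}\) and is likewise \({\text {Leb}}_d\)-null; integrating over \({\varvec{J}}\) gives \(({\text {Leb}}_d\times \lambda )(\partial \widetilde{A}_{j,Y,Z})=0\).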

Now the claim follows by Fubini’s Theorem, using the assumption from Theorem 2 that \(\partial \Omega _j\) has measure zero with respect to \({\text {Leb}}_{d-1}\times \lambda \). \(\square \)

We are now ready to complete the proof of Theorem 2.

Conclusion of the proof of Theorem 2

Let \(\widetilde{\varvec{\phi }}:{\mathcal U}\rightarrow (\mathbb {R}^d)^k\) be the map

$$\begin{aligned} {\varvec{J}}\mapsto \bigl (\varvec{\phi }_1({\varvec{J}})-\varvec{\theta }({\varvec{J}}),\ldots ,\varvec{\phi }_k({\varvec{J}})- \varvec{\theta }({\varvec{J}})\bigr ). \end{aligned}$$

Then Theorem 11 applies to our \({\mathcal U},\lambda ,\varvec{f}\) and \(\widetilde{\varvec{\phi }}\); in particular, the condition (7.28) holds for \(\widetilde{\varvec{\phi }}\) since we assume that \((\varvec{\phi }_1,\ldots ,\varvec{\phi }_k)\) is \((\varvec{\theta },\lambda )\)-generic. Now for any fixed set \(B=B[(Y_n),(Z_n)]\), since \((\mu _X\times \lambda )(\partial B)=0\) by Lemma 17, a standard approximation argument (cf., e.g., [27, Theorem 4.25]) shows that the conclusion of Theorem 11, (7.29), also applies to \(h=\mathbbm {1}_{B}\), the characteristic function of B. In other words,

$$\begin{aligned} \lim _{\rho \rightarrow 0}\lambda \bigl (\bigl \{{\varvec{J}}\in {\mathcal U}\, : \,\bigl (D(\rho )R_{\varvec{v}({\varvec{J}})}\bigl (1_d,\widetilde{\varvec{\phi }}({\varvec{J}}) \bigr ),\,{\varvec{J}}\bigr )\in B\bigr \}\bigr ) =(\mu _X\times \lambda )(B). \end{aligned}$$
(8.20)

Combining this with the definition of \(B=B[(Y_n),(Z_n)]\), (8.10), we conclude:

$$\begin{aligned}&\lim _{\rho \rightarrow 0}\lambda \biggl (\biggl \{{\varvec{J}}\, : \,\sum _{j=1}^k\#\bigl (\widetilde{A}_{j,Y_n,Z_n}({\varvec{J}})\cap D(\rho )R_{\varvec{v}({\varvec{J}})}\bigl (\varvec{\phi }_j({\varvec{J}})- \varvec{\theta }({\varvec{J}})+\mathbb {Z}^d\bigr )\bigr )\ge n \forall n\in \overline{N}\biggr \}\biggr ) \nonumber \\&=(\mu _X\times \lambda )(B). \end{aligned}$$
(8.21)

Now let positive real numbers \(T_1,\ldots ,T_N\) be given, and consider a number \(\varepsilon \) subject to \(0<\varepsilon <\frac{1}{2}\min (T_1,\ldots ,T_N)\). Applying (8.21) with \(Y_n=\varepsilon \) and \(Z_n=T_n-\varepsilon \) we get, via Lemma 13:

$$\begin{aligned}&\liminf _{\rho \rightarrow 0}\lambda \biggl (\biggl \{{\varvec{J}}\, : \,\sum _{j=1}^k\#\bigl (A_{j,\rho ,T_n}({\varvec{J}})\cap \bigl (\varvec{\phi }_j({\varvec{J}})-\varvec{\theta }({\varvec{J}})+\mathbb {Z}^d\bigr )\bigr )\ge n \forall n\in \overline{N}\biggr \}\biggr ) \nonumber \\&\ge (\mu _X\times \lambda )\bigl (B[(\varepsilon )_{n=1}^N, (T_n-\varepsilon )_{n=1}^N]\bigr ). \end{aligned}$$
(8.22)

Similarly if we take \(Y_n=-\varepsilon \) and \(Z_n=T_n+\varepsilon \) then we get

$$\begin{aligned}&\limsup _{\rho \rightarrow 0}\lambda \biggl (\biggl \{{\varvec{J}}\, : \,\sum _{j=1}^k\#\bigl (A_{j,\rho ,T_n}({\varvec{J}})\cap \bigl (\varvec{\phi }_j({\varvec{J}})-\varvec{\theta }({\varvec{J}})+\mathbb {Z}^d\bigr )\bigr )\ge n \forall n\in \overline{N}\biggr \}\biggr ) \nonumber \\&\le (\mu _X\times \lambda )\bigl (B[(-\varepsilon )_{n=1}^N, (T_n+\varepsilon )_{n=1}^N]\bigr ). \end{aligned}$$
(8.23)

These relations hold for all sufficiently small \(\varepsilon >0\); letting \(\varepsilon \rightarrow 0\) and using the continuity provided by Lemma 15, while also rewriting the left hand side by means of Lemma 12, we obtain:

$$\begin{aligned} \lim _{\rho \rightarrow 0}\lambda \biggl (\biggl \{{\varvec{J}}\, : \,\frac{\rho ^{d-1}t_n(\varvec{\theta },{\varvec{J}},{\mathcal D}_\rho ^{(k)})}{\overline{\sigma }^{(k)}({\varvec{J}})}\le T_n\forall n\in \overline{N}\biggr \}\biggr ) =(\mu _X\times \lambda )\bigl (B[(0)_{n=1}^N,(T_n)_{n=1}^N]\bigr ). \end{aligned}$$
(8.24)

The fact that (8.24) holds for any \(T_1,\ldots ,T_N>0\) implies that (6.13) in Theorem 2 holds. \(\square \)

Remark 8.1

As mentioned, the proof of (6.12) in Theorem 2 is entirely analogous; in principle one only has to replace \(\overline{\sigma }^{(k)}({\varvec{J}})\) with the constant \(\overline{\sigma }_\lambda ^{(k)}\) throughout the discussion. However, a couple of extra technicalities appear. First of all, it may happen that \(\overline{\sigma }_\lambda ^{(k)}=\infty \); in this case, however, (6.12) is trivial, with \(\tau _i=0\) for all i. Hence from now on we assume \(0<\overline{\sigma }_\lambda ^{(k)}<\infty \). Secondly, the last steps of the proofs of Lemmata 13 and 15 do not carry over verbatim. One way to handle those steps is to assume from the start that \(0<\eta<\Vert \varvec{f}({\varvec{J}})\Vert <\eta ^{-1}\) for all \({\varvec{J}}\in {\mathcal U}\); this is permissible by the argument given below (8.3), but with \({\mathcal U}_\eta \) replaced by

$$\begin{aligned} {\mathcal U}_\eta :=\{{\varvec{J}}\in {\mathcal U}\, : \,\Vert \varvec{\phi }_j({\varvec{J}})-\varvec{\phi }_\ell ({\varvec{J}}) \Vert >\eta \,\forall j\ne \ell \,\text { and }\, \eta<\Vert \varvec{f}({\varvec{J}})\Vert <\eta ^{-1}\}. \end{aligned}$$
(8.25)

With this assumption, we have \(\bigl (\overline{\sigma }_\lambda ^{(k)}\Vert \varvec{f}({\varvec{J}})\Vert \bigr )^{-1} <\bigl (\overline{\sigma }_\lambda ^{(k)}\eta \bigr )^{-1}\) for all \({\varvec{J}}\in {\mathcal U}\), and using this the proof of Lemma 13 extends to the present situation. Furthermore, by (6.17) and (6.16),

$$\begin{aligned} {\text {Leb}}_{d-1}\bigl (\overline{\Omega }_j({\varvec{J}})\bigr )= \bigl (\overline{\sigma }^{(k)}_\lambda \,\varvec{u}_j({\varvec{J}})\cdot \varvec{f}({\varvec{J}})\bigr ) {\text {Leb}}_{d-1}\bigl (\Omega _j({\varvec{J}})\bigr ) <\overline{\sigma }^{(k)}_\lambda \eta ^{-1}{\text {Leb}}_{d-1}\bigl (\Omega _j({\varvec{J}})\bigr ), \end{aligned}$$

which is bounded from above by a constant independent of \({\varvec{J}}\), since the set \(\Omega _j\) is bounded. Using this fact, the proof of the continuity in Lemma 15 carries over to the present situation.

Concerning the distribution of the limit variables \(({\widetilde{\tau }}_1,\ldots ,{\widetilde{\tau }}_N)\), we see from the above proof of (6.13) that for any \(T_1,\ldots ,T_N>0\),

$$\begin{aligned}&{\mathbb P}\bigl ({\widetilde{\tau }}_n\le T_n\forall n\in \overline{N}\bigr ) =(\mu _X\times \lambda )\bigl (B[(0)_{n=1}^N,(T_n)_{n=1}^N]\bigr ). \end{aligned}$$
(8.26)

Combining this with (8.14) and (8.15), we get

$$\begin{aligned}&{\mathbb P}\bigl ({\widetilde{\tau }}_n\le T_n\forall n\in \overline{N}\bigr ) \nonumber \\&\quad =(\mu _X\times \lambda )\biggl (\biggl \{(g\Gamma ,{\varvec{J}})\, : \,\sum _{j=1}^k\#\biggl \{\left( \begin{matrix} t \\ \varvec{x} \end{matrix} \right) \in g^{[j]}(\mathbb {Z}^d)\, : \,0<t\le T_n,\, \varvec{x}\in -\widetilde{\Omega }_j({\varvec{J}})\biggr \}\ge n \nonumber \\&\qquad \quad \forall n\in \overline{N}\biggr \}\biggr ). \end{aligned}$$
(8.27)

Hence the limit variables \(({\widetilde{\tau }}_i)_{i=1}^\infty \) may be described as follows. Recall (6.14). Let \({\varvec{J}}\) be a random point in \({\mathcal U}\) distributed according to \(\lambda \), and let \(g\Gamma \) be a random point in X distributed according to \(\mu _X\), independent of \({\varvec{J}}\). Then \(({\widetilde{\tau }}_i)_{i=1}^\infty \) can be taken to be the elements of the random set

$$\begin{aligned} \bigcup _{j=1}^k{\mathcal P}(g^{[j]}(\mathbb {Z}^d),\widetilde{\Omega }_j({\varvec{J}})), \end{aligned}$$
(8.28)

ordered by size. Similarly, \((\tau _i)_{i=1}^\infty \) can be taken to be the elements of the random set

$$\begin{aligned} \bigcup _{j=1}^k{\mathcal P}(g^{[j]}(\mathbb {Z}^d),\overline{\Omega }_j({\varvec{J}})), \end{aligned}$$
(8.29)

ordered by size. This description clearly agrees with the one in (6.19) and (6.20). Let us also note that it follows from (8.26) and Lemma 15, and the \(\overline{\sigma }^{(k)}_\lambda \)-analogues of these, that the distribution functions \({\mathbb P}\bigl (\tau _n\le T_n\forall n\in \overline{N}\bigr )\) and \({\mathbb P}\bigl ({\widetilde{\tau }}_n\le T_n\forall n\in \overline{N}\bigr )\) depend continuously on \((T_n)\in \mathbb {R}_{>0}^N\), as stated in Sect. 6.
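The description via (8.28) and (8.29) lends itself to a simple numerical illustration. The following sketch (illustrative only: the shear matrix g, the radius and the truncation box are hypothetical stand-ins, and sampling \(g\Gamma \) from \(\mu _X\) is a separate matter not attempted here) computes the ordered elements of the set \(\{t>0\,:\,(t,\varvec{x})\in g(\mathbb {Z}^2),\ \varvec{x}\in -\Omega \}\), as suggested by (8.27), for a toy unimodular lattice in dimension d = 2:

```python
# Illustrative sketch only: ordered "hitting times"
# { t > 0 : (t, x) in g(Z^d), x in -Omega }, cf. (8.27), for a toy
# lattice g(Z^2) and Omega a symmetric interval (so x in -Omega
# is equivalent to |x| < radius). The matrix g, radius and box
# are hypothetical choices for illustration.
import numpy as np

def hitting_times(g, radius, T, box=60):
    """Ordered elements t <= T of the set above, where Omega is the
    (d-1)-dimensional ball of the given radius about the origin."""
    d = g.shape[0]
    axes = [np.arange(-box, box + 1)] * d
    Q = np.stack(np.meshgrid(*axes), axis=-1).reshape(-1, d)
    pts = Q @ g.T                        # lattice points (t, x)
    t, x = pts[:, 0], pts[:, 1:]
    mask = (t > 0) & (t <= T) & (np.linalg.norm(x, axis=1) < radius)
    return np.sort(t[mask])

# toy unimodular lattice (det g = 1)
g = np.array([[1.0, 0.3],
              [0.7, 1.21]])
taus = hitting_times(g, radius=0.45, T=20.0)
# taus lists, in increasing order, the simulated analogue of an
# initial segment of the random set (8.28)/(8.29)
```

The truncation box must be large enough that every lattice point with \(0<t\le T\) and \(\Vert \varvec{x}\Vert <\) radius is enumerated; for the toy parameters above a box of side 121 amply suffices.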