1 Introduction

The design of efficient algorithms to (approximately) compute evaluations of partition functions and graph polynomials, such as the matching polynomial, the independence polynomial, the number of proper colorings and more generally the partition function of the Potts model, is an active area of research. There are two main approaches to obtain deterministic algorithms for this task. One is based on a notion of decay of correlations, related to the absence of phase transitions in statistical physics, called strong spatial mixing. This method was pioneered by Weitz [39] and Bandyopadhyay and Gamarnik [2] and dates back about fifteen years. The other is the interpolation method of Barvinok [3] in combination with an algorithm of Patel and the author [33], which is based on absence of complex zeros of the partition function and relates to absence of phase transitions in the Lee–Yang [40] sense.

Let us for concreteness give an example to illustrate some of these notions.

Example 1

(The hard-core model) Let \(G=(V,E)\) be a graph and \(\lambda \in {\mathbb {C}}\). The independence polynomial of G evaluated at \(\lambda \) is given by

$$\begin{aligned} Z_G(\lambda )=\sum _{\begin{array}{c} I\subseteq V\\ I \text { independent} \end{array}}\lambda ^{|I|}. \end{aligned}$$
(1)

where a set \(I\subseteq V\) is called independent if it does not span any edge of G. In statistical physics \(Z_G(\lambda )\) is known as the partition function of the hard-core model and \(\lambda \) is called the fugacity. For positive \(\lambda \) there is a natural associated probability measure, \(\mu _{G,\lambda }\), on the collection of all independent sets of G, which is called the hard-core measure and is defined by

$$\begin{aligned} \mu _{G,\lambda }(I)=\frac{\lambda ^{|I|}}{Z_G(\lambda )}, \end{aligned}$$

for an independent set I. Often we just write \(\mu \) instead of \(\mu _{G,\lambda }\).
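To make the definitions in this example concrete, the following is a minimal brute-force sketch in Python. The function names and the edge-list representation of G are illustrative assumptions, not notation from the paper, and the enumeration is exponential in \(|V|\), so it is only meant for very small graphs.

```python
from fractions import Fraction
from itertools import combinations

def independent_sets(vertices, edges):
    """Yield every independent set of the graph (vertices, edges) as a frozenset."""
    vertices = list(vertices)
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            # A set is independent if no edge has both endpoints in it.
            if not any(u in chosen and v in chosen for u, v in edges):
                yield frozenset(chosen)

def partition_function(vertices, edges, lam):
    """Independence polynomial Z_G(lam), as in (1)."""
    return sum(lam ** len(I) for I in independent_sets(vertices, edges))

def hard_core_measure(vertices, edges, lam):
    """Map each independent set I to lam^{|I|} / Z_G(lam); needs lam > 0."""
    Z = partition_function(vertices, edges, lam)
    return {I: lam ** len(I) / Z for I in independent_sets(vertices, edges)}

# Path on three vertices: Z_G(lam) = 1 + 3*lam + lam^2.
path_vertices, path_edges = [0, 1, 2], [(0, 1), (1, 2)]
lam = Fraction(1)
Z = partition_function(path_vertices, path_edges, lam)
mu = hard_core_measure(path_vertices, path_edges, lam)
```

Exact rational arithmetic is used here only so that the values can be compared without rounding; floating-point fugacities would work equally well.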

For a positive integer \(\Delta \), let \({\mathcal {G}}_\Delta \) be the family of graphs of maximum degree at most \(\Delta \). If for any \(G\in {\mathcal {G}}_\Delta \) and any two vertices \(u,v\in V(G)\),

$$\begin{aligned} \left| \Pr _{\mu }[u,v\in I]-\Pr _{\mu }[u\in I]\Pr _{\mu }[v\in I]\right| < \delta (d_G(u,v)), \end{aligned}$$

where \(\delta :{\mathbb {N}}\rightarrow [0,\infty )\) is a function that goes to 0 as its input goes to infinity and \(d_G(u,v)\) denotes the graph distance between the vertices u and v in G, then we say that \(\mu _{G,\lambda }\) satisfies (point to point) decay of correlations with rate \(\delta \) on \({\mathcal {G}}_\Delta \).

Weitz [39] showed that for \(\lambda \in (0,\lambda _c)\), where \(\lambda _c=\frac{(\Delta -1)^{\Delta -1}}{(\Delta -2)^\Delta }\), \(\mu _{G,\lambda }\) satisfies a stronger form of decay of correlations called strong spatial mixing, which we will formally define below, and used this to devise a deterministic polynomial time approximation algorithm for computing \(Z_G(\lambda )\) for \(G\in {\mathcal {G}}_\Delta \) and \(\lambda \in (0,\lambda _c)\).
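As a small numerical illustration of the threshold \(\lambda _c\), the sketch below evaluates the formula exactly for a few values of \(\Delta \). The function name is a hypothetical helper introduced here; note that the formula only makes sense for \(\Delta \ge 3\).

```python
from fractions import Fraction

def critical_fugacity(Delta):
    """lambda_c = (Delta-1)^(Delta-1) / (Delta-2)^Delta, defined for Delta >= 3."""
    if Delta < 3:
        raise ValueError("the formula needs Delta >= 3")
    return Fraction((Delta - 1) ** (Delta - 1), (Delta - 2) ** Delta)

# Exact thresholds for Delta = 3, 4, 5, 6.
thresholds = {Delta: critical_fugacity(Delta) for Delta in range(3, 7)}
```

For these values the threshold shrinks as the degree bound grows, so the interval \((0,\lambda _c)\) covered by the algorithm becomes smaller.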

Peters and the author [34] showed that there exists an open set \(U\subset {\mathbb {C}}\) containing the interval \((0,\lambda _c)\) such that for all \(\lambda \in U\) and \(G\in {\mathcal {G}}_\Delta \), \(Z_G(\lambda )\ne 0\). Combined with Barvinok’s interpolation method [3, 33] this also yields a deterministic polynomial time approximation algorithm for computing \(Z_G(\lambda )\) for \(G\in {\mathcal {G}}_\Delta \) and \(\lambda \in (0,\lambda _c)\).

Although the two approaches are quite different, they have, surprisingly, given comparable results in many situations: not just for the independence polynomial as mentioned in the example above, but also for the matching polynomial [6, 33], the edge cover polynomial [8, 25, 26] and the graph homomorphism partition function [3, 5, 29]. This raises the question of how these two approaches are related.

Many of the above mentioned polynomials and partition functions originate in statistical physics, where they are typically studied on structured subgraphs of lattices such as \({\mathbb {Z}}^d\). Dobrushin and Shlossmann [16, 17] came up with an extensive list of equivalent characterizations of what they call completely analytical interactions, in particular showing that absence of zeros and (some forms of) decay of correlations are equivalent for models like the hard-core model and many others. Their proof depends strongly on the fact that balls of radius r in the graph \({\mathbb {Z}}^d\) (for a fixed d) grow only polynomially with r. This is of course not true in general bounded degree graphs. So the work of Dobrushin and Shlossmann only gives a suggestion of what could be true for other families of graphs.

Understanding the connection between strong spatial mixing and absence of zeros on families of graphs like \({\mathcal {G}}_\Delta \) has recently started to receive attention [20, 27, 28, 36]. In particular in [28, 36] it is shown that a standard method for proving strong spatial mixing can be used to prove absence of zeros for partition functions of several models. Very recently, Gamarnik [20] showed that absence of zeros of the partition function implies a weaker form of strong spatial mixing for the hard-core model and certain graph homomorphism models, but his result does not apply to all bounded degree graphs.

In the present paper we will show that absence of zeros of the partition function does indeed imply strong spatial mixing for the hard-core model and certain graph homomorphism models for all bounded degree graphs, confirming in a strong form a variant of a conjecture of Gamarnik [20].

Below we shall give formal definitions of the notion of strong spatial mixing that we use and state our main results. For concreteness we will limit ourselves to two types of models: the hard-core model and graph homomorphisms. We shall later indicate how our approach can be used for other models as well.

1.1 The hard-core model

We continue our discussion of Example 1. To introduce the notion of strong spatial mixing we need to consider boundary conditions. For a graph \(G=(V,E)\) and \(\Lambda \subset V\) we call \(\sigma :\Lambda \rightarrow \{0,1\}\) a boundary condition if \(\sigma ^{-1}(1)\) is an independent set in the graph induced by \(\Lambda \). We denote by \(\Pr _\mu [v\in I\mid \sigma ]\) the probability that the vertex v is in the random independent set I drawn according to the hard-core measure conditioned on \(\sigma \), meaning that we condition on the event that \(\sigma ^{-1}(1)\subset I\) and \(\sigma ^{-1}(0)\cap I=\emptyset \). For another boundary condition \(\tau \) on \(\Lambda \) and a vertex \(v\notin \Lambda \), we denote by \(d_G(v,\sigma \ne \tau )\) the graph distance from v to the nearest vertex in \(\Lambda \) at which \(\sigma \) and \(\tau \) differ.

Let \({\mathcal {G}}\) be an infinite family of graphs and let \(\lambda >0\). The hard-core measure at \(\lambda \) satisfies strong spatial mixing on \({\mathcal {G}}\) with exponential rate \(r>1\) if there exists a constant \(C>0\) such that for any graph \(G=(V,E)\in {\mathcal {G}}\), any vertex \(v\in V\), any \(\Lambda \subseteq V\setminus \{v\}\) and any two boundary conditions \(\sigma \) and \(\tau \) on \(\Lambda \),

$$\begin{aligned} \left| \Pr _\mu [v\in I\mid \sigma ]-\Pr _\mu [v\in I\mid \tau ]\right| \le Cr^{-d_G(v,\sigma \ne \tau )}. \end{aligned}$$
(2)

Note that strong spatial mixing implies point to point decay of correlations with an exponentially decaying rate.
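The inequality (2) can be observed numerically on a toy instance. The sketch below is an illustration under stated assumptions rather than part of the argument: it takes G to be a path, \(v\) one of its endpoints, \(\Lambda \) a single vertex at distance d from v, and compares the two boundary conditions 'in' and 'out' by brute-force enumeration. All function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def conditional_occupation(n, lam, v, sigma):
    """Pr[v in I | sigma] on the path 0-1-...-(n-1), by enumeration.

    sigma maps boundary vertices to 1 ('in') or 0 ('out').
    """
    edges = [(i, i + 1) for i in range(n - 1)]
    total = Fraction(0)
    with_v = Fraction(0)
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            I = set(subset)
            if any(a in I and b in I for a, b in edges):
                continue  # not independent
            if any((u in I) != bool(s) for u, s in sigma.items()):
                continue  # disagrees with the boundary condition
            weight = lam ** len(I)
            total += weight
            if v in I:
                with_v += weight
    return with_v / total

lam = Fraction(1)
n = 8
gaps = []
for d in range(2, 7):
    p_in = conditional_occupation(n, lam, 0, {d: 1})
    p_out = conditional_occupation(n, lam, 0, {d: 0})
    gaps.append(abs(p_in - p_out))
```

On this instance the gap between the two conditional probabilities shrinks as the boundary vertex moves away from v, which is the behaviour that (2) quantifies.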

For a set \(S\subset {\mathbb {C}}\) and \(\varepsilon >0\) we denote \({\mathcal {N}}(S,\varepsilon )=\{z\in {\mathbb {C}}\mid d(z,S)\le \varepsilon \}\), where d denotes the Euclidean metric on \({\mathbb {C}}\). We can now state our main result for the hard-core measure, which we prove in Sect. 3.

Theorem 1

Let \(\Delta \ge 2\) be an integer and let \({\mathcal {G}}\subset {\mathcal {G}}_\Delta \) be a family of bounded degree graphs that is closed under taking induced subgraphs. Let \(\lambda ^\star >0\) be such that there exists \(\varepsilon >0\) such that for each \(G\in {\mathcal {G}}\) and any \(\lambda \in {\mathcal {N}}([0,\lambda ^\star ),\varepsilon )\), \(Z_G(\lambda )\ne 0\). Then for any \(\lambda \in (0,\lambda ^\star ]\) the hard-core measure at \(\lambda \) satisfies strong spatial mixing on \({\mathcal {G}}\) with exponential rate \(r=1+\exp (-O(\lambda /\varepsilon ))\).

A celebrated result by Chudnovsky and Seymour [14] states that for any claw-free graph G the zeros of \(Z_G\) are all real and hence negative. Since for bounded degree graphs the zeros of the independence polynomial do not approach 0 by a result of Shearer [37] and Scott and Sokal [35] (cf. Lemma 6 below), we can apply our main theorem to the family of bounded degree claw-free graphs (which is certainly closed under taking induced subgraphs) to obtain an improvement on a result implicit in [6]. We can however get a much better exponential rate as the following result states.

Theorem 2

Let \(\Delta \ge 2\) be an integer and let \({\mathcal {G}}\subset {\mathcal {G}}_\Delta \) be the family of claw-free graphs of maximum degree at most \(\Delta \). Then for any \(\lambda >0\), the hard-core measure at \(\lambda \) satisfies strong spatial mixing on \({\mathcal {G}}\) with exponential rate \(r=1+O((\lambda \Delta )^{-1/2})\).

We prove this result in Sect. 3.

1.1.1 Algorithms

As remarked in [6], strong spatial mixing by itself is not sufficient to approximately compute the probabilities \(\Pr _\mu [v\in I]\) in polynomial time (in case one wants the additive error to be at most order 1/|V(G)|). Our approach for showing strong spatial mixing implicitly yields a polynomial time algorithm for this task. We comment on this at the end of Sect. 3.

1.2 Graph homomorphism measures

Let \(q\ge 2\) be an integer and let A be a symmetric \(q\times q\) matrix. For a graph \(G=(V,E)\) we define the graph homomorphism partition function \(Z_G(A)\) by

$$\begin{aligned} Z_G(A):=\sum _{\psi :V\rightarrow [q]}\prod _{uv\in E} A_{\psi (u),\psi (v)}, \end{aligned}$$
(3)

where \([q]:=\{1,\ldots ,q\}.\) Note that in case A is the adjacency matrix of a graph H, then \(Z_G(A)\) is equal to the number of graph homomorphisms from G to H. For nonnegative (and nonzero) matrices A there is a natural associated probability measure, the graph homomorphism measure \(\mu _{G,A}\), on the set of all q-colorings of the vertices of G, \(\Omega _{V,q}=\{\psi : V\rightarrow [q]\}\), defined for \(\psi \in \Omega _{V,q}\) by

$$\begin{aligned} \mu _{G,A}(\psi ):=\frac{\prod _{uv\in E}A_{\psi (u),\psi (v)}}{Z_G(A)}, \end{aligned}$$

where we implicitly assume that \(Z_G(A)\ne 0\). For \(\Lambda \subset V\) we call any \(\sigma :\Lambda \rightarrow [q]\) a boundary condition on \(\Lambda \). Let \(v\in V\setminus \Lambda .\) We denote for \(i\in [q]\) by \(\Pr _\mu [\psi (v)=i \mid \sigma ]\) the probability that the vertex v gets color i in the random q-coloring \(\psi \) drawn according to the measure \(\mu _{G,A}\) conditioned on \(\sigma \), meaning that we condition on the event that \(\psi \) agrees with \(\sigma \) on \(\Lambda \), where we implicitly assume that this latter event has positive measure. As in the case of the independence polynomial, for another boundary condition \(\tau \) on \(\Lambda \), we denote by \(d_G(v,\sigma \ne \tau )\) the graph distance from v to the nearest vertex in \(\Lambda \) at which \(\sigma \) and \(\tau \) differ.
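The partition function (3) can be evaluated directly on small graphs. The following brute-force sketch uses hypothetical names and indexes colors from 0 rather than from 1; it runs in time \(q^{|V|}\) and is meant only to illustrate the definition.

```python
from itertools import product

def hom_partition_function(vertices, edges, A):
    """Z_G(A): sum over all maps psi: V -> [q] of prod_{uv in E} A[psi(u)][psi(v)]."""
    vertices = list(vertices)
    q = len(A)
    total = 0
    for colours in product(range(q), repeat=len(vertices)):
        psi = dict(zip(vertices, colours))
        weight = 1
        for u, v in edges:
            weight *= A[psi[u]][psi[v]]
        total += weight
    return total

# A = adjacency matrix of the triangle K_3, so Z_G(A) counts proper 3-colourings of G.
K3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
path = ([0, 1, 2], [(0, 1), (1, 2)])
```

With this choice of A the function returns the number of homomorphisms into the triangle, in line with the remark on adjacency matrices above.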

Let \({\mathcal {G}}\) be an infinite family of graphs. We say that the measure \(\mu =\mu _{G,A}\) satisfies strong spatial mixing on \({\mathcal {G}}\) with exponential rate \(r>1\) if there exists a constant \(C>0\) such that for any graph \(G=(V,E)\in {\mathcal {G}}\), any vertex \(v\in V\), any \(i\in [q]\), any \(\Lambda \subseteq V\setminus \{v\}\) and any two boundary conditions \(\sigma \) and \(\tau \) on \(\Lambda \),

$$\begin{aligned} \left| \Pr _\mu [\psi (v)=i\mid \sigma ]-\Pr _\mu [\psi (v)=i\mid \tau ]\right| \le Cr^{-d_G(v,\sigma \ne \tau )}. \end{aligned}$$
(4)

Recall that by \({\mathcal {G}}_\Delta \) we denote the family of graphs of maximum degree at most \(\Delta \). Using Barvinok’s [3, Theorem 7.1.4], or rather its proof, which provides a zero-free region for the graph homomorphism partition function, we prove in Sect. 4 the following result.

Theorem 3

Let \(\Delta \ge 3\) and \(q\ge 2\) be integers and let for some \(\alpha =\alpha _\Delta <2\pi /(3\Delta )\),

$$\begin{aligned} \delta _\Delta :=\sin (\alpha /2)\cos (\alpha \Delta /2). \end{aligned}$$

Fix \(\eta \in (0,1)\). Then for any real \(q\times q\) symmetric matrix A satisfying \(|A_{i,j}-1|<(1-\eta )\delta _\Delta \) for all \(i,j=1,\ldots ,q\) the measure \(\mu _{G,A}\) satisfies strong spatial mixing on \({\mathcal {G}}_\Delta \) with exponential rate \(r=1/(1-\eta )\).

Note that \(\delta _\Delta =\Omega (1/\Delta )\). Moreover note that this result is qualitatively similar to (but quantitatively better than) a result implicit in [29]. We also note that the conditions in the theorem guarantee that the graph homomorphism partition function is non-zero on graphs of maximum degree at most \(\Delta \) for complex matrices A satisfying \(|A_{i,j}-1|\le \delta _\Delta \) for all \(i,j\) by [3, Theorem 7.1.4]. Improvements to [3, Theorem 7.1.4] do not automatically lead to improvements to Theorem 3, since in our proof we require that the zero-freeness also holds for graphs with boundary conditions. See Remark 4 for further discussion.
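The order of magnitude of \(\delta _\Delta \) can be checked numerically. The sketch below reads the constraint on \(\alpha \) as \(0<\alpha <2\pi /(3\Delta )\) and uses the hypothetical choice \(\alpha =\pi /(2\Delta )\), which is one admissible value and not necessarily the one that maximises \(\delta _\Delta \).

```python
import math

def delta_bound(Delta, alpha):
    """delta_Delta = sin(alpha/2) * cos(alpha*Delta/2) for 0 < alpha < 2*pi/(3*Delta)."""
    if not 0 < alpha < 2 * math.pi / (3 * Delta):
        raise ValueError("alpha must lie in (0, 2*pi/(3*Delta))")
    return math.sin(alpha / 2) * math.cos(alpha * Delta / 2)

# Hypothetical choice alpha = pi/(2*Delta); the theorem allows any admissible alpha.
scaled = {D: D * delta_bound(D, math.pi / (2 * D)) for D in range(3, 51)}
```

For this choice the product \(\Delta \cdot \delta _\Delta \) stays bounded away from 0 over the sampled range, consistent with \(\delta _\Delta =\Omega (1/\Delta )\).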

1.3 Related work

Our work falls into a recent series of contributions in which absence of complex zeros of the probability generating function of a discrete distribution gives rise to detailed probabilistic information about the distribution.

As mentioned earlier, the notion of strong spatial mixing is intimately connected to the design of efficient algorithms to (approximately) compute evaluations of graph polynomials and partition functions. Another well known approach for designing algorithms for this task is based on Markov chains, in particular the Glauber dynamics. This of course then leads to randomized algorithms. Very recently Chen et al. [13], building on [1], showed that absence of zeros for partition functions of several models in a multivariate sense leads to proofs of rapid mixing of the Glauber dynamics for these models.

Another application of absence of complex zeros is found in [22, 24, 30, 31], where central limit theorems are derived for random variables X taking finitely many values in the nonnegative integers, whose probability generating function p(x), defined as \(p(x)=\sum _{k\ge 0}\Pr [X=k]x^k\), has no zeros in the vicinity of \(x=1.\)

1.4 Overview of proof

Our proof consists of essentially two main steps. The first step is to view the conditional probability that we try to control as an evaluation of a rational function P(z) at \(z=1\) and utilize absence of complex zeros to show that |P(z)| is bounded on some domain containing \(z=1\) and \(z=0\). This is done in two different ways. For the graph homomorphism partition function this is done using absence of zeros in the multivariate sense, while for the independence polynomial we only require absence of zeros for the univariate polynomial by using the powerful Montel theorem from complex analysis. In the second step, once it is known that the rational function P(z) is bounded, we use Cauchy’s formula to obtain bounds on the coefficients of its series expansion. We interpret these coefficients combinatorially with the aid of the cluster expansion to arrive at the desired strong spatial mixing results.

The remainder of the paper is organized as follows. In the next section we gather the tools that we need to prove our results. In Sect. 3 we prove our two results on the hard-core model and in Sect. 4 we prove Theorem 3. Finally in Sect. 5 we conclude with some remarks and questions.

2 Tools

2.1 Convention

We will often deal with functions f holomorphic on some open set \(U\subset {\mathbb {C}}\) containing 0. Therefore, near 0, f has a convergent series expansion \(f(z)=\sum _{k\ge 0}a_k z^k\). In such a situation we often just write \(f(z)=\sum _{k\ge 0}a_k z^k\) near 0.

For \(r>0\) we denote by \({\mathbb {D}}_r\) the open disk centered at 0 of radius r.

2.2 The cluster expansion

The cluster expansion is a formal series expansion of the logarithm of a so-called polymer partition function [19]. The polymer partition function can also be viewed as the multivariate independence polynomial of an associated graph [35], which is the perspective we take here.

Let \(G=(V,E)\) be a graph. Let \(w=(w_v)_{v\in V}\) be a vector of complex variables. Then the multivariate independence polynomial of G is defined as

$$\begin{aligned} Z_G(w)=\sum _{\begin{array}{c} I\subseteq V\\ I \text { independent } \end{array}}\prod _{v\in I}w_v. \end{aligned}$$
(5)

For a sequence of (not necessarily distinct) vertices \((v_1,\ldots ,v_k)\), \(v_i\in V\), \(i=1,\ldots ,k\), we denote by \(G(v_1,\ldots ,v_k)\) the graph on the vertex set \(\{1,\ldots ,k\}\) where for \(i\ne j\), i is adjacent to j if and only if \(v_i=v_j\) or \(\{v_i,v_j\}\in E\) and we call \(G(v_1,\ldots ,v_k)\) the cluster induced by \(v_1,\ldots ,v_k\). The Ursell function of a graph H is defined as

$$\begin{aligned} \phi (H)=\sum _{\begin{array}{c} F\subseteq E(H)\\ (V(H),F) \text { connected} \end{array}}(-1)^{|F|}. \end{aligned}$$
(6)

Note that by definition \(\phi (H)=0\) if the graph H is not connected.
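For small graphs the Ursell function (6) can be computed by enumerating edge subsets. The sketch below assumes H is given on the vertex set \(\{0,\ldots ,n-1\}\) with an explicit edge list; the helper names are illustrative.

```python
from itertools import combinations

def is_connected(n, edges):
    """Check whether the graph on vertices 0..n-1 with the given edges is connected."""
    if n == 0:
        return False
    adjacency = {i: set() for i in range(n)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def ursell(n, edges):
    """phi(H) from (6): signed count of connected spanning subgraphs of H."""
    total = 0
    for k in range(len(edges) + 1):
        for F in combinations(edges, k):
            if is_connected(n, F):
                total += (-1) ** k
    return total

triangle_edges = [(0, 1), (1, 2), (0, 2)]
K4_edges = [(a, b) for a, b in combinations(range(4), 2)]
```

For instance, the triangle has three connected spanning subgraphs with two edges and one with three edges, so its Ursell function equals \(3-1=2\).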

The cluster expansion is the following formal power series representation of \(\log (Z_G(w))\) [23, 35],

$$\begin{aligned} \log (Z_G(w))=\sum _{k\ge 1}\frac{1}{k!}\sum _{v_1,\ldots ,v_k\in V} \phi (G(v_1,\ldots ,v_k)) \prod _{i=1}^k w_{v_i}. \end{aligned}$$
(7)

Under certain conditions on the \(w_v\) the cluster expansion converges [23, 35]. We will however not need to use these conditions. For our purposes it suffices that if all \(w_v\) are small enough in absolute value (possibly depending on the underlying graph G), then the cluster expansion converges.
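As a sanity check of (7), one can set every \(w_v=\lambda \) and compare the first few coefficients with the Taylor expansion of \(\log Z_G(\lambda )\). The sketch below does this by brute force for the path on three vertices, where \(Z_G(\lambda )=1+3\lambda +\lambda ^2\) and hence \(\log Z_G(\lambda )=3\lambda -\tfrac{7}{2}\lambda ^2+6\lambda ^3+\cdots \). The function names are hypothetical, and the enumeration over all k-tuples is only feasible for very small k.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

def ursell(n, edges):
    """Signed count of connected spanning subgraphs on vertices 0..n-1."""
    total = 0
    for k in range(len(edges) + 1):
        for F in combinations(edges, k):
            # Merge component labels along the chosen edges to test connectivity.
            label = list(range(n))
            for a, b in F:
                la, lb = label[a], label[b]
                label = [la if x == lb else x for x in label]
            if len(set(label)) == 1:
                total += (-1) ** k
    return total

def cluster_coefficient(vertices, edges, k):
    """Coefficient of lam^k in (7) after setting every w_v = lam."""
    adjacent = {frozenset(e) for e in edges}
    total = 0
    for tup in product(vertices, repeat=k):
        # Positions i, j are joined if the vertices coincide or are adjacent in G.
        cluster_edges = [
            (i, j) for i, j in combinations(range(k), 2)
            if tup[i] == tup[j] or frozenset((tup[i], tup[j])) in adjacent
        ]
        total += ursell(k, cluster_edges)
    return Fraction(total, factorial(k))

path_vertices, path_edges = [0, 1, 2], [(0, 1), (1, 2)]
coefficients = [cluster_coefficient(path_vertices, path_edges, k) for k in (1, 2, 3)]
```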

2.3 Some complex analysis

The next lemma is a consequence of Cauchy’s differentiation formula (which in turn follows from the integral formula).

Lemma 1

Let P(z) be a holomorphic function on \({\mathbb {D}}_r\) for some \(r>1\), with series expansion \(P(z)=\sum _{k\ge 0} a_k z^k\) near 0. Suppose that |P(z)| is bounded by M on \({\mathbb {D}}_r\). Then the radius of convergence of the series expansion is bigger than 1 and for any \(N\in {\mathbb {N}}\),

$$\begin{aligned} |P(1)-\sum _{k=0}^{N-1} a_k|\le \frac{Mr}{(r-1)r^{N}}. \end{aligned}$$

Proof

Choose \(\rho \) so that \(1<\rho <r\). By Cauchy’s differentiation formula we have

$$\begin{aligned} a_k=\frac{1}{2\pi i}\int _{\partial {\mathbb {D}}_\rho } \frac{P(w)}{w^{k+1}} dw. \end{aligned}$$

This implies that \(|a_k|\le M/\rho ^{k}\). Since this holds for any \(1<\rho <r\), it follows that \(|a_k|\le M/r^{k}\). Bounding P(z) by a geometric series it follows that the radius of convergence is bigger than 1. It follows similarly that \(|\sum _{k\ge N}a_k|\le \frac{M r^{-N}}{1-1/r}=\frac{M r}{(r-1)r^{N}}\), as desired. \(\square \)
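The bound of Lemma 1 can be checked exactly on a simple example. The sketch below uses the hypothetical test function \(P(z)=1/(3-z)\), which is holomorphic on \({\mathbb {D}}_2\), satisfies \(|P(z)|\le 1\) there, and has Taylor coefficients \(a_k=3^{-(k+1)}\); none of these choices come from the paper.

```python
from fractions import Fraction

# Hypothetical test function P(z) = 1/(3 - z): holomorphic on the disk of radius r = 2,
# where |P(z)| <= 1/(3 - 2) = 1 =: M, with Taylor coefficients a_k = 3^{-(k+1)}.
r, M = Fraction(2), Fraction(1)
P_at_1 = Fraction(1, 2)

def coefficient(k):
    """Taylor coefficient a_k of P at 0."""
    return Fraction(1, 3 ** (k + 1))

def truncation_error(N):
    """|P(1) - sum_{k<N} a_k|, computed exactly."""
    return abs(P_at_1 - sum(coefficient(k) for k in range(N)))

def lemma_bound(N):
    """Right-hand side M*r / ((r-1) * r^N) of Lemma 1."""
    return M * r / ((r - 1) * r ** N)

checks = [(truncation_error(N), lemma_bound(N)) for N in range(0, 12)]
```

In this example the true error decays like \(3^{-N}\) while the bound decays like \(2^{-N}\), so the lemma is far from tight here, as expected for a bound that only uses the supremum of |P|.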

Typically we will not have functions defined on disks, but rather on neighbourhoods of real intervals.

Lemma 2

Let \(P(z)=\sum _{k\ge 0} a_k z^k\) and \(Q(z)=\sum _{k\ge 0} b_k z^k\) be two holomorphic functions defined on some open set containing 0 that satisfy \(a_k=b_k\) for \(k=0,\ldots , N\) for some \(N\in {\mathbb {N}}\). Then

  1. (i)

    If there exists \(\varepsilon >0\) and \(M>0\) such that both |P(z)| and |Q(z)| are bounded by M on \({\mathcal {N}}([0,1],2\varepsilon )\), then there exists a constant \(r=1+O(e^{-1/\varepsilon })\) such that

    $$\begin{aligned} |P(1)-Q(1)|\le \frac{2M r}{(r-1)r^{N}}. \end{aligned}$$
    (8)
  2. (ii)

    If there exists \(\delta >0\) such that for any compact set \(S\subset \{z\mid \Re (z)>-\delta \}\) intersecting the positive real line there exists a constant \(M=M_S\) such that both |P(z)| and |Q(z)| are bounded by \(M_S\) on S, then there exists a compact set S such that with \(r= 1+\sqrt{\delta }\),

    $$\begin{aligned} |P(1)-Q(1)|\le \frac{2M_S r}{(r-1)r^{N}}. \end{aligned}$$
    (9)

Proof

We start with the proof of part (i). Let \(r=\frac{1-e^{-1-1/\varepsilon }}{1-e^{-1/\varepsilon }}\) and let \(\alpha =1-e^{-1/\varepsilon }\). Note that \(r=1+O(e^{-1/\varepsilon })\). Define \(g(z)=\varepsilon \log (1/(1-\alpha z))\) on \({\mathbb {D}}_r\) taking the branch of the logarithm that satisfies \(g(0)=0\). Barvinok shows in his proof of [3, Lemma 2.2.3] that g maps the disk \({\mathbb {D}}_r\) into \({\mathcal {N}}([0,1],2\varepsilon )\) and that \(g(1)=1.\)

We now consider the compositions \(P\circ g\) and \(Q\circ g\) on \({\mathbb {D}}_r\). Let us write \(P\circ g=\sum _{k\ge 0} a_k g(z)^k=\sum _{k\ge 0} a'_k z^k\) and \(Q\circ g=\sum _{k\ge 0} b_k g(z)^k=\sum _{k\ge 0}b'_k z^k\). Then the coefficients \(a'_k\) (resp. \(b'_k\)) depend only on \(a_0,\ldots , a_k\) (resp. \(b_0,\ldots , b_k)\) and the first k coefficients of the Taylor series of g around 0, since g has constant term equal to 0. This implies that \(a'_k=b'_k\) for \(k=0,\ldots , N.\) Since both \(|(P\circ g)(z)|\) and \(|(Q\circ g)(z)|\) are bounded by M on \({\mathbb {D}}_r\), Lemma 1 in combination with the triangle inequality now implies that

$$\begin{aligned} |P(1)-Q(1)|=|(P\circ g)(1)-(Q\circ g) (1)|\le 2 \frac{M r}{(r-1)r^{N}}, \end{aligned}$$

as desired.
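The properties of the map g used in part (i) can be probed numerically. The sketch below samples points of \({\mathbb {D}}_r\) and measures how far their images lie from the interval [0, 1]; the sampling grid, the value \(\varepsilon =0.25\) and the helper names are arbitrary assumptions, and a finite sample of course does not replace the proof in [3].

```python
import cmath
import math

def barvinok_map(eps):
    """Return (g, r) with g(z) = eps * log(1/(1 - alpha*z)) and r as in the proof."""
    alpha = 1 - math.exp(-1 / eps)
    r = (1 - math.exp(-1 - 1 / eps)) / alpha
    def g(z):
        # Principal branch; 1 - alpha*z has positive real part for |z| < 1/alpha.
        return -eps * cmath.log(1 - alpha * z)
    return g, r

def distance_to_unit_interval(w):
    """Euclidean distance from the complex number w to the segment [0, 1]."""
    nearest = min(max(w.real, 0.0), 1.0)
    return abs(w - nearest)

eps = 0.25
g, r = barvinok_map(eps)
# Sample four concentric circles inside the disk of radius r.
samples = [
    g(rho * r * cmath.exp(2j * math.pi * t / 720))
    for rho in (0.25, 0.5, 0.75, 0.999)
    for t in range(720)
]
worst = max(distance_to_unit_interval(w) for w in samples)
```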

For the proof of part (ii) consider for \(\xi =1-\sqrt{\frac{\delta }{1+\delta }}\) the map \(h(z)=\frac{\delta }{(1-\xi z)^2}-\delta \). By Lemma 2.4 of [4], h maps the open disk \({\mathbb {D}}_{\xi ^{-1}}\) into the set \({\mathbb {C}}\setminus \{z\in {\mathbb {R}}\mid z<-\frac{3}{4}\delta \}\) and satisfies \(h(0)=0\) and \(h(1)=1.\) Let \(r=1+\sqrt{\delta }\). Then \(1<r<\xi ^{-1}.\) Denote by S the image under h of the closure of \({\mathbb {D}}_{r}\). Then S is a compact set contained in \(\{z\mid \Re (z)\ge -\delta \}\) and therefore both P and Q are bounded on this set by some constant \(M:=M_S=M_\delta \). The proof now proceeds in exactly the same way as in case (i). \(\square \)
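The elementary identities used for h in part (ii), namely \(h(0)=0\), \(h(1)=1\) and \(1<r<\xi ^{-1}\), are easy to confirm numerically. The sketch below does so for a few arbitrary values of \(\delta \), with \(\xi \) as defined in the proof; the helper name is hypothetical.

```python
import math

def quadratic_map(delta):
    """Return (h, xi, r) for h(z) = delta/(1 - xi*z)^2 - delta, as in the proof of (ii)."""
    xi = 1 - math.sqrt(delta / (1 + delta))
    r = 1 + math.sqrt(delta)
    def h(z):
        return delta / (1 - xi * z) ** 2 - delta
    return h, xi, r

results = {}
for delta in (0.01, 0.1, 1.0, 5.0):
    h, xi, r = quadratic_map(delta)
    # Store h(0), h(1), the radius r and the pole location 1/xi.
    results[delta] = (h(0), h(1), r, 1 / xi)
```

The identity \(h(1)=1\) follows from \((1-\xi )^2=\delta /(1+\delta )\), so that \(\delta /(1-\xi )^2-\delta =(1+\delta )-\delta \).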

Remark 1

We could have also used the Riemann mapping theorem in the proof above to get suitable maps g and h. For concreteness the present ones are convenient, as we can easily compute their Taylor series, thereby making them suitable for algorithmic applications.

2.3.1 Montel’s theorem

An important tool in our proof of Theorem 2 is the use of Montel’s theorem. This is a cornerstone result in the theory of modern complex dynamical systems [12, 32, 41] and has recently found applications in the study of determining the location of zeros and the complexity of approximating evaluations of the independence polynomial [9, 11, 15].

We need a definition to state the theorem. We denote by \({\widehat{{\mathbb {C}}}}={\mathbb {C}}\cup \{\infty \}\) the extended complex plane. Let \(U\subset {\mathbb {C}}\) be an open set. A family \({\mathcal {F}}\) of holomorphic functions \(f:U\rightarrow {\widehat{{\mathbb {C}}}}\) is called a normal family if each infinite sequence of elements of \({\mathcal {F}}\) has a subsequence that converges locally uniformly to a holomorphic function.

Theorem 4

(Montel) Let \(U\subset {\mathbb {C}}\) be a connected open set and let \({\mathcal {F}}\) be a family of holomorphic functions \(f:U\rightarrow {\widehat{{\mathbb {C}}}}\). Suppose that there exist three distinct points \(a,b,c\in {\widehat{{\mathbb {C}}}}\) such that \(f(U)\subset {\widehat{{\mathbb {C}}}} \setminus \{a,b,c\}\) for all \(f\in {\mathcal {F}}\). Then the family \({\mathcal {F}}\) is normal.

See e.g. [12, 32, 41] for variations, extensions and a proof of Theorem 4.

3 The hard-core model

In this section we will prove Theorems 1 and 2.

Let us introduce for a graph G and a vertex v of G, the ratio,

$$\begin{aligned} P_{G,v}(\lambda )=\frac{\lambda Z_{G\setminus N[v]}(\lambda )}{Z_{G}(\lambda )}, \end{aligned}$$
(10)

considered as a rational function in \(\lambda \). Here N[v] denotes the closed neighbourhood of v. Note that for positive \(\lambda \), \(P_{G,v}(\lambda )\) is just the probability of the vertex v being in the random independent set drawn from the hard-core measure.

We introduce some further notation to facilitate the discussion. Let \(G=(V,E)\) be a graph and let \(v\in V\) and \(\Lambda \subset V\setminus \{v\}\). For a boundary condition \(\sigma \) on \(\Lambda \) we denote by \(G[\sigma ]\) the graph obtained from \(G\setminus \Lambda \) by removing all neighbours in \(G\setminus \Lambda \) of vertices in \(\Lambda \) that are set to ‘in’ (that is, assigned the value 1) by \(\sigma \). Note that \(G[\sigma ]\) is an induced subgraph of G. Denote by \(\sigma _{v,1}\) the boundary condition on \(\Lambda \cup \{v\}\) extending \(\sigma \) that assigns 1 to v. Let \(\lambda >0\). Then

$$\begin{aligned} \Pr _{\mu }[v \in I\mid \sigma ]=P_{G[\sigma ],v}(\lambda ). \end{aligned}$$
(11)

Indeed, by Bayes’ rule, writing \(\Pr _\mu [\sigma ]\) for the probability that the random independent set I drawn according to \(\mu \) satisfies \(\sigma ^{-1}(1)\subseteq I\) and \(\sigma ^{-1}(0)\cap I=\emptyset \), we have

$$\begin{aligned} \Pr _{\mu }[v \in I\mid \sigma ]=\frac{\Pr _\mu [\sigma _{v,1}]}{\Pr _\mu [\sigma ]}=\frac{\lambda ^{|\sigma ^{-1}(1)|+1}Z_{G[\sigma _{v,1}]}(\lambda )/Z_G(\lambda )}{\lambda ^{|\sigma ^{-1}(1)|}Z_{G[\sigma ]}(\lambda )/Z_G(\lambda )}=P_{G[\sigma ],v}(\lambda ), \end{aligned}$$

as claimed.
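Identity (11) can be verified by brute force on a small example. The sketch below compares the conditional probability, computed by enumerating independent sets of G, with the ratio (10) evaluated on \(G[\sigma ]\). The choice of graph (a 6-cycle), the boundary condition and all function names are assumptions made for illustration; the example is chosen so that v has no neighbour that is set to ‘in’.

```python
from fractions import Fraction
from itertools import combinations

def independent_sets(vertices, edges):
    """Yield the independent sets of the graph (vertices, edges)."""
    vertices = sorted(vertices)
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            I = set(subset)
            if not any(a in I and b in I for a, b in edges):
                yield I

def Z(vertices, edges, lam):
    """Independence polynomial of the subgraph induced by `vertices`."""
    induced = [(a, b) for a, b in edges if a in vertices and b in vertices]
    return sum(lam ** len(I) for I in independent_sets(vertices, induced))

def neighbours(u, edges):
    """Neighbours of u in the graph with the given edge list."""
    return {b if a == u else a for a, b in edges if u in (a, b)}

def ratio(vertices, edges, v, lam):
    """P_{G,v}(lam) = lam * Z of (G minus N[v]) / Z of G, as in (10)."""
    closed_nbhd = neighbours(v, edges) | {v}
    return lam * Z(vertices - closed_nbhd, edges, lam) / Z(vertices, edges, lam)

def reduced_vertices(vertices, edges, sigma):
    """Vertex set of G[sigma]: drop Lambda and all neighbours of vertices set to 1."""
    removed = set(sigma)
    for u, state in sigma.items():
        if state == 1:
            removed |= neighbours(u, edges)
    return vertices - removed

def conditional_probability(vertices, edges, v, lam, sigma):
    """Pr[v in I | sigma] by direct enumeration over independent sets of G."""
    total = with_v = Fraction(0)
    for I in independent_sets(vertices, edges):
        if all((u in I) == bool(s) for u, s in sigma.items()):
            total += lam ** len(I)
            if v in I:
                with_v += lam ** len(I)
    return with_v / total

# Cycle on six vertices, v = 0, Lambda = {2, 4}.
V = set(range(6))
E = [(i, (i + 1) % 6) for i in range(6)]
lam, sigma = Fraction(2), {2: 1, 4: 0}
lhs = conditional_probability(V, E, 0, lam, sigma)
rhs = ratio(reduced_vertices(V, E, sigma), E, 0, lam)
```

Here \(G[\sigma ]\) is the single edge on the vertices 0 and 5, so both sides equal \(\lambda /(1+2\lambda )\).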

3.1 Bounded ratios imply strong spatial mixing

We start by giving a series expansion of the ratios. For a sequence of (not necessarily distinct) vertices \((v_1,\ldots ,v_k)\) from G and a vertex \(v\in V\) we denote by \(m_v(v_1,\ldots ,v_k)\) the number of i such that \(v=v_i\).

Lemma 3

Let \(G=(V,E)\) be a graph, v a fixed vertex of G and let \(\lambda \) be a complex variable. Then near \(\lambda =0\) we have the following series expansion of \(P_{G,v}\)

$$\begin{aligned} P_{G,v}(\lambda )=\sum _{k\ge 1}\frac{1}{k!} \sum _{v_1,\ldots ,v_{k}\in V}\phi (G(v_1,\ldots ,v_k))m_v(v_1,\ldots ,v_{k})\lambda ^{k}. \end{aligned}$$
(12)

Proof

We introduce a vector of complex variables \(w=(w_v)_{v\in V}\). Then for the multivariate independence polynomial, we have

$$\begin{aligned} \frac{w_{v}Z_{G\setminus N[v]}(w)}{Z_G(w)}=w_v\frac{\partial }{\partial w_{v}} \log (Z_G(w)), \end{aligned}$$

as long as \(Z_G(w)\ne 0\). By the cluster expansion (7), this implies that for small enough \(w_u\) (for all \(u\in V\)), we have

$$\begin{aligned} \frac{w_{v}Z_{G\setminus N[v]}(w)}{Z_G(w)}&=w_v\sum _{k\ge 1}\frac{1}{k!}\sum _{v_1,\ldots ,v_{k}\in V} \phi (G(v_1,\ldots ,v_{k}))\frac{\partial }{\partial w_{v}}\prod _{i=1}^{k} w_{v_i}. \end{aligned}$$

Now note that

$$\begin{aligned} \frac{\partial }{\partial w_{v}}\prod _{i=1}^{k} w_{v_i}=m_v(v_1,\ldots ,v_k)\frac{\prod _{i=1}^{k} w_{v_i}}{w_v}. \end{aligned}$$

By plugging in \(w_v=\lambda \) for each v and making use of the uniqueness of the coefficients of the power series representation, this completes the proof. \(\square \)
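Lemma 3 can be checked on a small example by comparing two independent computations of the Taylor coefficients of \(P_{G,v}\): a direct power series division of \(\lambda Z_{G\setminus N[v]}(\lambda )\) by \(Z_G(\lambda )\), and the cluster sum in (12). The sketch below does this for the path on the vertices 0, 1, 2 with \(v=0\); all function names are hypothetical and the tuple enumeration is only practical for small orders.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

def ursell(n, edges):
    """Signed count of connected spanning subgraphs on vertices 0..n-1."""
    total = 0
    for k in range(len(edges) + 1):
        for F in combinations(edges, k):
            label = list(range(n))
            for a, b in F:
                la, lb = label[a], label[b]
                label = [la if x == lb else x for x in label]
            if len(set(label)) == 1:
                total += (-1) ** k
    return total

def cluster_series(vertices, edges, v, order):
    """Coefficients of lam^1..lam^order on the right-hand side of (12)."""
    adjacent = {frozenset(e) for e in edges}
    coeffs = []
    for k in range(1, order + 1):
        total = 0
        for tup in product(vertices, repeat=k):
            m_v = tup.count(v)
            if m_v == 0:
                continue  # tuples avoiding v contribute nothing
            cluster_edges = [
                (i, j) for i, j in combinations(range(k), 2)
                if tup[i] == tup[j] or frozenset((tup[i], tup[j])) in adjacent
            ]
            total += ursell(k, cluster_edges) * m_v
        coeffs.append(Fraction(total, factorial(k)))
    return coeffs

def series_quotient(num, den, order):
    """First order+1 Taylor coefficients of num/den (coefficient lists, den[0] != 0)."""
    num = num + [0] * (order + 1)
    out = []
    for k in range(order + 1):
        s = Fraction(num[k])
        for j in range(1, min(k, len(den) - 1) + 1):
            s -= den[j] * out[k - j]
        out.append(s / den[0])
    return out

# Path 0-1-2 with v = 0: Z_G = 1 + 3*lam + lam^2 and lam * Z_{G minus N[v]} = lam + lam^2.
direct = series_quotient([0, 1, 1], [1, 3, 1], 3)[1:]
via_clusters = cluster_series([0, 1, 2], [(0, 1), (1, 2)], 0, 3)
```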

For a graph \(G=(V,E)\), a vertex \(v\in V\) and a positive integer k, we denote

$$\begin{aligned} B_G(v,k):=\{u\in V\mid d_G(u,v)\le k\}, \end{aligned}$$

the distance at most k-neighbourhood of v.

Lemma 4

Let \({\mathcal {G}}\) be a family of graphs that is closed under taking induced subgraphs.

  1. (i)

    Let \(\lambda ^\star \) be such that there exist constants \(\varepsilon >0\) and \(M>0\) such that for all \(\lambda \in {\mathcal {N}}([0,\lambda ^\star ],\varepsilon )\), the ratios, \(P_{G,v}\), satisfy \(|P_{G,v}(\lambda )|\le M\) for all \(G\in {\mathcal {G}}\) and all vertices \(v\in V(G)\). Then for any \(\lambda \in (0,\lambda ^\star ]\) the hard-core measure satisfies strong spatial mixing on \({\mathcal {G}}\) with exponential rate \(r=1+O(e^{-\lambda /\varepsilon })\).

  2. (ii)

    Suppose there exists \(\delta >0\) such that for any compact set \(S\subset \{z\mid \Re (z)>-\delta \}\) intersecting the positive real line there exists a constant \(M=M_S\) such that the ratios, \(P_{G,v}\), satisfy \(|P_{G,v}(\lambda )|\le M_S\) for all \(G\in {\mathcal {G}}\), all vertices \(v\in V(G)\) and all \(\lambda \in S\). Then for any \(\lambda >0\) the hard-core measure satisfies strong spatial mixing on \({\mathcal {G}}\) with exponential rate \(r=1+O(\sqrt{\delta /\lambda })\).

Proof

We start with the proof of (i). Fix \(\lambda \in (0,\lambda ^\star ]\) and let \(\varepsilon '>0\) be such that \(\lambda \cdot {\mathcal {N}}([0,1],\varepsilon ')\subset {\mathcal {N}}([0,\lambda ^\star ],\varepsilon )\). Note that \(\varepsilon '=O(\varepsilon /\lambda )\). Let \(r>1\) be the constant obtained from Lemma 2(i) upon input of \(\varepsilon '/2\). Choose any \(G\in {\mathcal {G}}\) and fix a vertex v of G. Choose \(\Lambda \subset V(G)\setminus \{v\}\) and two boundary conditions \(\sigma =\sigma _\Lambda \) and \(\tau =\tau _\Lambda \) on \(\Lambda \). We will show that

$$\begin{aligned} \big |{\mathbb {P}}_{G,\lambda }[v \text { in }\mid \sigma _\Lambda ]-{\mathbb {P}}_{G,\lambda }[v \text { in } \mid \tau _\Lambda ]\big |\le \frac{2Mr}{(r-1)r^{d_G(v,\sigma \ne \tau )-1}}, \end{aligned}$$
(13)

implying the desired statement.

Define for \(z\in {\mathbb {C}}\), rational functions in z, \(P(z):=P_{G[\sigma ],v}(z\lambda )\) and \(Q(z):=P_{G[\tau ],v}(z \lambda )\). Then by (11),

$$\begin{aligned} {\mathbb {P}}[v \text { in }\mid \sigma _\Lambda ]=P(1) \quad \text {and} \quad {\mathbb {P}}[v \text { in }\mid \tau _\Lambda ]=Q(1). \end{aligned}$$
(14)

By Lemma 3 we know that, as formal series in z,

$$\begin{aligned} P(z)&=\sum _{k\ge 1}\frac{1}{k!} \sum _{v_1,\ldots ,v_{k}\in V}\phi (G[\sigma ](v_1,\ldots ,v_{k}))m_v(v_1,\ldots ,v_k)\lambda ^{k}z^{k},\nonumber \\ Q(z)&=\sum _{k\ge 1}\frac{1}{k!} \sum _{v_1,\ldots ,v_{k}\in V}\phi (G[\tau ](v_1,\ldots ,v_{k}))m_v(v_1,\ldots ,v_k)\lambda ^{k}z^{k}. \end{aligned}$$
(15)

In particular the coefficient of \(z^{k}\) in P (resp. Q) depends only on the vertices in \(G[\sigma ]\) (resp. \(G[\tau ]\)) that have distance at most \(k-1\) to v, since by definition \(\phi (G(v_1,\ldots ,v_k))m_v(v_1,\ldots ,v_k)=0\) if the cluster induced by \(v_1,\ldots ,v_k\) is not connected or if it does not contain the vertex v. For \(k<d_G(v,\sigma \ne \tau )-1\) the distance at most k-neighbourhoods, \(B_{G[\sigma ]}(v,k)\) and \(B_{G[\tau ]}(v,k)\), are equal. Therefore for any \(k=0,\ldots ,d_G(v,\sigma \ne \tau )-1\) we know that the coefficients of \(z^k\) of the power series representations for P(z) and Q(z) are equal.
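The agreement of the low-order coefficients can be seen explicitly on a path. In the sketch below, v is an endpoint of a path and \(\Lambda \) consists of the single vertex at distance d from v; setting that vertex to ‘in’ leaves the path on \(d-1\) vertices around v, while setting it to ‘out’ leaves the path on d vertices (components not containing v cancel in the ratio and are ignored). The recurrence \(Z_m=Z_{m-1}+\lambda Z_{m-2}\) for paths and all helper names are assumptions of this illustration.

```python
from fractions import Fraction

def path_polynomial(m):
    """Coefficient list of Z for the path on m vertices, via Z_m = Z_{m-1} + lam*Z_{m-2}."""
    prev, cur = [1], [1]          # Z_{-1} := 1 (convention) and Z_0 = 1
    for _ in range(m):
        shifted = [0] + prev      # lam * Z_{m-2}
        size = max(len(cur), len(shifted))
        nxt = [(cur[i] if i < len(cur) else 0) + (shifted[i] if i < len(shifted) else 0)
               for i in range(size)]
        prev, cur = cur, nxt
    return cur

def endpoint_ratio_series(m, order):
    """Taylor coefficients 0..order of P_{G,v} for G the path on m >= 1 vertices, v an endpoint."""
    num = [0] + path_polynomial(max(m - 2, 0))   # lam * Z of (G minus N[v])
    den = path_polynomial(m)                     # Z_G, with constant term 1
    num = num + [0] * (order + 1)
    out = []
    for k in range(order + 1):
        s = Fraction(num[k])
        for j in range(1, min(k, len(den) - 1) + 1):
            s -= den[j] * out[k - j]
        out.append(s)
    return out

d = 5
P = endpoint_ratio_series(d - 1, d)   # boundary vertex d 'in': path 0..d-2 remains
Q = endpoint_ratio_series(d, d)       # boundary vertex d 'out': path 0..d-1 remains
```

In this example the coefficients of \(z^0,\ldots ,z^{d-1}\) coincide and the first discrepancy appears at \(z^d\).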

Now by construction for any \(z\in {\mathcal {N}}([0,1],\varepsilon ')\) we have \(\lambda z\in {\mathcal {N}}([0,\lambda ^\star ],\varepsilon )\) and therefore |P(z)| and |Q(z)| are bounded by M on \({\mathcal {N}}([0,1],\varepsilon ')\). We now use Lemma 2(i) to conclude that \(|P(1)-Q(1)|\) is bounded by

$$\begin{aligned} \frac{2Mr}{(r-1)r^{d_G(v,\sigma \ne \tau )-1}}, \end{aligned}$$

proving (13).

The proof of (ii) is very similar. Again we fix \(\lambda >0\) and define \(\delta '=\delta /\lambda \). Let \(r>1\) be the constant obtained from Lemma 2(ii) upon input of \(\delta '\). As in the proof of (i), choose any \(G\in {\mathcal {G}}\) and fix a vertex v of G. Choose \(\Lambda \subset V(G)\setminus \{v\}\) and two boundary conditions \(\sigma =\sigma _\Lambda \) and \(\tau =\tau _\Lambda \) on \(\Lambda \). As above we define \(P(z)=P_{G[\sigma ],v}(\lambda z)\) and \(Q(z)=P_{G[\tau ],v}(\lambda z)\). We let S be the compact set from Lemma 2(ii) upon input \(\delta '\) and let \(M=M_S\). Then in exactly the same way as above, with \(r=1+\sqrt{\delta '}\) we obtain (13). This finishes the proof. \(\square \)

3.1.1 Algorithm

To approximately compute the conditional probabilities \(\Pr _\mu [v\in {I}\mid \sigma ]\) we need to approximate P evaluated at 1. The basic idea is just to truncate the series for \(P\circ g\) at depth \(K=O(\log (n/\varepsilon ))\) for an n-vertex graph G, so as to obtain an additive \(\varepsilon /n\)-approximation. Since the coefficients of \(P\circ g\) can be easily computed from those of g and P (in time \(O(K^2)\) using Horner’s method), the real algorithmic task is to compute the first K coefficients of P. This can be done efficiently (i.e. in time \(\Delta ^{O(K)}\) on graphs of maximum degree at most \(\Delta \)) with an algorithm appearing in the proof of Theorem 6 from [21]. We leave the details to the reader.
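The truncation idea can be illustrated on a toy instance. The sketch below takes G to be a single vertex, so that \(P(z)=\lambda z/(1+\lambda z)\) with known coefficients, and composes the truncated series of P with that of \(g(z)=\varepsilon \log (1/(1-\alpha z))\) from Lemma 2(i). The parameter values, the function name and the simple cubic-time Horner composition are assumptions made for illustration; the coefficient computation for general graphs from [21] is not reproduced here, and this naive composition is slower than the \(O(K^2)\) cost quoted above.

```python
import math

def compose_truncated(a, g, K):
    """First K+1 Taylor coefficients of sum_k a[k] * g(z)^k, assuming g[0] == 0 (Horner)."""
    result = [0.0] * (K + 1)
    for coeff in reversed(a[: K + 1]):
        # result <- result * g + coeff, truncated at degree K
        product = [0.0] * (K + 1)
        for i, ri in enumerate(result):
            if ri == 0.0:
                continue
            for j in range(1, K + 1 - i):
                product[i + j] += ri * g[j]
        product[0] += coeff
        result = product
    return result

lam, eps, K = 1.2, 0.4, 100
alpha = 1 - math.exp(-1 / eps)
# Taylor coefficients of g(z) = eps*log(1/(1 - alpha*z)) = eps * sum_{k>=1} (alpha*z)^k / k.
g = [0.0] + [eps * alpha ** k / k for k in range(1, K + 1)]
# Single-vertex graph: P(z) = lam*z/(1 + lam*z) = sum_{k>=1} (-1)^(k-1) (lam*z)^k.
a = [0.0] + [(-1) ** (k - 1) * lam ** k for k in range(1, K + 1)]

exact = lam / (1 + lam)
via_composition = sum(compose_truncated(a, g, K))   # (P o g)(1), truncated at degree K
naive = sum(a)                                      # truncating P itself at z = 1
```

Since \(\lambda >1\) here, the series of P itself diverges at \(z=1\), so the naive truncation is useless, whereas the truncated series of \(P\circ g\) evaluated at 1 approximates \(P(1)\).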

3.2 Absence of zeros implies bounded ratios

In this section we cover the final ingredients to prove our main result for the hard-core measure on bounded degree graphs.

Before we start we note that for a vertex transitive graph G it is not hard to see that

$$\begin{aligned} \frac{P_{G,v}(\lambda )}{\lambda }=\frac{d}{d\lambda }\frac{\log (Z_G(\lambda ))}{|V(G)|} \end{aligned}$$

and therefore absence of zeros of \(Z_G\) on an infinite family of vertex transitive graphs on some open set containing 0 implies boundedness of the ratios, \(P_{G,v}\), as can be derived from the proof of [3, Lemma 2.2.1] (in combination with the Riemann mapping theorem). Below we show this is also true for bounded degree graphs in general.
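For convenience we sketch why the displayed identity for vertex transitive graphs holds, assuming (as in the proof of Lemma 6) that \(P_{G,v}(\lambda )=\lambda Z_{G\setminus N[v]}(\lambda )/Z_G(\lambda )\). All sums below run over independent sets I of G, and the last equality uses vertex transitivity:

$$\begin{aligned} \lambda \frac{d}{d\lambda }\log (Z_G(\lambda ))=\frac{\sum _{I}|I|\lambda ^{|I|}}{Z_G(\lambda )}=\sum _{u\in V(G)}\frac{\sum _{I\ni u}\lambda ^{|I|}}{Z_G(\lambda )}=\sum _{u\in V(G)}P_{G,u}(\lambda )=|V(G)|\cdot P_{G,v}(\lambda ). \end{aligned}$$

Dividing both sides by \(\lambda |V(G)|\) gives the identity.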

Lemma 5

Let \({\mathcal {G}}\) be a family of graphs that is closed under taking induced subgraphs. Let \(U\subseteq {\mathbb {C}}\) be an open set such that for any graph \(G\in {\mathcal {G}}\) and any \(\lambda \in U\) we have \(Z_G(\lambda )\ne 0\). Then for any compact set \(S\subset U\setminus \{0\}\) that intersects the positive real line there exists a constant \(M>0\) such that for all \(G\in {\mathcal {G}}\), \(v\in V(G)\) and \(\lambda \) in S,

$$\begin{aligned} |P_{G,v}(\lambda )|\le M. \end{aligned}$$

Proof

Suppose to the contrary that the ratios are unbounded on S. Then there exist a sequence of graphs \((G_n)_{n\ge 1}\) with vertices \(v_n\in V(G_n)\) and a sequence of points \((\lambda _n)_{n\ge 1}\) in S such that

$$\begin{aligned} |P_{G_n,v_n}(\lambda _n)|\ge n. \end{aligned}$$
(16)

Since for any graph G and vertex \(v\in V(G)\), \(Z_{G}(\lambda )=\lambda Z_{G\setminus N[v]}(\lambda )+Z_{G-v}(\lambda )\) and since \({\mathcal {G}}\) is closed under taking induced subgraphs, it follows that for any \(\lambda \in U\setminus \{0\}\), the ratio \(P_{G,v}(\lambda )\) must avoid the points \(\infty ,0\) and 1. Therefore, by Theorem 4 (Montel’s theorem), the family of rational functions \(\{\lambda \mapsto P_{G_n,v_n}(\lambda )\mid n\ge 1 \}\) forms a normal family on \(U\setminus \{0\}\) and hence contains a subsequence that converges locally uniformly to some holomorphic function f. In particular this convergence is uniform on S by compactness. Since for positive real \(\lambda \) the ratios are just probabilities and hence contained in [0, 1], it follows that f is not constant \(\infty \). But then the image of S under f must be bounded, implying that the ratios \(P_{G_n,v_n}(\lambda _n)\) cannot be unbounded on S by uniform convergence. This contradicts our assumption and therefore there must be some bound \(M=M_S\) on the absolute values of the ratios on the set S, as desired. \(\square \)
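The recursion used in the proof above is easy to experiment with. The following Python sketch (our own illustration, with hypothetical helper names remove, indep_poly and ratio) evaluates \(Z_G(\lambda )\) through \(Z_{G}(\lambda )=\lambda Z_{G\setminus N[v]}(\lambda )+Z_{G-v}(\lambda )\) and the ratio \(P_{G,v}(\lambda )=\lambda Z_{G\setminus N[v]}(\lambda )/Z_G(\lambda )\). Since \(1-P_{G,v}(\lambda )=Z_{G-v}(\lambda )/Z_G(\lambda )\), the three omitted values \(0,1,\infty \) correspond to zeros of \(\lambda Z_{G\setminus N[v]}\), \(Z_{G-v}\) and \(Z_G\) respectively. The running time is exponential, so this is only meant for small graphs.

```python
from fractions import Fraction


def remove(adj, gone):
    """Induced subgraph on the vertices not in `gone` (adj: vertex -> set of neighbours)."""
    return {u: nbrs - gone for u, nbrs in adj.items() if u not in gone}


def indep_poly(adj, lam):
    """Z_G(lam) via Z_G = lam * Z_{G minus N[v]} + Z_{G - v}."""
    if not adj:
        return 1
    v = next(iter(adj))
    closed_nbhd = {v} | adj[v]
    return lam * indep_poly(remove(adj, closed_nbhd), lam) + indep_poly(remove(adj, {v}), lam)


def ratio(adj, v, lam):
    """P_{G,v}(lam) = lam * Z_{G minus N[v]}(lam) / Z_G(lam); pass a Fraction for exact output."""
    closed_nbhd = {v} | adj[v]
    return lam * indep_poly(remove(adj, closed_nbhd), lam) / indep_poly(adj, lam)
```

For the 4-cycle one has \(Z_G(\lambda )=1+4\lambda +2\lambda ^2\), so at \(\lambda =1\) the sketch gives \(Z_G(1)=7\) and \(P_{G,v}(1)=2/7\).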

The next lemma shows boundedness of the ratios for \(\lambda \) near 0 and is essentially due to Shearer [37] and Scott and Sokal [35].

Lemma 6

Let \({\mathcal {G}}_\Delta \) be the family of graphs of maximum degree at most \(\Delta \ge 3\). For any \(\lambda \) such that \(|\lambda |< \frac{(\Delta -1)^{\Delta -1}}{\Delta ^\Delta }\), any graph \(G\in {\mathcal {G}}_\Delta \), \(v\in V(G)\) we have \(|P_{G,v}(\lambda )|< \frac{1}{\Delta -2}\) and moreover \(Z_{G}(\lambda )\ne 0\).

Proof

We use [15, Lemma 2.10], which states that for \(\lambda \) as in the statement of the lemma and \(R_{G,v}(\lambda ):=\frac{\lambda Z_{G\setminus N[v]}(\lambda )}{Z_{G-v}(\lambda )}\) we have \(|R_{G,v}(\lambda )|<\frac{1}{\Delta -1}\) and \(Z_{G}(\lambda )\ne 0\). Using that \(P_{G,v}(\lambda )=\frac{R_{G,v}(\lambda )}{1+R_{G,v}(\lambda )}\), the first statement also follows. \(\square \)
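Spelled out, the last step of this proof is the following estimate, which uses the reverse triangle inequality, the fact that \(x\mapsto x/(1-x)\) is increasing on [0, 1) and \(\Delta \ge 3\):

$$\begin{aligned} |P_{G,v}(\lambda )|=\frac{|R_{G,v}(\lambda )|}{|1+R_{G,v}(\lambda )|}\le \frac{|R_{G,v}(\lambda )|}{1-|R_{G,v}(\lambda )|}<\frac{1/(\Delta -1)}{1-1/(\Delta -1)}=\frac{1}{\Delta -2}. \end{aligned}$$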

We can now prove our main results concerning strong spatial mixing for the hard-core measure.

Proof of Theorem 1

Let \({\mathcal {G}}\subset {\mathcal {G}}_{\Delta }\) be a family of graphs of maximum degree at most \(\Delta \) that is closed under taking induced subgraphs and let \(\lambda ^\star >0\) and \(\varepsilon >0\) be such that for all \(\lambda \in {\mathcal {N}}([0,\lambda ^\star ],\varepsilon )\) and \(G\in {\mathcal {G}}\), \(Z_G(\lambda )\ne 0.\) Then by Lemmas 6 and 5 we know that the ratios \(P_{G,v}\) are bounded on \({\mathcal {N}}([0,\lambda ^\star ],\varepsilon )\) by some constant M for all graphs \(G\in {\mathcal {G}}\) and all \(v\in V(G)\). Therefore the result follows from Lemma 4(i). \(\square \)

Proof of Theorem 2

Let \({\mathcal {G}}'_\Delta \subset {\mathcal {G}}_{\Delta }\) be the family of claw-free graphs of maximum degree at most \(\Delta \). By the Chudnovsky–Seymour theorem [14] we know that all roots of \(Z_G\) for \(G\in {\mathcal {G}}'_\Delta \) are negative reals. By Lemma 6, it follows that \(Z_{G}(\lambda )\ne 0\) as long as \(\lambda >-\tfrac{1}{e\Delta }\). Combining Lemmas 6 and 5 we conclude that on any compact set S that avoids the set \(\{z\in {\mathbb {R}}\mid z\le -\tfrac{1}{e\Delta }\}\) the ratios, \(P_{G,v}\), for \(G\in {\mathcal {G}}'_\Delta \) and any \(v\in V(G)\) are bounded in absolute value by some constant \(M_S\). Therefore, Lemma 4(ii) implies that the hard-core measure on \({\mathcal {G}}'_\Delta \) at any \(\lambda >0\) satisfies strong spatial mixing with exponential rate \(r=1+O((\lambda \Delta )^{-1/2})\). \(\square \)
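The appeal to Lemma 6 in the proof above uses that the disk in Lemma 6 contains all real \(\lambda \) with \(-\tfrac{1}{e\Delta }<\lambda \le 0\). A short check of the inequality behind this, using only that \((1+1/n)^n<e\) for every positive integer n:

$$\begin{aligned} \frac{(\Delta -1)^{\Delta -1}}{\Delta ^\Delta }=\frac{1}{\Delta }\left( 1+\frac{1}{\Delta -1}\right) ^{-(\Delta -1)}>\frac{1}{e\Delta }. \end{aligned}$$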

Remark 2

Bencs [7] shows that the independence polynomials of graphs containing a moderate number of claws are zero free in some sector. In a similar way one can show that the hard-core measure also satisfies strong spatial mixing on bounded degree graphs in this class; cf. [4, Sect. 3.5].

4 The graph homomorphism partition function

Here we prove Theorem 3. We follow the same strategy as in the previous section.

Let us start by introducing the ratios. Let \(G=(V,E)\) be a graph, let \(v\in V\) and \(i\in [q]\), where q is a positive integer, and let A be a symmetric, nonnegative \(q\times q\) matrix such that \(Z_G(A)>0\). For a boundary condition \(\sigma :\Lambda \rightarrow [q]\) on some set \(\Lambda \subset V\setminus \{v\}\), we denote by \(\sigma _{v,i}\) the extension of \(\sigma \) to \(\Lambda \cup \{v\}\) that assigns i to the vertex v. In what follows we denote by J the all ones matrix of the appropriate size. We define the following rational function in the variable z

$$\begin{aligned} P^\sigma _{G,v,i;A}(z)=\frac{Z^{\sigma _{v,i}}_{G}(J+z(A-J))}{Z^{\sigma }_{G}(J+z(A-J))} \end{aligned}$$
(17)

and refer to it as a ratio at v. We note that, as in the case of the hard-core model, we have

$$\begin{aligned} P^\sigma _{G,v,i;A}(1)=\Pr _{\mu _A}[\phi (v)=i\mid \sigma ]. \end{aligned}$$

4.1 Ratios and their series expansion

We do have to do a bit more work to find the series expansion of these ratios. In particular, we need to equip the model with an external field parameter \(\xi \in {\mathbb {C}}^{V\times [q]}\). We define for a graph \(G=(V,E)\),

$$\begin{aligned} Z_{G}(A,\xi )=\sum _{\phi :V\rightarrow [q]}\prod _{v\in V}\xi _{v,\phi (v)}\cdot \prod _{uv\in E}A_{\phi (u),\phi (v)}. \end{aligned}$$
(18)

Moreover, for a boundary condition \(\sigma :\Lambda \rightarrow [q]\) on some set \(\Lambda \subset V\) we denote by \(Z^\sigma _G(A,\xi )\) the partition function defined as above where we only sum over those \(\phi \) that restricted to \(\Lambda \) coincide with \(\sigma \). The following lemma explains the usefulness of introducing the external field parameters.

Lemma 7

Let A be a symmetric \(q\times q\) matrix. Let \(G=(V,E)\) be a graph, let \(v\in V\) and \(i\in [q]\) and let \(\Lambda \subset V\setminus \{v\}\) be equipped with a boundary condition \(\sigma : \Lambda \rightarrow [q]\). Then

$$\begin{aligned} P^\sigma _{G,v,i;A}(z)=\frac{\partial }{\partial \xi _{v,i}}\log (Z^{\sigma }_{G}(J+z(A-J),\xi ))|_{\xi =1}. \end{aligned}$$

Proof

This follows directly from the fact that

$$\begin{aligned} \frac{\partial }{\partial \xi _{v,i}}Z^{\sigma }_{G}(J+z(A-J),\xi )|_{\xi =1}=Z^{\sigma _{v,i}}_{G}(J+z(A-J)) \end{aligned}$$

and the standard rules for the derivative of the logarithm. \(\square \)
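The statement of Lemma 7 can be checked by brute force on small examples. The Python sketch below is our own illustration (the function names Z, interpolate and ratio_two_ways are hypothetical). It evaluates \(Z^\sigma _G(B,\xi )\) from definition (18) and compares the logarithmic derivative with the ratio (17). Because \(Z^\sigma _G\) is affine in each single variable \(\xi _{v,i}\), the partial derivative is computed exactly as a difference of two evaluations.

```python
from fractions import Fraction
from itertools import product


def Z(vertices, edges, B, xi, sigma):
    """Z^sigma_G(B, xi): sum over colourings phi that agree with the dict sigma.

    B is a q x q matrix (list of lists); xi[(u, c)] is the external field at (u, c).
    """
    q = len(B)
    total = 0
    for colours in product(range(q), repeat=len(vertices)):
        phi = dict(zip(vertices, colours))
        if any(phi[u] != c for u, c in sigma.items()):
            continue
        term = 1
        for u in vertices:
            term *= xi[(u, phi[u])]
        for u, w in edges:
            term *= B[phi[u]][phi[w]]
        total += term
    return total


def interpolate(A, z):
    """The matrix J + z(A - J)."""
    return [[1 + z * (a - 1) for a in row] for row in A]


def ratio_two_ways(vertices, edges, A, z, sigma, v, i):
    """Return (d/d xi_{v,i} of log Z^sigma at xi = 1, Z^{sigma_{v,i}} / Z^sigma)."""
    q = len(A)
    B = interpolate(A, z)
    ones = {(u, c): Fraction(1) for u in vertices for c in range(q)}
    bumped = dict(ones)
    bumped[(v, i)] += 1
    base = Z(vertices, edges, B, ones, sigma)
    # Exact derivative: Z^sigma is affine in the single variable xi_{v,i}.
    derivative = Z(vertices, edges, B, bumped, sigma) - base
    direct = Z(vertices, edges, B, ones, {**sigma, v: i})
    return derivative / base, direct / base
```

On any small example with \(v\notin \Lambda \) and \(Z^\sigma _G\ne 0\) the two returned values agree, and summing the ratio over \(i\in [q]\) gives 1.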

Next we wish to use the cluster expansion to find a series expansion for \(\log (Z^\sigma _{G}(J+z(A-J),\xi ))\) with \(\sigma \) a boundary condition on some set \(\Lambda \subseteq V\setminus \{v\}\). To do this we will have to realize the graph homomorphism partition function as the multivariate independence polynomial of an auxiliary graph \(\Gamma \). This will be done in a similar way as in [10, 38].

The vertex set of the auxiliary graph \(\Gamma \) will consist of the connected subgraphs of G with at least one edge. Two vertices \(H_1=(S_1,E_1)\) and \(H_2=(S_2,E_2)\) of \(\Gamma \) are connected by an edge if and only if \(S_1\) and \(S_2\) intersect. Next we define the vertex weights. For a connected subgraph \(H=(S,F)\) of G we define the weight, \(w^\sigma (H)\), of H by

$$\begin{aligned} w^\sigma (H):=\frac{z^{|F|}Z^{\sigma }_H(A-J,\xi )}{\left( \prod _{v\in S\setminus \Lambda }\sum _{i=1}^q\xi _{v,i}\right) \cdot \prod _{v\in \Lambda \cap S}\xi _{v,\sigma (v)}}, \end{aligned}$$
(19)

where we understand \(\sigma \) to be restricted to \(\Lambda \cap S\). We also define

$$\begin{aligned} p^\sigma (\xi ):=\left( \prod _{v\in V\setminus \Lambda }\sum _{i=1}^q\xi _{v,i}\right) \cdot \prod _{v\in \Lambda }\xi _{v,\sigma (v)}. \end{aligned}$$

Lemma 8

With definitions as above we have

$$\begin{aligned} p^\sigma (\xi )Z_{\Gamma }(w^\sigma )=Z^\sigma _{G}(J+z(A-J),\xi ). \end{aligned}$$

Proof

This follows from expanding the product over E in the definition of \(Z^\sigma _{G}(J+z(A-J),\xi )\). We have that \((p^\sigma (\xi ))^{-1}Z^\sigma _{G}(J+z(A-J),\xi )\) is equal to

$$\begin{aligned}&(p^\sigma (\xi ))^{-1}\sum _{\begin{array}{c} \phi :V\rightarrow [q]\\ \phi |_{\Lambda }=\sigma \end{array}}\prod _{v\in V}\xi _{v,\phi (v)}\cdot \prod _{uv\in E}( J+z(A-J))_{\phi (u),\phi (v)}\\&\quad =(p^\sigma (\xi ))^{-1} \sum _{\begin{array}{c} \phi :V\rightarrow [q]\\ \phi |_{\Lambda }=\sigma \end{array}}\prod _{v\in V}\xi _{v,\phi (v)}\cdot \sum _{F\subseteq E}z^{|F|}\prod _{uv\in F}(A-J)_{\phi (u),\phi (v)}\\&\quad =\sum _{F\subseteq E} z^{|F|} \left( \prod _{v\in V(F)\setminus \Lambda }\sum _{i=1}^q\xi _{v,i}\right) ^{-1}\cdot \left( \prod _{v\in \Lambda \cap V(F)}\xi _{v,\sigma (v)}\right) ^{-1} \\&\qquad \times \sum _{\begin{array}{c} \phi :V(F)\rightarrow [q]\\ \phi |_{\Lambda \cap V(F)}=\sigma |_{\Lambda \cap V(F)} \end{array}}\prod _{v\in V(F)}\xi _{v,\phi (v)}\cdot \prod _{uv\in F}(A-J)_{\phi (u),\phi (v)}. \end{aligned}$$

Now for \(F\subseteq E\) fixed, the contribution to the sum is multiplicative over the connected components of F and for such a component H this contribution is exactly given by \(w^\sigma (H)\). This implies the statement of the lemma as the independent sets in \(\Gamma \) are exactly the collections of pairwise vertex disjoint connected subgraphs with at least one edge of G. \(\square \)
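The identity of Lemma 8 can also be verified numerically on small graphs. The Python sketch below is our own illustration and makes two simplifying assumptions: all external fields are set to 1, and instead of enumerating independent sets of \(\Gamma \) directly it sums over edge subsets \(F\subseteq E\), which by the last sentence of the proof correspond bijectively to independent sets of \(\Gamma \) via their connected components. The names Z, components and polymer_side are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def Z(vertices, edges, B, sigma):
    """Z^sigma_H(B) with all external fields equal to 1 (sigma is restricted to `vertices`)."""
    q = len(B)
    free = [u for u in vertices if u not in sigma]
    total = 0
    for colours in product(range(q), repeat=len(free)):
        phi = {u: sigma[u] for u in vertices if u in sigma}
        phi.update(zip(free, colours))
        term = 1
        for u, w in edges:
            term *= B[phi[u]][phi[w]]
        total += term
    return total


def components(F):
    """Connected components, as (vertex set, edge list) pairs, of the graph spanned by F."""
    comps = []
    for u, w in F:
        touching = [c for c in comps if u in c[0] or w in c[0]]
        merged_vertices = {u, w}.union(*(c[0] for c in touching))
        merged_edges = [(u, w)] + [e for c in touching for e in c[1]]
        comps = [c for c in comps if c not in touching] + [(merged_vertices, merged_edges)]
    return comps


def polymer_side(vertices, edges, A, z, sigma):
    """p^sigma(1) * Z_Gamma(w^sigma), computed as a sum over edge subsets F."""
    q = len(A)
    A_minus_J = [[a - 1 for a in row] for row in A]
    total = 0
    for size in range(len(edges) + 1):
        for F in combinations(edges, size):
            weight = Fraction(1)
            for S, FH in components(F):
                n_free = sum(1 for u in S if u not in sigma)
                # w^sigma(H) at xi = 1, following (19)
                weight *= z ** len(FH) * Z(list(S), FH, A_minus_J, sigma) / Fraction(q) ** n_free
            total += weight
    p = Fraction(q) ** sum(1 for u in vertices if u not in sigma)
    return p * total
```

Comparing polymer_side(V, E, A, z, sigma) with Z(V, E, B, sigma), where B is the matrix \(J+z(A-J)\), should give equality for any small graph, matrix A with rational entries and rational z.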

For a graph G, two positive integers \(\ell ,k\) and a vertex \(v\in V(G)\) we define \({\mathcal {C}}_{v;\ell ,k}(G)\) to be the collection consisting of sequences \((H_1,\ldots ,H_k)\) of connected subgraphs of G with at least two vertices satisfying

  1. (i)

    \(\sum _{j=1}^k |E(H_j)|=\ell \),

  2. (ii)

    \(v\in \bigcup _{j=1}^k V(H_j)\),

  3. (iii)

    the graph \(\Gamma (H_1,\ldots ,H_k)\) is connected.

Let us denote the scaled weights \({\widehat{w}}^\sigma (H)=w^\sigma (H)z^{-|E(H)|}\) for any connected subgraph H of G. By applying the cluster expansion to \(Z_\Gamma (w^\sigma )\) we obtain the following series expansion for the ratio:

Lemma 9

As a series in z we have that \(P^\sigma _{G,v,i;A}(z)\) near \(z=0\) is equal to

$$\begin{aligned} 1/q+\sum _{\ell \ge 1}z^\ell \sum _{k\ge 1}\frac{1}{k!}\sum _{(H_1,\ldots ,H_k)\in {\mathcal {C}}_{v;\ell ,k}(G)}\phi (\Gamma (H_1,\ldots ,H_k)) \frac{\partial }{\partial \xi _{v,i}}\prod _{j=1}^k {\widehat{w}}^\sigma (H_j)|_{\xi =1}. \end{aligned}$$
(20)

In particular, the \(\ell \)-th term of the series only depends on the distance at most \(\ell \) neighbourhood of the vertex v in G (and the boundary condition \(\sigma \) restricted to this neighbourhood).

Proof

By Lemma 7 and the previous lemma it suffices to compute the partial derivative with respect to \(\xi _{v,i}\) of \(\log (p^\sigma (\xi ))\) and \(\log (Z_\Gamma (w^\sigma ))\), evaluate the result at \(\xi =1\) and add these.

It is not difficult to see that

$$\begin{aligned} \frac{\partial }{\partial \xi _{v,i}}(\log (p^\sigma (\xi )))|_{\xi =1}=1/q. \end{aligned}$$

For the other derivative we first use (7) to obtain that as a series in z, near \(z=0\), whenever the \(\xi _{u,j}\) are sufficiently close to 1, we have

$$\begin{aligned} \log (Z_\Gamma (w^\sigma ))&=\sum _{k\ge 1}\frac{1}{k!}\sum _{H_1,\ldots ,H_k\in V(\Gamma )} \phi (\Gamma (H_1,\ldots ,H_k)) \prod _{i=1}^k w^\sigma (H_i) \\&=\sum _{\ell \ge 1}z^\ell \sum _{k\ge 1}\frac{1}{k!}\sum _{\begin{array}{c} H_1,\ldots ,H_k\in V(\Gamma )\\ \sum _{i=1}^k |E(H_i)|=\ell \end{array}} \phi (\Gamma (H_1,\ldots ,H_k)) \prod _{i=1}^k {\widehat{w}}^\sigma (H_i). \end{aligned}$$

Next observe that for a connected subgraph H of G we have \(\frac{\partial }{\partial \xi _{v,i}}{\widehat{w}}^\sigma (H)=0\) if \(v\notin V(H)\). Therefore \(\frac{\partial }{\partial \xi _{v,i}}\log (Z_\Gamma (w^\sigma ))|_{\xi =1}\) has the following series expansion in z near \(z=0\):

$$\begin{aligned} \sum _{\ell \ge 1}z^\ell \sum _{k\ge 1}\frac{1}{k!}\sum _{(H_1,\ldots ,H_k)\in {\mathcal {C}}_{v;\ell ,k}(G)}\phi (\Gamma (H_1,\ldots ,H_k)) \frac{\partial }{\partial \xi _{v,i}}\prod _{j=1}^k {\widehat{w}}^\sigma (H_j)|_{\xi =1}. \end{aligned}$$

This finishes the proof. \(\square \)

4.2 Bounded ratios imply strong spatial mixing

Lemma 10

Let \(q\ge 2\) be an integer and let A be a nonnegative and nonzero symmetric \(q\times q\) matrix. Let \({\mathcal {G}}\) be a family of graphs. Suppose there exist constants \(r>1\) and \(M>0\) such that for all \(z\in {\mathbb {D}}_{r}\), the ratios, \(P^\sigma _{G,v,i;A}\), satisfy \(|P^\sigma _{G,v,i;A}(z)|\le M\) for all \(G\in {\mathcal {G}}\), all vertices \(v\in V(G)\) and all boundary conditions \(\sigma :\Lambda \rightarrow [q]\) for \(\Lambda \subset V\setminus \{v\}.\) Then the measure \(\mu _A\) satisfies strong spatial mixing on \({\mathcal {G}}\) with exponential rate r.

Proof

Let \(G=(V,E)\in {\mathcal {G}}\), let \(v\in V\) and let \(\sigma \) and \(\tau \) be two boundary conditions on some set \(\Lambda \subset V\setminus \{v\}\). Since by Lemma 9 for \(\ell =0,\ldots ,d_G(v,\sigma \ne \tau )-1\) the coefficients of the series expansion around \(z=0\) for \(P^\sigma _{G,v,i;A}(z)\) and \(P^\tau _{G,v,i;A}(z)\) are the same, the result follows from applying Lemma 1 to \(P(z):=P^\sigma _{G,v,i;A}(z)-P^\tau _{G,v,i;A}(z)\). \(\square \)

4.2.1 Algorithms

Just as for the hard-core model, implicit in our proof of the above lemma there is an efficient algorithm for (approximately) computing the probabilities \(\Pr _\mu [{\psi }(v)=i\mid \sigma ]\) on graphs of maximum degree at most \(\Delta \). Again the basic idea is just to truncate the series for \(P^\sigma _{G,v,i;A}\) at a convenient depth, say K, so as to obtain the desired approximation. This can be done efficiently (i.e. in time \(\Delta ^{O(K)}\)) with an algorithm appearing in the proof of Theorem 6 from [21]. We leave the details to the reader.
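To make the choice of depth concrete, here is one way to pick K under the hypothesis of Lemma 10, offered as a sketch rather than as the method of [21]. If \(|P(z)|\le M\) on \({\mathbb {D}}_r\) with \(r>1\), then Cauchy's estimates bound the k-th coefficient of P by \(M/r^{k}\), so discarding all terms of degree larger than K changes the value at \(z=1\) by at most \(M/((r-1)r^{K})\). The helper name truncation_depth is ours.

```python
import math


def truncation_depth(M, r, eps):
    """A depth K >= 0 with M / ((r - 1) * r**K) <= eps.

    Assumes |P| <= M on the disk of radius r > 1, so that the tail of the
    series at z = 1 beyond degree K is at most M / ((r - 1) * r**K).
    """
    if not (M > 0 and r > 1 and eps > 0):
        raise ValueError("need M > 0, r > 1 and eps > 0")
    K = max(0, math.ceil(math.log(M / ((r - 1) * eps)) / math.log(r)))
    while M / ((r - 1) * r ** K) > eps:  # guard against floating-point rounding
        K += 1
    return K
```

In particular the depth grows like \(\log (1/\varepsilon )/\log r\), which is consistent with the choice \(K=O(\log (n/\varepsilon ))\) made for the hard-core model.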

4.3 Bounded ratios from absence of zeros

For a graph \(G=(V,E)\) with a given orientation of the edges and \(q\times q\) matrices \(A^e\) for each edge \(e=(u,v)\), we define

$$\begin{aligned} Z_{G}((A^e)_{e\in E})=\sum _{\psi :V\rightarrow [q]}\prod _{(u,v)\in E}A^{(u,v)}_{\psi (u),\psi (v)}. \end{aligned}$$

As before we have a similar definition for \(Z^\sigma _{G}((A^e)_{e\in E})\) for a boundary condition \(\sigma \) on some \(\Lambda \subseteq V\).

We will need the following result due to Barvinok [3]:

Theorem 5

(Theorem 7.1.4 of [3]) Let \(\Delta \ge 3\) and \(q\ge 2\) be integers and let for some \(\alpha =\alpha _\Delta <2\pi /3\Delta \),

$$\begin{aligned} \delta _\Delta =\sin (\alpha /2)\cos (\alpha \Delta /2). \end{aligned}$$

Let A be any \(q\times q\) matrix satisfying \(|A_{i,j}-1|\le \delta _\Delta \) for all \(i,j=1,\ldots ,q\). Then for any orientation of any graph \(G=(V,E)\in {\mathcal {G}}_\Delta \) and any boundary condition \(\sigma \) on any \(\Lambda \subseteq V\), \(Z^\sigma _G(A)\ne 0\).

Remark 3

In fact this theorem is not stated as Theorem 7.1.4 in [3]. However, in his proof of Theorem 7.1.4 in [3], Barvinok shows that the statement of the theorem is true for symmetric matrices and ordinary graphs satisfying the condition. The extension to not necessarily symmetric matrices follows along exactly the same lines. See [13] for a proof of an analogous statement derived from [5]. We therefore omit a proof.

Remark 4

Barvinok has proven a stronger zero-freeness result for matrices whose entries are close to the real axis [3, Theorem 7.2.2]. However, it only applies to boundary conditions defined on connected sets and this assumption is crucial in the proof. It would be interesting to see if the connectedness assumption can be removed somehow.

The following result can be derived from the theorem above in combination with an idea from [13].

Lemma 11

Let \(\Delta \ge 3\) and \(q\ge 2\) be integers and let for some \(\alpha =\alpha _\Delta <2\pi /3\Delta \),

$$\begin{aligned} \delta _\Delta :=\sin (\alpha /2)\cos (\alpha \Delta /2). \end{aligned}$$

Choose \(\eta >0\), \(\varepsilon >0\) and let \(A\in {\mathbb {R}}^{q\times q}\) be a symmetric matrix such that \(|A_{i,j}-1| \le \frac{\delta _\Delta }{ (1+\varepsilon )(1+\eta )}\) for all \(i,j=1,\ldots ,q\). Then for any connected graph \(G=(V,E)\in {\mathcal {G}}_\Delta \), any vertex \(v\in V\) and any \(i\in [q]\) and any boundary condition \(\sigma \) on \(\Lambda \subset V\setminus \{v\}\)

$$\begin{aligned} |P^\sigma _{G,v,i;A}(z)|\le 1/\varepsilon \end{aligned}$$

for all \(z\in {\mathbb {D}}_{1+\eta }\).

Proof

We argue by contradiction. Suppose that for some \(z\in {\mathbb {D}}_{1+\eta }\), \(P:=P^\sigma _{G,v,i;A}(z)\) satisfies \(|P|>1/\varepsilon \).

Recall the definition of the partition function with external field parameters \(\xi \in {\mathbb {C}}^{V\times [q]}\) (18). Orient the edges of G. For \(\xi _{u,j}\in B(1,\varepsilon )\) for each \(u\in V\) and \(j\in [q]\), to be determined later, define for each edge \(e=(u,w)\) a matrix \(B^e\) by

$$\begin{aligned} B^e_{i,j}=\left( 1+z(A_{i,j}-1)\right) \cdot \xi _{u,i}^{1/\deg (u)}\xi _{w,j}^{1/\deg (w)} \end{aligned}$$

for \(i,j=1,\ldots ,q\). Now we set \(\xi _{u,j}=1\) unless \(u=v\) and \(j=i\) in which case we set \(\xi _{v,i}=1-1/P\in B(1,\varepsilon )\). By construction, the matrices \(B^e\) satisfy the condition of Theorem 5 and hence

$$\begin{aligned} Z^\sigma _{G}((B^e)_{e\in E})\ne 0. \end{aligned}$$

However, expanding the sum over all possible colors of the vertex v, we get

$$\begin{aligned} Z^\sigma _{G}((B^e)_{e\in E})=\sum _{j=1}^qZ^{\sigma _{v,j}}_G(J+z(A-J))-\frac{1}{P} Z^{\sigma _{v,i}}_G(J+z(A-J))=0, \end{aligned}$$

by definition of the ratio P. This is clearly a contradiction and finishes the proof. \(\square \)
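For the reader's convenience we sketch the computation behind the last display. We read the definition of \(B^e\) as the entry \((J+z(A-J))_{i,j}\) multiplied by the two field factors. Each vertex u then occurs in exactly \(\deg (u)\) edges, so the factors \(\xi _{u,\psi (u)}^{1/\deg (u)}\) multiply to \(\xi _{u,\psi (u)}\). With \(\xi _{v,i}=1-1/P\) and all other \(\xi _{u,j}=1\) this gives

$$\begin{aligned} Z^\sigma _{G}((B^e)_{e\in E})&=\sum _{\begin{array}{c} \psi :V\rightarrow [q]\\ \psi |_{\Lambda }=\sigma \end{array}}\prod _{u\in V}\xi _{u,\psi (u)}\cdot \prod _{(u,w)\in E}(J+z(A-J))_{\psi (u),\psi (w)}\\&=\sum _{j\ne i}Z^{\sigma _{v,j}}_G(J+z(A-J))+\xi _{v,i}Z^{\sigma _{v,i}}_G(J+z(A-J))\\&=Z^{\sigma }_G(J+z(A-J))-\frac{1}{P}Z^{\sigma _{v,i}}_G(J+z(A-J))=0, \end{aligned}$$

where the final equality is the definition (17) of \(P=P^\sigma _{G,v,i;A}(z)\).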

The proof of Theorem 3 now follows quickly.

Proof of Theorem 3

Using Lemma 11 combined with Lemma 10 the desired result is immediate. \(\square \)

Remark 5

Using Theorem 4 (Montel’s theorem) it is possible to prove a version of Lemma 11 only requiring univariate zero-freeness as opposed to the possibly stronger notion of multivariate zero-freeness. A sufficient condition would for example be that the numerator and denominator in the definition of the ratio are nonzero as well as that their difference is nonzero, so that the ratio avoids the points 0, 1 and \(\infty \). (Various variations are possible since Theorem 4 (Montel’s theorem) is quite flexible to use.) Conceivably this could lead to better bounds for specific matrices A, but we are not aware of any concrete examples.

5 Concluding remarks

As mentioned in the introduction our approach is quite robust and is applicable to many other models as well. The two examples that were covered in the previous sections essentially suggest a recipe for proving strong spatial mixing from absence of complex zeros. Roughly the steps are as follows.

  1. Express the conditional probability as a rational function and bound this function using knowledge about absence of complex zeros of the partition function (with boundary conditions) either using Theorem 4 (Montel’s theorem) or a variant of Lemma 11.

  2. Express the partition function of the model as the multivariate independence polynomial of an auxiliary graph with suitable weights.

  3. Use the cluster expansion to obtain a combinatorial interpretation of the coefficients of the series expansion of the rational function and show that the kth coefficient depends only on the depth O(k) neighbourhood of the root vertex.

It would be very interesting to know if strong spatial mixing with exponential rate implies absence of zeros in some qualitative sense. We expect some version of this implication to be true, but for now we refrain from making any bold conjectures. Instead, we state a concrete question for the independence polynomial, but the question is equally interesting for other models as well.

Question 1

Let \({\mathcal {G}}\) be an infinite family of bounded degree graphs. Suppose there exist constants \(r>1\) and \(\lambda ^\star >0\) such that the hard-core measure at any \(\lambda \in (0,\lambda ^\star ]\) satisfies strong spatial mixing with exponential rate r on \({\mathcal {G}}\). Does there exist an open set \(U\subset {\mathbb {C}}\) containing \([0,\lambda ^\star ]\) such that for all \(G\in {\mathcal {G}}\) and \(\lambda \in U\), \(Z_G(\lambda )\ne 0\)?

Another interesting question can be found when looking at colorings of trees:

Question 2

Consider the \(q\times q\) matrix \(J-I\), where I denotes the identity matrix. The partition function \(Z_G(J-I)\) is equal to the number of proper q-colorings of the graph G. It was recently shown that \(\mu _{(J-I)}\) satisfies strong spatial mixing on the collection of all trees of maximum degree at most \(\Delta \) provided \(q\ge 1.59 \Delta \) [18]. It is however only known that there exists some \(\varepsilon >0\) such that \(Z_T(J-I+zI)\ne 0\) for all z in an \(\varepsilon \)-neighbourhood of the unit interval and all trees of maximum degree at most \(\Delta \) with boundary conditions for \(q\ge 2\Delta \) [28]. Can the constant 2 be replaced by 1.59, so as to match the strong spatial mixing result?