1 Introduction

An invariant is a logical assertion at a program location that holds whenever program execution reaches that location. Invariants are indispensable to program analysis and formal verification, and their generation has thus been key to proving and analyzing crucial properties such as reachability [3, 6, 15], time complexity [9] and safety [2, 32]. To ease program analysis and formal verification, there has been a long thread of research on automatic invariant generation, including constraint solving [10, 12, 27], recurrence analysis [17, 24, 29, 31], abstract interpretation [13, 14], logical inference [18, 19, 38], dynamic analysis [33, 39], and machine learning [20, 23, 44]. To guarantee that an assertion is indeed an invariant, the widely adopted paradigm is to generate an inductive invariant, i.e., one that holds when the program location is first reached and is preserved each time execution returns to it [12, 32]. In this work, we consider an important subclass of invariants called numerical invariants, which are assertions over the numerical values taken by the program variables and are closely related to many common vulnerabilities such as integer overflow, buffer overflow, division by zero and out-of-bound array access. More specifically, we consider affine inductive invariants in the form of an affine inequality over program variables, and focus on affine while loops that have affine loop guards (a conjunction of affine inequalities) and affine updates for the program variables but no nested loops.

To automate the generation of affine inductive invariants, we adopt the constraint-solving based approach with three steps. First, it establishes a template with unknown parameters for the target invariants. Second, it collects constraints derived from the inductive conditions. Finally, it solves the unknown parameters to get the desired invariants. Prior work in this space [12, 37] leverages Farkas’ lemma to provide a sound and complete characterization for the inductive conditions and then generates the affine inductive invariants either by the complete approach of quantifier elimination [12] or through several heuristics [37]. Specifically, the StInG invariant generator [40] implements the approach in [37], and the InvGen invariant generator [22] integrates abstract interpretation as well as the approach in [37]. Furthermore, a recent effort [34] leverages eigenvalues and eigenvectors for inferring a restricted class of invariants. Finally, some recent work considers decidable logic fragments that directly verify properties of loops [4, 11, 28, 30]. Compared with other approaches such as machine learning and dynamic analysis, constraint solving has a theoretical guarantee on the correctness and accuracy of the generated invariants, yet typically at the cost of higher runtime complexity.

The novelty of our approach lies in solving the constraints derived from Farkas’ lemma completely by matrix methods, thus ensuring both generality and efficiency. In detail, this paper makes the following contributions (due to the page limit, the current paper is abridged; the full version is available at [25]):

  • For affine while loops with tautological guard, we prove that the affine inductive invariants are determined by the eigenvalues and eigenvectors of the matrices that describe variable updates in the loop body.

  • For affine while loops whose loop guard is a conjunction of affine inequalities, we solve the affine inductive invariants by first deriving through matrix inverse a formula with a key parameter in the application of Farkas’ lemma, then solving the feasible domain of the key parameter from the inductive conditions, and finally showing that it suffices to choose a finite number of values for the key parameter if one imposes a tightness condition on the invariants.

  • We generalize our results to affine while loops with non-deterministic updates and to bidirectional affine invariants. A continuity property on the invariants w.r.t. the key parameter is also proved for tackling the numerical issue arising from the computation of eigenvectors. Experimental results on existing benchmarks and new benchmarks arising from linear dynamical systems demonstrate the generality and efficiency of our approach.

1.1 Related Work

Constraint Solving. There have been several prior approaches [12, 37] using constraint solving for invariant generation based on Farkas’ lemma. Compared to the approach in [12] that uses quantifier elimination to solve the constraints from Farkas’ lemma, our approach is more efficient since it only involves matrix computations. Compared with [37] that uses several heuristics, our approach is more general and complete in addressing all the cases in affine invariant generation. While the approach in [34] also uses eigenvectors, it is restricted to the subclass of equality and convergent invariants; in contrast, our approach targets general affine inductive invariants over affine while loops. Other prior work [4, 11, 28, 30] develops decidable logics for unnested affine while loops with tautological guard and no conditional branches. Compared with them, our approach handles general affine while loops and targets invariant generation.

Abstract Interpretation. A long thread of research infers inductive invariants via the abstract interpretation framework [1, 7, 22, 35], which constructs sound approximations of program semantics. In a nutshell, it first establishes an abstract domain for the specific form of properties to be generated, and then performs fixed-point computation in the abstract domain. The precision of the invariants generated by abstract interpretation depends on the abstract domain and abstract operators, except for rare special cases [21, 37].

Recurrence Analysis. Another closely related technique is recurrence analysis [8, 17, 24, 29, 31]. The main idea is to transform the problem of invariant generation into a recurrence-relation problem and then solve the latter. The main limitation of recurrence analysis is that it requires the underlying recurrence relation to have a closed-form solution. This requirement, unfortunately, does not hold for the general case of affine inductive invariants over affine while loops.

Logical Inference. Invariants can also be obtained through logical inference, such as abductive inference [16], Craig interpolation [18], ICE learning [19, 43], random search [38], etc. These approaches, however, provide no theoretical guarantee on the accuracy of the generated numerical invariants. In contrast, our approach directly addresses this issue.

Dynamic Analysis. Dynamic analysis [33, 39] has also been applied to invariant generation. It first collects execution traces of a particular program by running it multiple times, and then guesses candidate invariants from these traces. As this process indicates, dynamic analysis provides no guarantee on the correctness or accuracy of the inferred invariants, yet still pays the cost of running the program a large number of times.

Machine Learning. There is a recent trend of applying machine learning [20, 23, 44] to the invariant-generation problem. Such approaches first establish a (typically large) training set of data, then apply learning techniques such as neural networks to generate invariants. Compared to our approach, these approaches require a large training set while still offering no theoretical guarantee on correctness or accuracy. In particular, such approaches cannot produce the specific numerical values (e.g., eigenvalues) that are required to handle some examples in this work.

2 Preliminaries

In this section, we specify the class of affine while loops and define the affine-invariant-generation problem over such loops. Throughout the paper, we use \(V=\{x_{1},...,x_{n}\}\) to denote the set of program variables in an affine while loop; we abuse the notation V so that it also represents the current values (before the execution of the loop body) of the original variables in V, and use the primed variables \(V^{\prime }:=\{x^{\prime }\mid x\in V\}\) for the next values (after the execution of the loop body). Furthermore, we denote by \(\mathbf {x}=[x_{1},...,x_{n}]^{\mathrm {T}}\) the vector variable that represents the current values of the program variables, and by \(\mathbf {x}^{\prime }=[x^{\prime }_{1},...,x^{\prime }_{n}]^{\mathrm {T}}\) the vector variable for the next values.

An affine while loop is a while loop without nested loops that has affine updates in each assignment statement and possibly multiple conditional branches in the loop body. To formally specify its syntax, we first define affine inequalities and assertions, program states, and the satisfaction relation between them as follows.

Affine Inequalities and Assertions. An affine inequality \(\phi \) is an inequality of the form \(\mathbf {c}^{\mathrm {T}}\cdot \mathbf {y} + d\le 0\) where \(\mathbf {c}\) is a real vector, \(\mathbf {y}\) is a vector of real-valued variables and d is a real scalar. An affine assertion is a finite conjunction of affine inequalities. An affine assertion is satisfiable if it is true under some assignment of real values to its variables. Given an affine assertion \(\psi \) over vector variable \(\mathbf {x}\), we denote by \(\psi ^{\prime }\) the affine assertion obtained by substituting \(\mathbf {x}\) in \(\psi \) with its next-value variable \(\mathbf {x}^{\prime }\).

Program States. A program state \(\mathbf {v}\) is a real vector \(\mathbf {v}=[v_{1},...,v_{n}]^{\mathrm {T}}\) such that each \(v_i\) is a concrete value for the variable \(x_i\) (in the vector variable \(\mathbf {x}\)). We say that a program state \(\mathbf {v}\) satisfies an affine inequality \(\phi =\mathbf {c}^\mathrm {T}\cdot \mathbf {x}+d\le 0\), written as \(\mathbf {v}\models \phi \), if it holds that \(\mathbf {c}^\mathrm {T}\cdot \mathbf {v}+d\le 0\). Likewise, \(\mathbf {v}\) satisfies an affine assertion \(\psi \) if it satisfies every affine inequality in the conjunction \(\psi \). Furthermore, given an affine assertion \(\psi \) over both \(\mathbf {x}\) and \(\mathbf {x}'\), we say that two program states \(\mathbf {v},\mathbf {v}^{\prime }\) satisfy \(\psi \), written as \(\mathbf {v},\mathbf {v}'\models \psi \), if \(\psi \) is true when one substitutes \(\mathbf {x}\) by \(\mathbf {v}\) and \(\mathbf {x}^{\prime }\) by \(\mathbf {v}^{\prime }\).

We now present the syntax of (unnested) affine while loops.

Affine While Loops. We consider affine while loops that take the form:

$$ \begin{array}{l} \theta :\ \mathbf {R}\cdot \mathbf {x}+\mathbf {f}\le \mathbf {0}\\ \textbf{while}\ G:\mathbf {P}\cdot \mathbf {x}+\mathbf {q}\le \mathbf {0}\ \textbf{do}\\ \quad \textbf{if}\ \psi _{1}\ \textbf{then}\ \tau _{1}:\mathbf {T}_{1}\cdot \mathbf {x}-\mathbf {T}_{1}^{\prime }\cdot \mathbf {x}^{\prime }+\mathbf {b}_{1}\le \mathbf {0}\\ \quad \ \vdots \\ \quad \textbf{if}\ \psi _{k}\ \textbf{then}\ \tau _{k}:\mathbf {T}_{k}\cdot \mathbf {x}-\mathbf {T}_{k}^{\prime }\cdot \mathbf {x}^{\prime }+\mathbf {b}_{k}\le \mathbf {0}\\ \textbf{od} \end{array} \qquad (\dag ) $$

where (i) \(\theta \) is an affine assertion that specifies the initial condition for inputs and is given by the real matrix \(\mathbf {R}\) and vector \(\mathbf {f}\), (ii) G is an affine assertion serving as the loop guard given by the real matrix \(\mathbf {P}\) and vector \(\mathbf {q}\), and (iii) each \(\psi _{j}\) is an affine assertion that represents a conditional branch, with the relationship between the current-state vector \(\mathbf {x}\) and the next-state vector \(\mathbf {x}^{\prime }\) given by the affine assertion \(\tau _j:=\mathbf {T}_{j}\cdot {\textbf {x}}-\mathbf {T}_{j}^{\prime }\cdot \mathbf {x}^{\prime }+\mathbf {b}_{j}\le \mathbf {0}\) with transition matrices \(\mathbf {T}_{j},\mathbf {T}_{j}^{\prime }\) and vector \(\mathbf {b}_{j}\). In this work, we always assume that the rows of \(\mathbf {R}\) are linearly independent (this condition means that every variable \(x_{i}\) has one independent initial condition attached to it, which holds in most situations such as a fixed initial program state), such that \(\mathbf {R}^{\mathrm {T}}\) is left invertible; we denote its left inverse as \((\mathbf {R}^{\mathrm {T}})^{-1}_\mathrm {L}\).

The execution of an affine while loop is as follows. First, the loop starts with an arbitrary initial program state \(\mathbf {v}^*\) that satisfies the initial condition \(\theta \). Then in each loop iteration, the current program state \(\mathbf {v}\) is checked against the loop guard G. In the case that \(\mathbf {v}\models G\), the loop arbitrarily chooses a conditional branch \(\psi _j\) satisfying \(\mathbf {v}\models \psi _j\), and sets the next program state \(\mathbf {v}'\) non-deterministically such that \(\mathbf {v},\mathbf {v}'\models \tau _j\); the next program state \(\mathbf {v}^{\prime }\) is then set as the current program state. Otherwise (i.e., \(\mathbf {v}\not \models G\)), the loop halts immediately.
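This execution model can be sketched in Python for the deterministic, single-branch special case (the function and the toy loop below are our own illustration, not part of the paper's toolchain):

```python
import numpy as np

def run_affine_loop(v, P, q, T, b, max_iters=1000):
    """Run a single-branch affine while loop with deterministic update
    x' = T x + b, starting from state v, while the guard P x + q <= 0 holds."""
    trace = [v.copy()]
    for _ in range(max_iters):
        if not np.all(P @ v + q <= 0):   # v does not satisfy G: halt
            break
        v = T @ v + b                    # deterministic transition
        trace.append(v.copy())
    return trace

# Toy loop: while x1 - 10 <= 0 do (x1, x2) := (x1 + x2, x2) od
T = np.array([[1.0, 1.0], [0.0, 1.0]])
b = np.zeros(2)
P = np.array([[1.0, 0.0]])
q = np.array([-10.0])
trace = run_affine_loop(np.array([0.0, 1.0]), P, q, T, b)
# the loop halts once x1 = 11 > 10
```

A non-deterministic branch choice would replace the single update by a selection among the \(\tau_j\)'s whose \(\psi_j\) the current state satisfies.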

Now we define affine inductive invariants over affine while loops. Informally, an affine inductive invariant is an affine inequality satisfying the initiation and consecution conditions which mean that the inequality holds at the start of the loop (initiation) and is preserved under every iteration of the loop body (consecution).

Affine Inductive Invariants. An affine inductive invariant for an affine while loop (\(\dag \)) is an affine inequality \(\varPhi \) that satisfies the initiation and consecution conditions as follows:

  • (Initiation) \({\theta }\) implies \(\varPhi \), i.e., \(\mathbf {v}\models {\theta }\) implies \(\mathbf {v}\models \varPhi \) for all program states \(\mathbf {v}\);

  • (Consecution) for all program states \(\mathbf {v},\mathbf {v}^{\prime }\) and every \(\psi _j,\tau _j\) (\(1\le j\le k\)) in (\(\dag \)), we have that \((\mathbf {v}\models G\wedge \mathbf {v}\models \varPhi \wedge \mathbf {v},\mathbf {v}^{\prime }\models \tau _j) \Rightarrow \mathbf {v}^{\prime }\models \varPhi ^{\prime }\).

From the definition above, it can be observed that an affine inductive invariant is an invariant, in the sense that every program state traversed (as a current state at the start or after every loop iteration) in some execution of the underlying affine while loop will satisfy the affine inductive invariant.
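Complementing the definition, the two conditions can be sanity-checked on sampled states; the following Python sketch (our own, specialized to a deterministic single-branch loop) can falsify but never prove consecution:

```python
import numpy as np

rng = np.random.default_rng(0)

def consecution_holds_on_samples(c, d, P, q, T, b, trials=10000):
    """Sample states v; whenever v |= G and c^T v + d <= 0, check that the
    next state v' = T v + b satisfies c^T v' + d <= 0. This can only
    falsify, never prove, the consecution condition."""
    for _ in range(trials):
        v = rng.uniform(-100.0, 100.0, size=len(c))
        if np.all(P @ v + q <= 0) and c @ v + d <= 0:
            if c @ (T @ v + b) + d > 1e-9:
                return False  # consecution violated at v
    return True

# Fibonacci-style update with a tautological guard (encoded as 0 <= 0)
T = np.array([[0.0, 1.0], [1.0, 1.0]])
b = np.zeros(2)
P, q = np.zeros((1, 2)), np.zeros(1)
c = np.array([-2.0, -(1 + 5 ** 0.5)])
d = 3 + 5 ** 0.5
ok = consecution_holds_on_samples(c, d, P, q, T, b)   # no violation found
```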

From now on, we abbreviate affine while loops as affine loops and affine inductive invariants as affine invariants.

Problem Statement. In this work, we study the problem of automatically generating affine invariants over affine loops. Our aim is to obtain a complete mathematical characterization of all such invariants and to develop efficient algorithms for generating them.

3 Affine Invariants via Farkas’ Lemma

Affine invariant generation through Farkas’ lemma was originally proposed in [12, 37]. Farkas’ lemma is a fundamental result in the theory of linear inequalities that leads to a complete characterization of affine invariants. Since our approach is based on Farkas’ lemma, we present a detailed account of the approaches in [12, 37] and point out the weaknesses of each.

Theorem 1

(Farkas’ Lemma). Consider the following affine assertion S over real-valued variables \(y_{1}\), ..., \(y_{n}\):

$$ S: \begin{bmatrix} a_{11}y_{1} + ... + a_{1n}y_{n} + b_{1} \le 0 \\ \vdots \\ a_{k1}y_{1} + ... + a_{kn}y_{n} + b_{k} \le 0 \end{bmatrix} $$

When S is satisfiable, it entails a given affine inequality

$$ \phi : c_{1}y_{1} + ... + c_{n}y_{n} + d \le 0 $$

if and only if there exist non-negative real numbers \(\lambda _{0}\), ..., \(\lambda _{k}\) such that (i) \(c_{j} = \sum ^{k}_{i=1} \lambda _{i}a_{ij}\) for \(1\le j \le n\) and (ii) \(d = (\sum ^{k}_{i=1} \lambda _{i}b_{i}) - \lambda _{0}\).

The application of Farkas’ lemma can be visualized by a table form as follows:

$$ \begin{array}{c|c} \lambda _{0} &{} -1\le 0\\ \lambda _{1} &{} a_{11}y_{1}+\dots +a_{1n}y_{n}+b_{1}\le 0\\ \vdots &{} \vdots \\ \lambda _{k} &{} a_{k1}y_{1}+\dots +a_{kn}y_{n}+b_{k}\le 0\\ \hline &{} c_{1}y_{1}+\dots +c_{n}y_{n}+d\le 0 \end{array} \qquad (\ddag ) $$

The intuition of the table form above is that one first multiplies each \(\lambda _i\) on the left with its corresponding affine inequality (in the same row) on the right, and then sums the resulting affine inequalities to obtain the affine inequality at the bottom. In this paper, we call this table form a Farkas table.
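The mechanics of a Farkas table can be mirrored in a few lines of Python (our own illustration; the function name is ours): given non-negative multipliers, the combination below produces the entailed inequality.

```python
import numpy as np

def farkas_combination(A, b, lam, lam0):
    """Combine the rows of the system A y + b <= 0 with non-negative
    multipliers lam (plus slack lam0) into one consequence c^T y + d <= 0,
    following Farkas' lemma: c_j = sum_i lam_i a_ij, d = sum_i lam_i b_i - lam0."""
    assert np.all(lam >= 0) and lam0 >= 0
    c = A.T @ lam
    d = lam @ b - lam0
    return c, d

# System S: { y1 - 1 <= 0, -y1 + y2 <= 0 }; multipliers (1, 1), lam0 = 0
A = np.array([[1.0, 0.0], [-1.0, 1.0]])
b = np.array([-1.0, 0.0])
c, d = farkas_combination(A, b, np.array([1.0, 1.0]), 0.0)
# yields the entailed inequality y2 - 1 <= 0
```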

Given an affine loop as (\(\dag \)), the approaches in [12, 37] first establish a template \(\varPhi :c_{1}x_{1} + ... + c_{n}x_{n} +d \le 0\) for an affine invariant where \(c_1,\dots ,c_n,d\) are the unknown coefficients. Second, they establish constraints for the unknown coefficients from the initiation and consecution conditions for an affine invariant, as follows.

Initiation. By Farkas’ lemma, the initiation condition can be solved from the Farkas table (\(\ddag \)) with \(S:=\theta \) and \(\phi :=\varPhi \):

$$ \begin{array}{c|c} \lambda ^{\mathrm {I}}_{0} &{} -1\le 0\\ \boldsymbol{\lambda } &{} \mathbf {R}\cdot \mathbf {x}+\mathbf {f}\le \mathbf {0}\\ \hline &{} \mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d\le 0 \end{array} \qquad (\#) $$

Here we rephrase the affine inequalities in \(\theta \) and \(\varPhi \) with the condensed matrix forms \(\mathbf {R}\cdot \mathbf {x}+\mathbf {f} \le \mathbf {0}\) and \(\mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d \le 0\); we also use \(\boldsymbol{\lambda }=[\lambda _{1},\dots ,\lambda _k]^\mathrm {T}\) to denote the non-negative parameters in the leftmost column of (\(\ddag \)).

Consecution. The consecution condition can be solved by handling each conditional branch (specified by \(\tau _j,\psi _j\) in (\(\dag \))) separately. By Farkas’ lemma, we treat each conditional branch by the Farkas table (\(\ddag \)) with \(S:=\varPhi \wedge G\wedge \tau _{j}\) and \(\phi :=\varPhi ^{\prime }\):

$$ \begin{array}{c|c} \mu &{} \mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d\le 0\\ \boldsymbol{\xi } &{} \mathbf {P}\cdot \mathbf {x}+\mathbf {q}\le \mathbf {0}\\ \boldsymbol{\eta } &{} \mathbf {T}_{j}\cdot \mathbf {x}-\mathbf {T}_{j}^{\prime }\cdot \mathbf {x}^{\prime }+\mathbf {b}_{j}\le \mathbf {0}\\ \lambda ^{\mathrm {C}}_{0} &{} -1\le 0\\ \hline &{} \mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}^{\prime }+d\le 0 \end{array} \qquad (*) $$

Note that the Farkas table above contains quadratic constraints as we multiply an unknown non-negative parameter \(\mu \) to the unknown invariant \(\mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d\le 0\) in the table. The Farkas tables for all conditional branches are grouped conjunctively together to represent the whole consecution condition.

The weakness of the approaches in [12, 37] lies in the treatment of the quadratic constraints from the consecution condition. The approach in [12] addresses the quadratic constraints by quantifier elimination, which guarantees theoretical completeness but typically has high runtime complexity. The approach in [37] solves the quadratic constraints by several heuristics that guess possible values for the key parameter \(\mu \) in (\(*\)) which causes the non-linearity, hence losing completeness. Our approach addresses the parameter \(\mu \) through matrix-based methods (eigenvalues and eigenvectors, matrix inverse, etc.), which efficiently generates affine invariants (as compared with quantifier elimination in [12]) while still ensuring theoretical completeness (as compared with the heuristics in [37]).

4 Single-Branch Affine Loops with Deterministic Updates

For the sake of simplicity, we first consider the affine invariant generation for a simple class of affine loops where there are no conditional branches in the loop body and the updates of the next-value vector \(\mathbf {x}'\) are deterministic.

Formally, an affine loop with deterministic updates and a single branch takes the following form:

$$ \begin{array}{l} \theta :\ \mathbf {R}\cdot \mathbf {x}+\mathbf {f}\le \mathbf {0}\\ \textbf{while}\ \mathbf {P}\cdot \mathbf {x}+\mathbf {q}\le \mathbf {0}\ \textbf{do}\ \mathbf {x}:=\mathbf {T}\cdot \mathbf {x}+\mathbf {b}\ \textbf{od} \end{array} $$

For the loop above, we aim at non-trivial affine invariants, i.e., affine invariants \(\mathbf {c}^\mathrm {T}\cdot \mathbf {x}+d\le 0\) with \(\mathbf {c}\ne \mathbf {0}\). We summarize our results below.

  1. When the loop guard is ‘true’, there are only finitely many independent non-trivial invariants \(\mathbf {c}^\mathrm {T}\cdot \mathbf {x}+d\le 0\) where \(\mathbf {c}\) is an eigenvector of the transpose of the transition matrix \(\mathbf {T}\).

  2. When the loop guard is not a tautology, there can be infinitely many more non-trivial invariants \(\mathbf {c}^\mathrm {T}\cdot \mathbf {x}+d\le 0\) with \(\mathbf {c}\) given by a direct formula in \(\mu \); in this case we derive the feasible domain of \(\mu \) and select finitely many optimal ones (which we call tight choices) among them.

In Sect. 4.1, we first derive the constraints from the initiation (\(\#\)) and consecution (\(*\)) conditions satisfied by the invariants. Then we solve these constraints for the tautological loop guard case in Sect. 4.2 and the single-constraint loop guard case in Sect. 4.3. Finally we generalize the results to the multi-constraint loop guard case in Sect. 4.4.

4.1 Derived Constraints from the Farkas Tables

We first derive the constraints from the Farkas tables as follows:

Initiation. Recall the Farkas table (\(\#\)) for initiation. We first compare the coefficients of \(\mathbf {x}\) above and below the horizontal line in (\(\#\)), and obtain

$$\begin{aligned} \boldsymbol{\lambda }^{\mathrm {T}}\cdot \mathbf {R}=\mathbf {c}^{\mathrm {T}}\ \Rightarrow \ \mathbf {R}^{\mathrm {T}}\cdot \boldsymbol{\lambda }=\mathbf {c}. \end{aligned}$$
(1)

Then by comparing the constant terms in (\(\#\)), we have:

$$\begin{aligned} -\lambda ^\mathrm {I}_{0}+\boldsymbol{\lambda }^{\mathrm {T}}\cdot \mathbf {f}=d\ \Rightarrow \ \mathbf {f}^{\mathrm {T}}\cdot \boldsymbol{\lambda }-d=\lambda ^\mathrm {I}_{0}\ge 0. \end{aligned}$$
(2)

Note that \(\mathbf {R}^\mathrm {T}\) has left inverse \((\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}\), thus constraint (1) is equivalent to \(\boldsymbol{\lambda }=(\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}} \cdot \mathbf {c}\). Plugging it into (2) yields

$$\begin{aligned} \mathbf {f}^{\mathrm {T}}\cdot (\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}\cdot \mathbf {c}-d=\lambda ^\mathrm {I}_{0}\ge 0. \end{aligned}$$
(3)
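For concreteness, the left inverse used above can be computed as \((\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}=(\mathbf {R}\cdot \mathbf {R}^{\mathrm {T}})^{-1}\cdot \mathbf {R}\) whenever the rows of \(\mathbf {R}\) are linearly independent; a small numpy sketch (ours):

```python
import numpy as np

def left_inverse_of_transpose(R):
    """Left inverse of R^T, which exists when the rows of R are linearly
    independent: (R R^T)^{-1} R satisfies ((R R^T)^{-1} R) @ R.T = I."""
    return np.linalg.inv(R @ R.T) @ R

R = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])   # two independent rows, three variables
Linv = left_inverse_of_transpose(R)
# Linv @ R.T is the 2x2 identity, so Linv recovers lambda from c = R^T lambda
```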

Consecution. The Farkas table (\(*\)) for consecution in the case of single-branch affine loops with deterministic updates is as follows:

$$ \begin{array}{c|c} \mu &{} \mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d\le 0\\ \boldsymbol{\xi } &{} \mathbf {P}\cdot \mathbf {x}+\mathbf {q}\le \mathbf {0}\\ \boldsymbol{\eta } &{} \mathbf {T}\cdot \mathbf {x}-\mathbf {x}^{\prime }+\mathbf {b}=\mathbf {0}\\ \lambda ^{\mathrm {C}}_{0} &{} -1\le 0\\ \hline &{} \mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}^{\prime }+d\le 0 \end{array} $$

Here the transition matrix \(\mathbf {T}\) is an \(n\times n\) square matrix, and \(\mathbf {b}\) is an n-dimensional vector. Since \(\tau \) contains only equalities, the components \(\eta _{1},...,\eta _{n}\) of the vector parameter \(\boldsymbol{\eta }\) need not be non-negative (while the components \(\xi _{1},...,\xi _{n}\) of \(\boldsymbol{\xi }\) and \(\mu \) must be non-negative). In this table, by comparing the coefficients of \(\mathbf {x}^{\prime }\) above and below the horizontal line, we directly get \(-\boldsymbol{\eta }=\mathbf {c}\). We then substitute \(\boldsymbol{\eta }\) by \(-\mathbf {c}\) and compare the coefficients of \(\mathbf {x}\) above and below the horizontal line, obtaining

$$\begin{aligned} \mu \cdot \mathbf {c}^{\mathrm {T}}+\boldsymbol{\xi }^{\mathrm {T}}\cdot \mathbf {P} - \mathbf {c}^{\mathrm {T}}\cdot \mathbf {T}=\mathbf {0}^{\mathrm {T}}\ \Rightarrow \ \mu \cdot \mathbf {c} -\mathbf {T}^{\mathrm {T}}\cdot \mathbf {c}+\mathbf {P}^{\mathrm {T}}\cdot \boldsymbol{\xi }=\mathbf {0}. \end{aligned}$$
(4)

We also compare the constant terms and get

$$\begin{aligned} \mu \cdot d-\lambda ^\mathrm {C}_{0}+\boldsymbol{\xi }^{\mathrm {T}}\cdot \mathbf {q}-\mathbf {c}^{\mathrm {T}}\cdot \mathbf {b}=d\ \Rightarrow \ (\mu -1)d -\mathbf {b}^{\mathrm {T}}\cdot \mathbf {c}+\mathbf {q}^{\mathrm {T}}\cdot \boldsymbol{\xi }=\lambda ^\mathrm {C}_{0}\ge 0. \end{aligned}$$
(5)

The rest of this section is devoted to solving the invariants \(\varPhi :\mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d\le 0\) which satisfy all constraints (1)–(5).

4.2 Loops with Tautological Guard

We first consider the simplest case where the loop guard is ‘true’:

$$ \begin{array}{l} \theta :\ \mathbf {R}\cdot \mathbf {x}+\mathbf {f}\le \mathbf {0}\\ \textbf{while}\ true \ \textbf{do}\ \mathbf {x}:=\mathbf {T}\cdot \mathbf {x}+\mathbf {b}\ \textbf{od} \end{array} \qquad (\diamond ) $$

To solve the non-linear constraints completely, we take three steps:

  1. choose the correct \(\mu \), thus turning the non-linear constraints into linear ones;

  2. use linear-algebra methods to solve for the vector \(\mathbf {c}\);

  3. with \(\mu \) and \(\mathbf {c}\) known, find the feasible domain of d and determine its optimal value, where ‘optimal’ means that the invariant with the optimal d implies all invariants with other d’s in this domain.

Step 1 and Step 2. We address the values of \(\mu ,\mathbf {c}\) by eigenvalues and eigenvectors in the following proposition:

Proposition 1

For any non-trivial invariant \(\mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d\le 0\) of the loop (\(\diamond \)), we have that \(\mathbf {c}\) must be an eigenvector of \(\mathbf {T}^{\mathrm {T}}\) with a non-negative eigenvalue \(\mu \).

Proof

Since the loop guard is a tautology, we take the parameter \(\boldsymbol{\xi }\) to be \(\mathbf {0}\) in (4):

$$\begin{aligned} \mu \cdot \mathbf {c}-\mathbf {T}^{\mathrm {T}}\cdot \mathbf {c}=\mathbf {0}. \end{aligned}$$

Thus \(\mu \) must be a non-negative eigenvalue of \(\mathbf {T}^{\mathrm {T}}\) (non-negativity is required since \(\mu \) is a Farkas multiplier) and \(\mathbf {c}\) a corresponding eigenvector.   \(\square \)

Example 1

(Fibonacci numbers). Consider the sequence \(\{s_{n}\}\) defined by initial condition \(s_{1}=s_{2}=1\) and recursive formula \(s_{n+2}=s_{n+1}+s_{n}\) for \(n\ge 1\). If we use variables \((x_{1},x_{2})\) to represent \((s_{n},s_{n+1})\), then the sequence can be written as a loop:

$$ \begin{array}{l} \theta :\ x_{1}=1\wedge x_{2}=1\\ \textbf{while}\ true \ \textbf{do}\ (x_{1},x_{2}):=(x_{2},x_{1}+x_{2})\ \textbf{od} \end{array} \quad \text {with}\ \mathbf {T}=\begin{bmatrix} 0 &{} 1 \\ 1 &{} 1 \end{bmatrix},\ \mathbf {b}=\mathbf {0}. $$

The eigenvalues of matrix \(\mathbf {T}^\mathrm {T}\) are \(\frac{1-\sqrt{5}}{2},\frac{1+\sqrt{5}}{2}\); only the second one is non-negative. This eigenvalue \(\mu =\frac{1+\sqrt{5}}{2}\) yields the eigenvector \(\mathbf {c}=[c_{1}, \frac{1+\sqrt{5}}{2}c_{1}]^\mathrm {T}\), where \(c_{1}\) is a free variable to be fixed in the final form of the invariant.   \(\square \)
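The eigenvalue computation in this example can be reproduced with numpy (our sketch; variable names are ours):

```python
import numpy as np

T = np.array([[0.0, 1.0], [1.0, 1.0]])   # Fibonacci update matrix
evals, evecs = np.linalg.eig(T.T)

# Proposition 1: only non-negative eigenvalues of T^T can serve as mu
i = int(np.argmax(evals))                # picks (1+sqrt(5))/2 over (1-sqrt(5))/2
mu, c = evals[i], evecs[:, i]
# c is proportional to [1, mu], matching c = [c1, (1+sqrt(5))/2 * c1]^T
```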

Step 3. After solving \(\mu \) and \(\mathbf {c}\), we illustrate the feasible domain of d and its optimal value by the following proposition:

Proposition 2

For any \(\mu \) and \(\mathbf {c}\) given by Proposition 1, the feasible domain of d is an interval determined by the two conditions below:

$$ d\le \mathbf {f}^{\mathrm {T}}\cdot (\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}\cdot \mathbf {c}\quad \text {and} \quad (\mu -1)d\ge \mathbf {b}^{\mathrm {T}}\cdot \mathbf {c}. $$

If the above conditions have empty solution set, then no affine invariant is available from such \(\mu \) and \(\mathbf {c}\); otherwise, the optimal value of d falls in one of the two choices:

$$ d=\mathbf {f}^{\mathrm {T}}\cdot (\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}\cdot \mathbf {c}\quad \text {or}\quad (\mu -1)d =\mathbf {b}^{\mathrm {T}}\cdot \mathbf {c}. $$

Proof

Constraint (3) provides one condition for d:

$$\begin{aligned} \mathbf {f}^{\mathrm {T}}\cdot (\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}\cdot \mathbf {c}-d=\lambda ^\mathrm {I}_{0}\ge 0\ \Rightarrow \ \mathbf {f}^{\mathrm {T}}\cdot (\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}\cdot \mathbf {c}\ge d; \end{aligned}$$

while constraint (5) with \(\boldsymbol{\xi }=\mathbf {0}\) provides the other condition:

$$ (\mu -1)d -\mathbf {b}^{\mathrm {T}}\cdot \mathbf {c}=\lambda ^\mathrm {C}_{0}\ge 0\ \Rightarrow \ (\mu -1)d\ge \mathbf {b}^{\mathrm {T}}\cdot \mathbf {c}. $$

To obtain the strongest inequality \(\mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d\le 0\), we need to take d to be either the minimal or the maximal value, i.e., a boundary point of its interval; the invariant with this d then implies all invariants with the same \(\mathbf {c}\) and other d’s in this interval. The boundary is achieved when one of the two conditions becomes an equality.   \(\square \)

Example 2

(Fibonacci, Part 2). We continue with Example 1. Recall that \(\mu =\frac{1+\sqrt{5}}{2},\ \mathbf {c}=[c_{1},\frac{1+\sqrt{5}}{2}c_{1} ]^\mathrm {T}\); in this case, constraints (3) (5) (with \(\boldsymbol{\xi }=\mathbf {0}\)) read \(-\frac{3+\sqrt{5}}{2}c_{1}\ge d\) and \(\frac{-1+\sqrt{5}}{2}d\ge 0\), hence yield \(0\le d\le -\frac{3+\sqrt{5}}{2}c_{1}.\) The free variable \(c_{1}\) must be negative here, so we choose \(c_{1}=-2\) and thus \(\mathbf {c}=[-2,-1-\sqrt{5}]^\mathrm {T}\) and \( 0\le d\le 3+\sqrt{5}\); there are two boundary values \(d=0\) and \(d=3+\sqrt{5}\), where \(d=3+\sqrt{5}\) leads to the strongest invariant:

$$ -2x_{1}-(1+\sqrt{5})x_{2}+3+\sqrt{5}\le 0. \qquad \square $$
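The feasible interval of d in this example can be checked numerically; the sketch below (ours) uses the instantiation \(\mathbf {f}=[-1,-1]^{\mathrm {T}}\), \((\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}=\mathbf {I}\) that also appears in Example 4:

```python
import numpy as np

mu = (1 + 5 ** 0.5) / 2
c = np.array([-2.0, -(1 + 5 ** 0.5)])   # eigenvector with c1 = -2
f = np.array([-1.0, -1.0])              # initial state x1 = x2 = 1 via R = I
R_inv_L = np.eye(2)                     # (R^T)^{-1}_L = I
b = np.zeros(2)

d_max = f @ R_inv_L @ c                 # condition d <= f^T (R^T)^{-1}_L c
d_min = (b @ c) / (mu - 1)              # condition (mu - 1) d >= b^T c, mu > 1
# feasible interval [0, 3 + sqrt(5)]; d = d_max yields the strongest invariant
```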

4.3 Loops with Guard: Single-Constraint Case

Here we study the loops with non-tautological guard. First of all, the eigenvalue method of Sect. 4.2 applies to this case as well; thus for the rest of Sect. 4, we always assume that \(\mu \) is not any eigenvalue of \(\mathbf {T}\) (and \(\mathbf {c}\) is not any eigenvector of \(\mathbf {T}^{\mathrm {T}}\) either) and aim for other invariants than the ones from the eigenvectors.

Let us start with the case that the loop guard consists of only one affine inequality:

$$ \begin{array}{l} \theta :\ \mathbf {R}\cdot \mathbf {x}+\mathbf {f}\le \mathbf {0}\\ \textbf{while}\ \mathbf {p}^{\mathrm {T}}\cdot \mathbf {x}+q\le 0\ \textbf{do}\ \mathbf {x}:=\mathbf {T}\cdot \mathbf {x}+\mathbf {b}\ \textbf{od} \end{array} \qquad (\diamond ^{\prime }) $$

where \(\mathbf {p}\) is an n-dimensional real vector and q is a real number.

We again take three steps to compute the invariants; these steps are different from the previous case:

  1. we derive a formula to compute \(\mathbf {c}\) in terms of \(\mu \), so that every non-negative real value \(\mu \) yields a corresponding \(\mathbf {c}\);

  2. however, not all \(\mu \)’s produce invariants that satisfy all constraints (1)–(5); we determine the feasible domain of \(\mu \), i.e., the set of those that do;

  3. we select finitely many \(\mu \)’s from the feasible domain that provide tight invariants; the meaning of tightness will be defined later. For each such \(\mu \), we also determine the feasible domain of d and its optimal value.

Step 1. We first establish the relationship between \(\mu \) and \(\mathbf {c}\) through the constraints. The initiation is still (1) (2) (3), while the consecution (4) (5) becomes:

$$\begin{aligned} \mu \cdot \mathbf {c}-\mathbf {T}^{\mathrm {T}}\cdot \mathbf {c}+\xi \cdot \mathbf {p}&=\mathbf {0}, \qquad \qquad (4^{\prime })\\ (\mu -1)d-\mathbf {b}^{\mathrm {T}}\cdot \mathbf {c}+\xi \cdot q&=\lambda ^{\mathrm {C}}_{0}\ge 0, \qquad (5^{\prime }) \end{aligned}$$

where the matrix \(\mathbf {P}\) in (4) degenerates to vector \(\mathbf {p}^\mathrm {T}\) and the vectors \(\mathbf {q},\boldsymbol{\xi }\) in (5) both have just one component \(q, \xi \) here. Note that \(\xi \) is a non-negative parameter.

In contrast to Sect. 4.2, we assume that \(\mu \) is not any eigenvalue of \(\mathbf {T}\), and \(\xi \ne 0\). For such \(\mu \), we have a new formula to compute \(\mathbf {c}\):

Proposition 3

For any non-trivial invariant \(\mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d\le 0\) of the loop (\(\diamond ^{\prime }\)), we have that \(\mathbf {c}\) is given by

$$\begin{aligned} \mathbf {c}=\xi \cdot (\mathbf {T}^{\mathrm {T}}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}\quad \text {with}\quad \xi \ge 0 \end{aligned}$$
(6)

When \(\mu \) is fixed, the \(\mathbf {c}\)’s with different \(\xi \)’s are proportional to each other and yield equivalent invariants.

Proof

Since \(\mu \) is not any eigenvalue of \(\mathbf {T}\), the matrix \(\mu \cdot \mathbf {I}-\mathbf {T}^{\mathrm {T}}\) is invertible; thus (\(4^{\prime }\)) is equivalent to

$$ (\mathbf {T}^{\mathrm {T}}-\mu \cdot \mathbf {I})\cdot \mathbf {c}=\xi \cdot \mathbf {p}\ \Rightarrow \ \mathbf {c}=\xi \cdot (\mathbf {T}^{\mathrm {T}}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}. \qquad \square $$

Example 3

(Fibonacci, Part 3). We add a loop guard \(x_{1}\le 10\) to Example 1:

$$ \begin{array}{l} \theta :\ x_{1}=1\wedge x_{2}=1\\ \textbf{while}\ x_{1}-10\le 0\ \textbf{do}\ (x_{1},x_{2}):=(x_{2},x_{1}+x_{2})\ \textbf{od} \end{array} \quad \text {with}\ \mathbf {p}=[1,0]^{\mathrm {T}},\ q=-10, $$

and search for more invariants. The formula (6) here reads

$$ \mathbf {c}=\xi \cdot (\mathbf {T}^{\mathrm {T}}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}=\xi \cdot \begin{bmatrix} -\mu &{} 1 \\ 1 &{} 1-\mu \end{bmatrix}^{-1}\cdot \begin{bmatrix} 1 \\ 0 \end{bmatrix}=\frac{\xi }{\mu ^{2}-\mu -1}\cdot \begin{bmatrix} 1-\mu \\ -1 \end{bmatrix}. $$
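Formula (6) amounts to a single linear solve; the sketch below (ours) evaluates it for this example at the sample value \(\mu =5/3\):

```python
import numpy as np

def c_from_mu(T, p, mu, xi=1.0):
    """Formula (6): c = xi * (T^T - mu*I)^{-1} p, defined whenever mu is
    not an eigenvalue of T."""
    n = T.shape[0]
    return xi * np.linalg.solve(T.T - mu * np.eye(n), p)

T = np.array([[0.0, 1.0], [1.0, 1.0]])   # Fibonacci update
p = np.array([1.0, 0.0])                 # guard x1 - 10 <= 0, so p = [1, 0]^T
c = c_from_mu(T, p, mu=5 / 3)
# equals [1 - mu, -1] / (mu^2 - mu - 1) = [-6, -9] at mu = 5/3
```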

Step 2. With formula (6) in hand, every non-negative value \(\mu \) gives us a vector \(\mathbf {c}\); the next step is to find those \(\mu \)’s for which (1) (2) (3) (\(5^{\prime }\)) are all satisfied. We call this set the feasible domain of \(\mu \).

Notice that (3) and (\(5^{\prime }\)) are two inequalities that both contain d. When the value of \(\mu \) changes, (3) and (\(5^{\prime }\)) may conflict with each other, making no invariant available. So the feasible domain consists of those \(\mu \)’s that make the two inequalities compatible with each other:

Proposition 4

For the loop (\(\diamond ^{\prime }\)), any feasible \(\mu \) falls in \([0,1)\cup \big (K\cap [1,+\infty )\big )\), where K is the solution set to the following rational inequality of \(\mu \) (which we call ‘compatibility condition’):

$$\begin{aligned} \mathbf {b}^\mathrm {T}\cdot (\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}-q\le (\mu -1)\mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})^{-1}_\mathrm {L}(\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}. \end{aligned}$$
(7)

Proof

We multiply both sides of (3) by \((\mu -1)\) (noting that the direction of the inequality depends on the sign of \(\mu -1\)) and get

$$\begin{aligned} (\mu -1)d&\ge (\mu -1)\mathbf {f}^{\mathrm {T}}\cdot (\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}\cdot \mathbf {c}\ \ (\text {when}\ 0\le \mu <1), \qquad (3^{\prime })\\ (\mu -1)d&\le (\mu -1)\mathbf {f}^{\mathrm {T}}\cdot (\mathbf {R}^{\mathrm {T}})^{-1}_{\mathrm {L}}\cdot \mathbf {c}\ \ (\text {when}\ \mu \ge 1); \qquad (3^{\prime \prime }) \end{aligned}$$

Comparing them with (\(5^{\prime }\)), we see that (\(3^{\prime }\)) and (\(5^{\prime }\)) cannot conflict with each other, because both bound \((\mu -1)d\) from below. However, (\(3^{\prime \prime }\)) and (\(5^{\prime }\)) are two inequalities of opposite directions, so together they require

$$ \mathbf {b}^\mathrm {T}\cdot \mathbf {c}-\xi \cdot q\le (\mu -1)d \le (\mu -1)\mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})^{-1}_\mathrm {L}\cdot \mathbf {c} $$

to be compatible. Substituting \({\textbf {c}}\) by (6) in the above inequality and cancelling out \(\xi >0\), we obtain the desired inequality:

$$\begin{aligned} \mathbf {b}^\mathrm {T}\cdot (\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}-q \le (\mu -1)\mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})^{-1}_\mathrm {L}(\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}. \end{aligned}$$

Every \(\mu \) from [0, 1) and \(K\cap [1,+\infty )\) leads to a non-trivial invariant satisfying all constraints (1), (2), (3), (\(4^{\prime }\)), (\(5^{\prime }\)).   \(\square \)

Example 4

(Fibonacci, Part 4). Let us compute the feasible domain of \(\mu \) for Example 3. Inequality (\(5^{\prime }\)) is \((\mu -1)d \ge 10\xi \); inequality (\(3^{\prime \prime }\)) is

$$\begin{aligned} (\mu -1)[-1,-1]\cdot \begin{bmatrix} 1 &{} 0 \\ 0 &{} 1 \end{bmatrix}\cdot \mathbf {c}=\frac{\xi (\mu -1)\mu }{\mu ^2-\mu -1}\ge (\mu -1)d\ \ (\text {when}\ \mu \ge 1). \end{aligned}$$

We combine them to form the compatibility condition (7) as

$$\begin{aligned} 10\le \frac{(\mu -1)\mu }{\mu ^2-\mu -1}\ \Rightarrow \ 0\le -\frac{9(\mu -\frac{5}{3})(\mu +\frac{2}{3})}{(\mu -\frac{1-\sqrt{5}}{2})(\mu -\frac{1+\sqrt{5}}{2})}\ \ (\text {when}\ \mu \ge 1). \end{aligned}$$

Its solution domain is \((\frac{1+\sqrt{5}}{2},\frac{5}{3}]\). Thus by Proposition 4, the feasible domain of \(\mu \) is \([0,1)\cup (\frac{1+\sqrt{5}}{2},\frac{5}{3}]\).   \(\square \)
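As a mechanical sanity check of Example 4 (a SymPy sketch; the paper's implementation uses Sage), the instantiated compatibility condition can be solved symbolically:

```python
import sympy as sp

mu = sp.symbols('mu', real=True)
phi = (1 + sp.sqrt(5)) / 2  # pole of the rational function below

# Compatibility condition (7) instantiated for the guarded Fibonacci loop:
#   10 <= (mu - 1) * mu / (mu**2 - mu - 1), restricted to mu >= 1.
sol = sp.solve_univariate_inequality(
    10 <= (mu - 1) * mu / (mu**2 - mu - 1), mu, relational=False)
K = sol.intersect(sp.Interval(1, sp.oo))

# Proposition 4: the feasible domain is [0, 1) together with K.
feasible = sp.Interval.Ropen(0, 1).union(K)

# K should be the half-open interval ((1 + sqrt(5))/2, 5/3].
assert sp.Rational(5, 3) in K
assert phi not in K and sp.Rational(7, 4) not in K
assert sp.Rational(13, 8) in K  # 1.625 lies strictly between phi and 5/3
```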

Step 3. Proposition 4 provides us with a continuum of candidates for \(\mu \), and thus produces infinitely many legitimate invariants. We would like to find a basis consisting of finitely many invariants, such that all invariants are non-negative linear combinations of the basis; however, this idea does not work out, for reasons explained thoroughly in the full version of this paper [25, Appendix A.1 and A.2]. Instead, we impose a weaker form of optimality called tightness, which comes from the equality cases of constraints (3) and (\(5^{\prime }\)):

$$\begin{aligned} \mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})^{-1}_\mathrm {L}\cdot \mathbf {c}-d&=\lambda ^\mathrm {I}_{0}= 0 \\ (\mu -1)d -\mathbf {b}^\mathrm {T}\cdot \mathbf {c}+\xi \cdot q&=\lambda ^\mathrm {C}_{0}= 0 \end{aligned}$$

We call an invariant tight, and the corresponding \(\mu \) a tight choice, when both equalities are achieved:

  • \(\lambda ^\mathrm {I}_{0}= 0\): The invariant is tight at the initial state, i.e., the invariant reaches equality at the initial state;

  • \(\lambda ^\mathrm {C}_{0}= 0\): The invariant stays as close to tight as possible at later iterations.

The non-tight choices can be kept as backups for invariant generation. The tight choices are characterized by the following proposition:

Proposition 5

For the loop (\(\diamond ^{\prime }\)), the tight choices of \(\mu \) consist of 0 and the positive real roots of the following rational equation:

$$\begin{aligned} \mathbf {b}^\mathrm {T}\cdot (\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}-q = (\mu -1)\mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})^{-1}_\mathrm {L}(\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}. \end{aligned}$$
(8)

Note that these roots are also the boundary points of the intervals in K defined in Proposition 4.

Proof

Recall from Proposition 2 that constraints (3) and (5) form the two boundaries of the domain of d, which cannot be achieved simultaneously in the case of loops with a tautological guard. Nevertheless, in the case of loops with a guard, we have extra freedom in \(\mu \) which allows us to set \(\lambda ^\mathrm {I}_{0}=\lambda ^\mathrm {C}_{0}=0\):

$$\begin{aligned}&\mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})^{-1}_\mathrm {L}\cdot \mathbf {c}=d\ \wedge \ (\mu -1)d =\mathbf {b}^\mathrm {T}\cdot \mathbf {c}-\xi \cdot q\nonumber \\ \Rightarrow \quad&\mathbf {b}^\mathrm {T}\cdot (\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}-q = (\mu -1)\mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})^{-1}_\mathrm {L}(\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {p}. \end{aligned}$$

Equation (8) is exactly the case where (7) achieves equality, and is hence a rational equation in \(\mu \) with a finite number of roots. These roots are also the boundary points of K, since K is the solution domain of (7). Besides the roots of (8), \(\mu =0\) is also a boundary point of the feasible domain; its corresponding invariant reflects the loop guard itself. Thus we add it to the list of tight choices.   \(\square \)

With \(\mu \) determined and \(\mathbf {c}\) fixed up to a scaling factor, the last thing that remains is to determine the optimal d. The strategy here is similar to Proposition 2:

Proposition 6

Suppose \(\mu \) is from the feasible domain and \(\mathbf {c}\) is given by Proposition 3. Then the optimal value of d is determined by one of the two choices below:

$$\mathbf {b}^\mathrm {T}\cdot \mathbf {c}-\xi \cdot q=(\mu -1)d \quad \text {or}\quad \mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})^{-1}_\mathrm {L}\cdot \mathbf {c}=d.$$

The proof is omitted here and can be found in our full version [25].

Example 5

(Fibonacci, Part 5). Remember that

$$\begin{bmatrix} c_{1} \\ c_{2} \end{bmatrix} =\frac{\xi }{\mu ^2-\mu -1}\begin{bmatrix} 1-\mu \\ -1 \end{bmatrix}\ \text {and the feasible domain of}\ \mu \ \text {is}\ [0,1)\cup (\frac{1+\sqrt{5}}{2},\frac{5}{3}].$$

We compute the tight choices of \(\mu \) and tight invariants. The equation (8) here is

$$ 0= \frac{-9\mu ^{2}+9\mu +10}{\mu ^2-\mu -1}=-\frac{9(\mu -\frac{5}{3})(\mu +\frac{2}{3})}{(\mu -\frac{1-\sqrt{5}}{2})(\mu -\frac{1+\sqrt{5}}{2})} $$

which has only one positive root \(\mu =\frac{5}{3}\). By Proposition 5 and Proposition 6, we get two invariants:

[figure p: the two generated invariants; image omitted in extraction.]
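The computation in Example 5 can be reproduced with a short SymPy sketch (our illustration, not the paper's implementation), solving the numerator of equation (8):

```python
import sympy as sp

mu = sp.symbols('mu', real=True)
# Equation (8) for the guarded Fibonacci loop reduces to the vanishing of
# the numerator -9*mu**2 + 9*mu + 10 (the denominator mu**2 - mu - 1 is
# non-zero at both roots found below).
roots = sp.solve(sp.Eq(-9 * mu**2 + 9 * mu + 10, 0), mu)
tight_positive = [r for r in roots if r > 0]
# Only one positive root, mu = 5/3; together with mu = 0 these are the
# tight choices (Proposition 5).
assert tight_positive == [sp.Rational(5, 3)]
```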

4.4 Loops with Guard: Multi-constraint Case

After settling the single-constraint case, we consider the more general loop guard that is a conjunction of multiple affine constraints:

[figure q: the loop form (\(\diamond ^{\prime \prime }\)) with guard \(\mathbf {P}\cdot \mathbf {x}+\mathbf {q}\le \mathbf {0}\); image omitted in extraction.]

where the loop guard \(\mathbf {P}\cdot \mathbf {x}+\mathbf {q}\le \mathbf {0}\) contains m affine inequalities.

We can easily generalize the results of Sect. 4.3 to this case. First, we generalize Proposition 3: one simply needs to modify formula (6) into

[figure r: the generalized formula (\(6^{\prime }\)); image omitted in extraction.]

Here \(\boldsymbol{\xi }\) is a free non-negative m-dimensional vector parameter. With a fixed \(\mu \), we take \(\boldsymbol{\xi }\) to traverse all vectors of the standard basis \(\{\mathbf {e}_{1},\ldots ,\mathbf {e}_{m}\}\) to get m conjunctive invariants.
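To make this concrete, the following SymPy sketch (our illustration, with hypothetical matrices \(\mathbf {T}\), \(\mathbf {P}\) not taken from the paper's benchmarks) assumes, consistently with the expression \(\mathbf {z}=\mathbf {P}^\mathrm {T}\cdot \boldsymbol{\xi }\) used in Sect. 6, that (\(6^{\prime }\)) reads \(\mathbf {c}=(\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {P}^\mathrm {T}\cdot \boldsymbol{\xi }\):

```python
import sympy as sp

# Sketch of formula (6') with m = 2 guard constraints; T and P are
# hypothetical illustration data, and mu = 1/2 is not an eigenvalue of T^T.
mu = sp.Rational(1, 2)
T = sp.Matrix([[0, 1], [1, 1]])
P = sp.Matrix([[1, 0], [0, -1]])
M = T.T - mu * sp.eye(2)

# Traversing xi over the standard basis e_1, e_2 yields one invariant
# direction per guard constraint.
cs = [M.inv() * P.T.col(i) for i in range(2)]
for i, c in enumerate(cs):
    # each c solves (T^T - mu*I) * c = P^T * e_i exactly
    assert M * c == P.T.col(i)
```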

Next, we generalize Proposition 4 which describes the feasible domain of \(\mu \):

Proposition 7

For the loop (\(\diamond ^{\prime \prime }\)), the feasible domain of \(\mu \) is \([0,1)\cup \big (\overline{K}\cap [1,+\infty )\big )\), where \(\overline{K}\) is the solution set to the following generalized compatibility condition:

$$ \mathbf {b}^\mathrm {T}\cdot \mathbf {c}-\mathbf {q}^\mathrm {T}\cdot \boldsymbol{\xi }\le (\mu -1)d \le (\mu -1)\mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})^{-1}_\mathrm {L}\cdot \mathbf {c} $$

Substituting \(\mathbf {c}\) by (\(6^{\prime }\)) and taking \(\boldsymbol{\xi }\) to traverse all vectors of the standard basis (in order for all constraints in the loop guard to be accounted for by the invariant), the above condition is completely decoded as m conjunctive inequalities:

[figure s: the m conjunctive inequalities (\(7^{\prime }\)); image omitted in extraction.]

where \(\mathbf {u}(\mu ),\mathbf {w}(\mu )\) are two m-dimensional vector functions in \(\mu \). The meaning of (\(7^{\prime }\)) is that the i-th component of \(\mathbf {u}(\mu )\) is no larger than the i-th component of \(\mathbf {w}(\mu )\) for all \(1\le i\le m\); when \(m=1\), it reduces to (7).

At last, we consider the tight choices of \(\mu \). The first idea that comes to mind is to repeat Proposition 5: setting \(\lambda ^\mathrm {I}_{0}=\lambda ^\mathrm {C}_{0}=0\) for arbitrary \(\boldsymbol{\xi }\) such that the generalized compatibility condition achieves equality, i.e., \(\mathbf {u}(\mu )=\mathbf {w}(\mu )\); however, this is a conjunction of m rational equations and may well have no solution.

Thus we use a different idea: recall that in the single-constraint case, the tight choices are exactly the (positive) boundary points of K together with 0; we adopt this property as the definition in the multi-constraint case:

Definition 1

For the loop (\(\diamond ^{\prime \prime }\)), the tight choices of \(\mu \) consist of 0 and the (positive) boundary points of the domain \(\overline{K}\) defined in Proposition 7.

The generalized compatibility condition (\(7^{\prime }\)) contains m inequalities; at each (positive) boundary point of \(\overline{K}\), at least one inequality achieves equality and all other inequalities are satisfied (equivalently, \(\lambda ^\mathrm {I}_{0}=\lambda ^\mathrm {C}_{0}=0\) is achieved for at least one non-trivial evaluation of the free vector parameter \(\boldsymbol{\xi }\)). This is indeed a natural generalization of Proposition 5.

Example 6

We consider the loop:

[figure t: the example loop; image omitted in extraction.]

There is one eigenvalue \(\mu =1\) with geometric multiplicity 2; we solve three independent invariants from it:

$$\begin{aligned} x_{1}+x_{2}-2\le 0,\ x_{1}+x_{2}-2\ge 0;\ -x_{1}+x_{2}\le 0. \end{aligned}$$

Next we find the other invariants from the tight \(\mu \)'s. In this case (\(7^{\prime }\)) reads \( \frac{11-10\mu }{1-\mu }\le 1\ \wedge \ \frac{6-5\mu }{1-\mu }\le -1\quad (\text {when}\ \mu >1). \) Then \(\overline{K}=(1,\frac{10}{9}]\cap (1,\frac{7}{6}]=(1,\frac{10}{9}]\) and the feasible domain of \(\mu \) is \([0,1)\cup (1,\frac{10}{9}].\) The tight choices are \(0\) and \(\frac{10}{9}\) (taking \(\boldsymbol{\xi }\) to be \([1,0]^\mathrm {T}\) and \([0,1]^\mathrm {T}\) respectively yields the two conjunctive invariants for each \(\mu \)):

[figure u: the generated conjunctive invariants for \(\mu =0\) and \(\mu =\frac{10}{9}\); image omitted in extraction.]
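The domain computation in Example 6 can be verified with a short SymPy sketch (our illustration), taking the two components of (\(7^{\prime }\)) exactly as printed above:

```python
import sympy as sp

mu = sp.symbols('mu', real=True)
branch = sp.Interval.open(1, sp.oo)  # the case mu > 1 considered in (7')

# The two components of (7') for the loop in Example 6.
K1 = sp.solve_univariate_inequality(
    (11 - 10 * mu) / (1 - mu) <= 1, mu, relational=False).intersect(branch)
K2 = sp.solve_univariate_inequality(
    (6 - 5 * mu) / (1 - mu) <= -1, mu, relational=False).intersect(branch)
Kbar = K1.intersect(K2)  # expected: (1, 10/9]

assert sp.Rational(10, 9) in Kbar
assert sp.Rational(7, 6) not in Kbar  # only the tighter interval survives
assert sp.Integer(1) not in Kbar
```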

5 Generalizations

In this section, we extend the theory developed in Sect. 4 in two directions. In one direction, we consider the invariants \(\mathbf {c}^\mathrm {T}\cdot \mathbf {x}+d\le 0\) for affine loops in the general form (\(\dag \)): we derive the relationship between \(\mu \) and \(\mathbf {c}\), as well as the feasible domain and the tight choices of \(\mu \). In the other direction, we stick to single-branch affine loops with deterministic updates and a tautological guard (\(\diamond \)), yet generalize the invariants to the bidirectional-inequality form \(d_{1}\le \mathbf {c}^\mathrm {T}\cdot \mathbf {x}\le d_{2}\); we apply the eigenvalue method to this case for solving the invariants. At the end of the section, we also briefly discuss some other possible generalizations.

5.1 Affine Loops with Non-deterministic Updates

In Sect. 4, we handled the loops with deterministic updates; here we generalize the results to the non-deterministic case in the form of (\(\dag \)). We focus on the single-branch loops here, because the multi-branch ones can be handled similarly by taking the conjunction of all branches, as illustrated in the full version of this paper [25, Appendix A.3].

[figure v: the general single-branch loop form (\(\dag ^{\prime }\)); image omitted in extraction.]

For this general form, the initiation constraints are still (1), (2), (3), while the consecution constraints from the Farkas table (\(*\)) are

$$\begin{aligned} \mu \cdot \mathbf {c}+\mathbf {P}^\mathrm {T}\cdot \boldsymbol{\xi }+\mathbf {T}^\mathrm {T}\cdot \boldsymbol{\eta }&=\mathbf {0} \end{aligned}$$
(9)
$$\begin{aligned} -(\mathbf {T}^{\prime })^\mathrm {T}\cdot \boldsymbol{\eta }&=\mathbf {c} \end{aligned}$$
(10)
$$\begin{aligned} (\mu -1)d +\mathbf {q}^\mathrm {T}\cdot \boldsymbol{\xi }+\mathbf {b}^\mathrm {T}\cdot \boldsymbol{\eta }&=\lambda ^\mathrm {C}_{0}\ge 0 \end{aligned}$$
(11)

with \(\boldsymbol{\xi },\boldsymbol{\eta }\ge \mathbf {0}\). The relationship between \(\mathbf {c}\) and \(\boldsymbol{\eta }\) is given by (10); plugging it into (9) yields

[figure w: equation (\(9^{\prime }\)), \(\big (\mathbf {T}^\mathrm {T}-\mu \cdot (\mathbf {T}^{\prime })^\mathrm {T}\big )\cdot \boldsymbol{\eta }=-\mathbf {P}^\mathrm {T}\cdot \boldsymbol{\xi }\), obtained by substituting (10) into (9); image omitted in extraction.]

Hence for any non-trivial invariant \(\mathbf {c}^\mathrm {T}\cdot \mathbf {x}+d\le 0\) of this loop (\(\dag ^{\prime }\)), we have \(\mathbf {c}=-(\mathbf {T}^{\prime })^\mathrm {T}\cdot \boldsymbol{\eta }\), where \(\boldsymbol{\eta }\) is characterized differently in the following three cases:

  1. \(\mathbf {T}\) and \(\mathbf {T}^{\prime }\) are square matrices and the loop guard is ‘true’. In this case, we take \(\boldsymbol{\xi }=\mathbf {0}\) in (\(9^{\prime }\)) and easily see that \(\mu \) must be a root of \(\det \big (\mathbf {T}^\mathrm {T}-\mu \cdot (\mathbf {T}^{\prime })^\mathrm {T}\big )=0\) and that \(\boldsymbol{\eta }\) is a kernel vector of the matrix \(\mathbf {T}^\mathrm {T}-\mu \cdot (\mathbf {T}^{\prime })^\mathrm {T}\).

  2. \(\mathbf {T}\) and \(\mathbf {T}^{\prime }\) are square matrices and the loop guard is non-tautological. In this case, we set \(\mu \) to values other than the roots of \(\det \big (\mathbf {T}^\mathrm {T}-\mu \cdot (\mathbf {T}^{\prime })^\mathrm {T}\big )=0\), so that the inverse matrix \(\big (\mathbf {T}^\mathrm {T}-\mu \cdot (\mathbf {T}^{\prime })^\mathrm {T}\big )^{-1}\) exists; multiplying (\(9^{\prime }\)) by it, we get \(\boldsymbol{\eta }(\mu )=-\big (\mathbf {T}^\mathrm {T}-\mu \cdot (\mathbf {T}^{\prime })^\mathrm {T}\big )^{-1}\mathbf {P}^\mathrm {T}\cdot \boldsymbol{\xi }.\)

  3. Neither \(\mathbf {T}\) nor \(\mathbf {T}^{\prime }\) is a square matrix. In this case, we need the Gaussian elimination method (with parameters) to solve (\(9^{\prime }\)). By linear algebra, the solution \(\boldsymbol{\eta }(\mu )\) contains a ‘homogeneous term’ (which does not involve \(\boldsymbol{\xi }\) but possibly some free variables \(\overline{\boldsymbol{\eta }}=[\eta _{1},\ldots ,\eta _{l}]^\mathrm {T}\)) and a ‘non-homogeneous term’ (which contains \(\boldsymbol{\xi }\) linearly). Thus \(\boldsymbol{\eta }(\mu )\) can be written in parametric vector form as \(\mathbf {M}(\mu )\cdot \overline{\boldsymbol{\eta }}+\mathbf {N}(\mu )\cdot \boldsymbol{\xi }\), where \(\mathbf {M}(\mu ),\mathbf {N}(\mu )\) are matrix functions in \(\mu \) only.
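Case 1 can be illustrated with a small SymPy sketch (the matrices below are hypothetical illustration data, not taken from the paper's benchmarks; taking \(\mathbf {T}^{\prime }=\mathbf {I}\) would recover the deterministic setting):

```python
import sympy as sp

# Case 1 sketch: with square T, T' and a tautological guard, mu must be a
# root of det(T^T - mu * T'^T) = 0, and eta spans the corresponding kernel.
# T and T' are hypothetical illustration data chosen for rational roots.
mu = sp.symbols('mu', real=True)
T = sp.Matrix([[2, 1], [0, 3]])
Tp = sp.Matrix([[1, 1], [0, 1]])

char_poly = sp.expand((T.T - mu * Tp.T).det())  # here: mu**2 - 5*mu + 6
roots = sp.solve(sp.Eq(char_poly, 0), mu)
assert sorted(roots) == [2, 3]

for r in roots:
    eta = (T.T - r * Tp.T).nullspace()[0]       # a kernel vector for this mu
    assert (T.T - r * Tp.T) * eta == sp.zeros(2, 1)
```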

For Case 2 and Case 3, we have a continuum of candidates for \(\mu \). The feasible domain of \(\mu \) is given by \(\Big ([0,1)\cup \big (\widetilde{K}\cap [1,+\infty )\big )\Big )\cap J\), where \(\widetilde{K}\) is the solution set to the following compatibility condition (obtained by combining constraints (\(3^{\prime \prime }\)) and (11)):

$$ \mathbf {b}^\mathrm {T}\cdot \boldsymbol{\eta }(\mu )+\mathbf {q}^\mathrm {T}\cdot \boldsymbol{\xi }\ge (\mu -1)\mathbf {f}^\mathrm {T}\cdot (\mathbf {R}^\mathrm {T})_{\mathrm {L}}^{-1}(\mathbf {T}^{\prime })^\mathrm {T}\cdot \boldsymbol{\eta }(\mu ) $$

and J is the solution set of the constraints \(\boldsymbol{\eta }(\mu )\ge \mathbf {0}\). Here both \(\overline{\boldsymbol{\eta }}\) and \(\boldsymbol{\xi }\), as free non-negative vector parameters, are taken to traverse all standard basis vectors, in the same way as in Proposition 7. The tight choices of \(\mu \) consist of 0 and the positive boundary points of \(\widetilde{K}\cap J\), in the same sense as Definition 1.

5.2 An Extension to Bidirectional Affine Invariants

Here we restrict ourselves to single-branch affine loops with deterministic updates and a tautological loop guard (\(\diamond \)), but aim for invariants of the bidirectional-inequality form \(d_{1}\le \mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}\le d_{2}\). This is the conjunction of two affine inequalities: \(\varPhi _{1}:-\mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}+d_{1}\le 0\) \(\wedge \) \(\varPhi _{2}:\mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}-d_{2}\le 0\). We have the following proposition:

Proposition 8

For any bidirectional invariant \(d_{1}\le \mathbf {c}^{\mathrm {T}}\cdot \mathbf {x}\le d_{2}\) of the loop (\(\diamond \)), we have that \(\mathbf {c}\) must be an eigenvector of \(\mathbf {T}^\mathrm {T}\) with a negative eigenvalue.

Proof

We can easily write down the initiation condition \(\theta \models (\varPhi _{1}\wedge \varPhi _{2})\) and the corresponding constraints (with \(\boldsymbol{\lambda },\ \boldsymbol{\widetilde{\lambda }}\) being two different vector parameters):

$$\begin{aligned} \mathbf {R}^\mathrm {T}\cdot \boldsymbol{\lambda } =\mathbf {c},\quad \mathbf {f}^\mathrm {T}\cdot \boldsymbol{\lambda }+d_{2}=\lambda ^\mathrm {I}_{0}\ge 0;\quad \mathbf {R}^\mathrm {T}\cdot \boldsymbol{\widetilde{\lambda }} = -\mathbf {c},\quad \mathbf {f}^\mathrm {T}\cdot \boldsymbol{\widetilde{\lambda }}-d_{1}=\widetilde{\lambda }^\mathrm {I}_{0}\ge 0. \end{aligned}$$

However, there are two possible ways to propose the consecution condition:

$$\begin{aligned} (\varPhi _{1}\wedge \tau \models \varPhi _{1}^{\prime }\ \ \text {and}\ \ \varPhi _{2}\wedge \tau \models \varPhi _{2}^{\prime })\quad {\textbf {or}}\quad (\varPhi _{1}\wedge \tau \models \varPhi _{2}^{\prime }\ \ \text {and}\ \ \varPhi _{2}\wedge \tau \models \varPhi _{1}^{\prime }) \end{aligned}$$

If we choose the first one, nothing differs from what we did in Sect. 4.2. Thus we choose the second one: making the two inequalities induct each other. Hence the Farkas tables are

[figure x: the two Farkas tables for the cross-consecution conditions; image omitted in extraction.]

We write out the constraints of consecution:

$$\begin{aligned} -\mu \cdot \mathbf {c} =\mathbf {T}^\mathrm {T}\cdot \mathbf {c}&=-\widetilde{\mu }\cdot \mathbf {c} \\ \mu \cdot d_{1}+d_{2}- \mathbf {b}^\mathrm {T}\cdot \mathbf {c}=\lambda ^\mathrm {C}_{0}\ge 0,&\ \ -\widetilde{\mu }\cdot d_{2}-d_{1}+\mathbf {b}^\mathrm {T}\cdot \mathbf {c}=\widetilde{\lambda }^\mathrm {C}_{0}\ge 0\nonumber \end{aligned}$$
(12)

The proposition is verified by (12), since \(\mu ,\widetilde{\mu }\ge 0\).   \(\square \)

Example 7

(Fibonacci, Part 6). Recall that in this example we have a negative eigenvalue \(\frac{1-\sqrt{5}}{2}\), which yields the eigenvector \(\mathbf {c}=[c_{1},\frac{1-\sqrt{5}}{2}c_{1}]^\mathrm {T}\). The other constraints are computed as:

$$\begin{aligned} -(3-\sqrt{5})c_{1}/2+d_{2}=\lambda ^\mathrm {I}_{0}\ge 0,&\quad (3-\sqrt{5})c_{1}/2-d_{1}=\widetilde{\lambda }^\mathrm {I}_{0}\ge 0. \\ -(1-\sqrt{5})d_{1}/2+d_{2}=\lambda ^\mathrm {C}_{0}\ge 0,&\quad (1-\sqrt{5})d_{2}/2-d_{1}=\widetilde{\lambda }^\mathrm {C}_{0}\ge 0. \end{aligned}$$

If we choose \(c_{1}=2, \lambda ^\mathrm {I}_{0}=0=\widetilde{\lambda }^\mathrm {C}_{0}\) (or \(c_{1}=-2, \widetilde{\lambda }^\mathrm {I}_{0}=0=\lambda ^\mathrm {C}_{0}\)), we get an invariant

$$\begin{aligned} \mu =\mathopen |(1-\sqrt{5})/2\mathclose |:\ 2(2-\sqrt{5})\le 2x_{1}+(1-\sqrt{5})x_{2}\le 3-\sqrt{5} \end{aligned}$$

which reflects the ‘golden ratio’ property of the Fibonacci numbers.   \(\square \)
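As an illustration, the invariant can be checked numerically along the loop executions (a sketch assuming the initial state \(x_{1}=x_{2}=1\), which is consistent with the tightness \(\lambda ^\mathrm {I}_{0}=0\) at the initial state):

```python
import math

# Numerical check of the bidirectional invariant of Example 7 on the
# Fibonacci loop x1, x2 := x2, x1 + x2 (assumed initial state x1 = x2 = 1).
s5 = math.sqrt(5)
lo, hi = 2 * (2 - s5), 3 - s5  # lower and upper bounds of the invariant

x1, x2 = 1, 1
for _ in range(25):
    v = 2 * x1 + (1 - s5) * x2
    assert lo - 1e-9 <= v <= hi + 1e-9  # invariant holds at every iteration
    x1, x2 = x2, x1 + x2
```

Note that the bounds are attained at the first two iterations (the value \(2x_{1}+(1-\sqrt{5})x_{2}\) is multiplied by \(\frac{1-\sqrt{5}}{2}\) at each step), matching the tightness conditions chosen above.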

Remark 1

The generalizations of bidirectional affine invariants to loops with a non-tautological guard or multiple branches are practicable, but with some restrictions. The main restriction is that we need to assume the affine loop guard to also be bidirectional for our approach to bidirectional affine invariants to work. The issue of multiple branches is not critical, as the bidirectional invariants can be derived in almost the same way as single-inequality invariants (illustrated in the full version [25, Appendix A.3]), with the only difference being the adaptation to bidirectional inequalities.

5.3 Other Possible Generalizations

Integer-valued Variables. One direction is to transfer some of the results for affine loops over real-valued variables to those over integer-valued variables. Our approach is based on Farkas’ lemma, which is dedicated to real-valued variables, and thus can only provide a sound but not exact treatment of integer-valued variables. An exact treatment of integer-valued variables would require Presburger arithmetic [16] rather than Farkas’ lemma.

Strict-inequality Invariants. We handle non-strict-inequality affine invariants in this work. It is natural to also consider affine invariants of the strict-inequality form. For strict inequalities, we could utilize an extended version of Farkas’ lemma in [6, Corollary 1], so that strict inequalities can be generated either by relaxing the non-strict ones obtained from our method or by restricting the \(\mu \) value to be positive. Since the Motzkin transposition theorem is the standard theorem for handling strict inequalities, we believe that it can achieve similar results, though it may require more tedious manipulations.

6 Approximation of Eigenvectors through Continuity

In Sect. 4.2 and Sect. 5.2, we need to solve the characteristic polynomial of the transition matrix to get the eigenvalues; however, general polynomials of degree \(\ge 5\) have no algebraic solution formula, by the Abel–Ruffini theorem. We can construct a number sequence \(\{\lambda _{i}\}\) that approximates the eigenvalue \(\lambda \) through root-finding algorithms; however, we cannot approximate the eigenvector of \(\lambda \) by solving the kernel of \(\mathbf {T}^\mathrm {T}-\lambda _{i}\cdot \mathbf {I}\), since it has trivial kernel. For dimensions \(\ge 5\), i.e., when an explicit formula for the eigenvalues is unavailable, we introduce a method for approximating the eigenvectors through a continuity property of the invariants:

Continuity of Invariants w.r.t. \(\mu \). In Sect. 4, we have shown that for any invariant \(\mathbf {c}^\mathrm {T}\cdot \mathbf {x}+d\le 0\) of single-branch affine loops with deterministic updates, the relationship between \(\mathbf {c}\) and \(\mu \) is given in two ways:

$$\begin{aligned} \mathbf {c}= {\left\{ \begin{array}{ll}\text {kernel vector of}\ \mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I}\quad &{}\text {when}\ \det (\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})=0\\ (\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})^{-1}\cdot \mathbf {z}\quad &{}\text {when}\ \det (\mathbf {T}^\mathrm {T}-\mu \cdot \mathbf {I})\ne 0 \end{array}\right. } \end{aligned}$$

with \(\mathbf {z}=\mathbf {P}^\mathrm {T}\cdot \boldsymbol{\xi }\). Thus \(\mathbf {c}=\mathbf {c}(\mu )\) can be viewed as a vector function in \(\mu \), expressed differently at eigenvalues than at other points. \(\mathbf {c}(\mu )\) is clearly continuous at points other than eigenvalues, while the following proposition establishes the continuity of \(\mathbf {c}(\mu )\) at the eigenvalues:

Proposition 9

Suppose \(\lambda \) is a real eigenvalue of \(\mathbf {T}^\mathrm {T}\) with eigenvector \(\mathbf {c}(\lambda )\); and \(\{\lambda _{i}\}\) is a sequence lying in the feasible domain of \(\mu \) which converges to \(\lambda \). If \(\lambda \) has geometric multiplicity 1, then the sequence \(\{\mathbf {c}(\lambda _{i})\}\) converges to \(\mathbf {c}(\lambda )\) as well; otherwise, \(\{\mathbf {c}(\lambda _{i})\}\) converges to \(\mathbf {0}\).

Due to the lack of space, the proof of Proposition 9 is omitted here and available in our full version [25].
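The following NumPy sketch (our illustration on the Fibonacci transition matrix, not the paper's implementation) visualizes Proposition 9 for an eigenvalue of geometric multiplicity 1; since \(\mathbf {c}\) is only determined up to a positive scaling, we compare normalized directions:

```python
import numpy as np

# Direction of c(lambda_i) = (T^T - lambda_i I)^{-1} z approaches the
# eigenvector direction as lambda_i -> lambda, when lambda has geometric
# multiplicity 1.  T is the Fibonacci transition matrix.
T = np.array([[0.0, 1.0], [1.0, 1.0]])
lam = (1 + np.sqrt(5)) / 2                     # eigenvalue of T^T
v = np.array([1.0, lam])
v /= np.linalg.norm(v)                         # unit eigenvector
z = np.array([1.0, 0.0])                       # plays the role of P^T . xi

for i in [2, 3, 4, 5]:
    lam_i = lam + 10.0 ** (-i)                 # approximant, not an eigenvalue
    c = np.linalg.solve(T.T - lam_i * np.eye(2), z)
    c /= np.linalg.norm(c)                     # c is only fixed up to scaling
    # alignment |<c, v>| approaches 1 as lam_i approaches lam
    assert abs(abs(c @ v) - 1.0) < 10.0 ** (-i)
```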

An Algorithmic Approach to the Eigenvalue Method in Dimensions \(\ge 5\). By Proposition 9, if \(\lambda \) has geometric multiplicity 1, we can compute \(\mathbf {c}(\lambda _{i})=(\mathbf {T}^\mathrm {T}-\lambda _{i}\cdot \mathbf {I})^{-1}\cdot \mathbf {z}\) (in the case of a tautological loop guard, we simply replace \(\mathbf {z}\) by any non-zero n-dimensional real vector) to approximate the eigenvector \(\mathbf {c}(\lambda )\). On the other hand, in the case that \(\lambda \) has geometric multiplicity \(>1\), one can adopt the least-squares approximation presented in [5, Section 8.9]. Though the least-squares approximation applies to eigenvalues of arbitrary geometric multiplicity, our method is much easier to implement and more efficient.

7 Experimental Results

Experiment. We implement our automatic invariant-generation algorithm of eigenvalues and tight choices in Python 3.8 and use Sage [42] for matrix manipulation. All results are obtained on an Intel Core i7 (2.00 GHz) machine with 64 GB of memory, running Ubuntu 18.04. Our benchmarks are affine loops chosen from the benchmarks of the StInG invariant generator [40], linear dynamical systems in [30], loop programs in [41], and other linear dynamical systems arising from well-known linear recurrences such as the Fibonacci and Tribonacci numbers.

Complexity. The main bottleneck of our algorithm lies in exactly solving or approximating the real roots of univariate polynomials (for computing eigenvalues and boundary points in our algorithmic approach). The rest includes Gaussian elimination with a single parameter (whose polynomial-time solvability is guaranteed by [26]), matrix inversion, and solving eigenvectors for fixed eigenvalues, all of which can easily be done in polynomial time. The exact solution for degrees less than 5 can be obtained by directly applying the solution formulas. The approximation of real roots can be carried out through real root isolation followed by divide-and-conquer (or Newton's method) in each obtained interval, which can be completed in polynomial time (see e.g. [36] for the polynomial-time solvability of real root isolation). Thus, our approach runs in polynomial time and is much more efficient than the quantifier elimination in [12].
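For illustration, a minimal bisection sketch (our own, not the paper's implementation) approximates a real root once an isolating interval is known; each step halves the interval, so polynomially many steps suffice for any prescribed accuracy:

```python
# Bisection on an isolating interval: each step halves the interval, so
# reaching accuracy 2^-k needs k iterations.  Here we approximate the
# dominant eigenvalue of the Fibonacci transition matrix, i.e. the root
# of mu^2 - mu - 1 isolated in [1, 2].
def bisect_root(p, lo, hi, tol=1e-12):
    assert p(lo) * p(hi) <= 0  # the interval must isolate a sign change
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p(lo) * p(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

golden = bisect_root(lambda m: m * m - m - 1, 1.0, 2.0)
assert abs(golden - (1 + 5 ** 0.5) / 2) < 1e-10
```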

Results. The experimental results are presented in Table 1. In the table, the column ‘Loop’ specifies the name of the benchmark, ‘Dim(ension)’ the number of program variables, ‘\(\mu \)’ the values obtained through eigenvalues of the transition matrices (marked with \(\mathfrak {e}\)) or through boundary points of the intervals in the feasible domain, and ‘Invariants’ the affine invariants generated by our approach. We compare our approach with the existing generators StInG [40] and InvGen [22], where ‘\(=\)’, ‘>’, ‘\(\gg \)’ and ‘\(\ne \)’ mean that the generated invariants are identical, more accurate, only obtainable by our approach, and incomparable, respectively. Table 2 compares the runtimes of our approach, StInG and InvGen, measured in seconds. Note that the runtimes of StInG and InvGen are obtained by executing their binaries on our platform.

Table 1. Experimental Results of Invariants
Table 2. Experimental Results of Execution Time (s)

Analysis. StInG [40] implements the constraint-solving method proposed in [12, 37], InvGen [22] integrates both the constraint-solving method and abstract interpretation, while our approach uses matrix algebra to refine and upgrade the constraint-solving method. Based on the results in Table 1 and Table 2, we conclude the following:

  • For the benchmarks with rather simple transition matrices (identity or diagonal matrices), our approach covers or outnumbers the invariants generated by StInG and InvGen.

  • For the benchmarks with complicated transition matrices (i.e., matrices far from diagonal ones), especially those with irrational eigenvalues, our approach generates accurate invariants while StInG and InvGen generate nothing or only trivial invariants.

  • For all benchmarks, StInG and InvGen run faster, but their runtimes are comparable to ours, which shows the efficiency of our approach.

Summarizing the above, the experimental results demonstrate the wider coverage of the \(\mu \) values afforded by our approach, and show its generality and efficiency.