Keywords

1 Introduction

Satisfiability Modulo Theories (SMT) solving plays a crucial role in automated reasoning as it combines the power of Boolean satisfiability (SAT) with various mathematical background theories [3]. This connection enables the automated verification and synthesis of systems [15] that require reasoning in more expressive logical theories, for example real/integer arithmetic.

State-of-the-art SMT solvers employ a combination of Boolean level reasoning and theory-specific algorithms. This is achieved either through the use of the CDCL(T) paradigm [16] or the model-constructing satisfiability (MCSat) approach [11, 14]. The MCSat algorithm lifts the Boolean-level CDCL algorithm to the theory level, while kee** the search theory independent. This approach is particularly effective for handling complex arithmetic theories. For instance, Yices2  [5] uses the MCSat approach to handle non-linear arithmetic constraints.

Finite fields offer an ideal framework for modeling bounded machine arithmetic, particularly relevant in the context of contemporary cryptosystems utilized in system security and post-quantum cryptography. Current methodologies, for instance, develop private and secure systems using zero-knowledge (ZK) proofs [7] or authenticate blockchain technologies like smart contracts [19]. Verifying applications in such areas require efficient SMT solvers that support reasoning over finite field arithmetic, e.g., verification of a compiler for ZK proofs [18].

Related Work. Currently, the related work on SMT solving in finite field arithmetic is still rather limited. Our own theoretical work [9] on MCSat approaches based on finding zero decompositions comes with a proof-of-concept implementation that facilitates only a very fundamental MCSat algorithm, has only limited support of Boolean propagation, and is unable to parse SMT-LIB 2  [2].

The only other SMT solver that we are aware of being capable of reasoning over finite fields is cvc5  [1, 17], which uses a classical CDCL(T) approach. As a theory engine, Gröbner bases [4] reasoning over a set of polynomial equalities is applied. If the derived Gröbner basis contains the constant 1, then the system is unsatisfiable and a conflict core for the CDCL(T) search can be found. Otherwise, a guided enumeration of all possible solutions is performed to search for a model.

Note that both approaches [9, 17] use complementary techniques. On the one hand, Gröbner bases are highly engineered to find conflicts in the polynomial input, which tends to help for unsatisfiable instances [17]. On the other hand, a model constructing approach tends to be fast whenever there is a solution, especially when there is a high number of models [9].

We further note that at the moment our implementation in Yices2, as well as cvc5, is restricted to finite fields where the order (i.e. size) is prime. This limitation is sufficient for many applications in cryptosystems and ZK proofs.

Besides using dedicated finite field solvers, problems over prime fields can be encoded in integer arithmetic using the modulo operator. Further, since terms are bounded, encoding as bit-vectors for subsequent bit-blasting is possible. However, prior experiments have shown that those encodings perform poorly on existing solvers [17].

Contributions. We present an integration of the theory of non-linear finite field arithmetic in the Yices2 SMT solver [5], enabling it to reason over finite field problems. This includes the following contributions which we will further explain throughout the rest of this paper:

  • Adding the parsing of finite field problems to the Yices2 SMT-LIB 2 front-end (Sect. 3).

  • Representing finite field polynomials as terms in Yices2 and implementing features regarding such polynomials in the LibPoly library [12], which is the library used for polynomials in Yices2 (Sect. 4).

  • Implementing and evaluating an MCSat theory back-end for finite field reasoning, using existing concepts from non-linear real and bit-vector solving from Yices2 (Sect. 5).

To the best of our knowledge, our work is currently the only finite field instantiation of MCSat. While our initial theory reasoning approach follows closely the explanation generation procedure of our previous work [9], our implementation allows easy drop-in of an improved explanation procedure in the future.

2 Preliminaries

In mathematics, a finite field is a field that contains a finite number of elements. A finite field \(\mathbb {F}_p\) of prime order p can roughly be seen as the representation of the integers modulo the prime p. We refer to [9, 17] for details on the theory and representation of finite fields. Since there is no inherent order on finite fields, polynomial constraints are either equalities \(p = 0\) or disequalities \(p \ne 0\) for a finite field polynomial p.

For SMT solving in finite fields, we are interested in the following problem:

figure a

For example, the formula \(\mathcal {F}_1 = (x-1=0 \vee y-1=0) \wedge (xy-1 =0)\) is satisfied by the assignment \(x \mapsto 1, y\mapsto 1\) in \(\mathbb {F}_3\); whereas the formula \(\mathcal {F}_2 = (x-1=0 \vee y-1=0) \wedge (xy-1 =0) \wedge (x-2 = 0)\) is unsatisfiable in \(\mathbb {F}_3\).

Yices2 and MCSat. Yices2 contains two main solvers, one based on the traditional CDCL(T) approach [16] and one based on the MCSat approach [13, 14]. Yices2 ’s common API and front-ends can automatically select which solver to use at runtime, depending on an SMT-LIB 2 logic. In particular, when non-linear real or integer arithmetic constraints are present Yices2 selects the MCSat solver. The MCSat solver in Yices2 currently supports the theories of non-linear real arithmetic (QF_NRA) [13] and integer arithmetic (QF_NIA) [10], bit-vectors (QF_BV) [8], equality and uninterpreted functions (QF_EUF), arrays [6], and combinations thereof.

In contrast to CDCL(T) that complements CDCL with theory reasoning, MCSat applies CDCL-like mechanisms to perform theory reasoning. Specifically, it explicitly and incrementally constructs models with first-order variable assignments—maintained in a trail—while maintaining the invariant that none of the constraints evaluate to false. MCSat decides upon such assignments when there is choice, it can propagate them when there is not, and it backtracks upon conflict. The lemmas learned upon backtracking are based on theory-specific explanations of conflicts and propagations. This theory-specific reasoning is implemented through plugins that provide interfaces to make decisions, perform propagations, and produce explanations.

3 Usability of SMT Solving in Finite Fields

Support for finite field reasoning in Yices2 is available on the master branchFootnote 1and will be included in the next release (2.7). The theory of finite fields can be accessed using a not-yet official extension of the SMT-LIB 2 language (.smt2).

SMT-LIB 2 Parsing. Extending the parser to handle finite field problems was our first extension to Yices2. Currently, polynomials over finite fields are no official theory in SMT-LIB 2  [2]. However, when implementing finite field support in cvc5  [1], an extension was proposed in [17]. In the interest of kee** inputs and benchmarks comparable, we aimed at a compatible implementation. Standardization efforts to create an official SMT-LIB 2 theory for finite field arithmetic are currently ongoing.

In the SMT-LIB 2 extension, the theory of (quantifier-free) non-linear finite field arithmetic is denoted as . Elements can be defined using the sort . The sort is indexed by the order of the finite field, which is required to be a prime number. For instance defines the finite field of order 3. Constants are indexed with the field order to indicate which finite field they belong in, e.g., . Note that the integer following is interpreted modulo the field order. As a short-cut to avoid rewriting the field order over and over again, the keyword can be used to interpret the constant in the correct field type: for a defined finite field sort . To specify the finite field operations and are available for multiplication and addition of finite field values, respectively. Both support an arbitrary number of operators. Atoms with finite field terms can be with its respective meaning. For example, an encoding of \(\mathcal {F}_1\) can be seen in Fig. 1.

Fig. 1.
figure 1

Example for an SMT-LIB 2 encoding of a finite field problem \(\mathcal {F}_1\).

4 Implementation Details

The Implementation of MCSat in Yices2 . The MCSat solver in Yices2 supports multiple theories via a notion of theory plugin that builds upon an earlier architecture [11]. An MCSat theory plugin in Yices2 implements a number of functionalities that are given to the main MCSat solver as function pointers. The main MCSat loop calls these functions for theory-specific operations such as deciding or propagating the value of variables or getting explanation lemmas, or upon certain events such as the creation of new terms and lemmas. In return, a theory plugin can access theory-generic mechanisms for, e.g., inspecting the MCSat trail, creating variables and requesting to be notified of certain events like variable assignments, as well as raising conflicts. A theory plugin is not required to implement mechanisms for propagating theory assignments and explaining them, but for the current theories in Yices2, such propagations have provided noticeable speed-ups (see, e.g., [8]).

The Finite Field MCSat Plugin. Before handling constraints in the finite field MCSat plugin, the input assertions are represented as polynomial constraints. Limited preprocessing (e.g., constant propagation) is performed at this step. Internally, the plugin only handles polynomial equalities and disequalities. The implementation of the finite field plugin follows an approach similar to the plugin for non-linear arithmetic [10].

Using the MCSat trail, the finite field plugin reads which constraints must be fulfilled at any given time (as decided or deduced by the Boolean plugin) and tracks the assignment of values to polynomial variables. It also tracks, for each polynomial variable, the set of feasible values that the variable can be assigned without any of the polynomial constraints evaluating to false: Using watch lists, it detects when any of the constraints becomes unit, i.e. when all of its variables but one have been assigned values. Upon such detection, it computes how the constraint restricts the set of feasible values for the last remaining variable, using regular univariate polynomial factorization. When that set becomes empty, the plugin reports a theory conflict to the main MCSat engine. Given a conflict core and the current assignment, the explanation procedure in the plugin generates a (globally valid) lemma that explains the conflict in that it excludes a class of assignments (including the current one) that all violate the conflict core. The MCSat engine performs conflict analysis using theory explanations and Boolean resolutions, and either backtracks if it can or concludes unsatisfiability. On the other hand, the instance is satisfiable once all variables are assigned a value.

Finite Field Explanations. In our earlier work [9], we presented an explanation procedure for finite fields. This approach employs subresultant regular subchains (SRS) [20] between conflicting polynomials to provide new polynomial constraints that can be propagated. In a nutshell, SRS can be used to construct a generalized greatest common divisor (GCD) of polynomials that takes into account the current partial variable assignment on the trail. The computed GCD is utilized in a zero decomposition procedure to reduce the degree of the conflicting polynomials until we can learn a polynomial constraint that excludes the current partial assignment. This constraint is added as an explanation clause to resolve the conflict. We implemented the procedure of [9] in the current version of Yices2 using LibPoly. However, it is important to note that there are other solving techniques for polynomial systems over finite field that could potentially be utilized to develop an explanation method suitable for an MCSat-based search. Furthermore, it is still an open question how different techniques perform in an MCSat environment. That is why we have kept the explanation procedure encapsulated in our implementation, allowing for easy extension in order to support development and evaluation of future explanation procedures.

5 Evaluation

Since finite field solving is a rather new endeavor in the world of SMT, no extensive set of SMT-LIB 2 benchmarks exists yet. For the evaluation we have selected the benchmark sets presented in the papers describing the theory behind the implementation of Yices2 and cvc5:

  1. (i)

    The polynomial sets from our prior work [9], consisting of 325 instances. These benchmarks primarily contains finite fields up to order 211, using two classes of polynomial systems: randomly generated as well as crafted systems. The crafted benchmarks are product of (mostly) degree-1 polynomials.

  2. (ii)

    Benchmarks generated using ZK proof compilers presented in [17]. Besides polynomial equations, these 1602 benchmark instances also contain Boolean structure. The field order varies form small (11) up to vast (more than \(2^{255}\)).

Experimental Setup. Our experiments were run on an AMD EPYC 7502 CPU with a timeout of 300 seconds per benchmark instance. We compared our implementation of Yices2 against cvc5 version 1.1.1, which is the latest released version at the time of writing. We are not aware of any further SMT solvers supporting the theory of non-linear finite fields to be included in the comparison.

Experimental Results. The performance comparison between the two solvers on the first benchmark set can be seen in Fig. 2 and Fig. 3 (left). It is clear to see that the random instances are harder to solve than the crafted instances (which have significantly more variables). We believe that this is due to the lack of internal structure in random polynomials. This makes symbolic handling of those polynomial systems hard, both for Gröbner basis computation (in cvc5) as well as SRS computation (in Yices2).

Fig. 2.
figure 2

Runtime comparison for benchmarks from [9] (in seconds, timeout 300 s) result: sat o, unsat x; finite field order: , , and

Note that cvc5 performs most symbolic computation upfront (when generating the Gröbner basis) and enumerates potential solutions in a second step (using auxiliary Gröbner basis calls). The MCSat approach in Yices2, on the other hand, interleaves model generation and symbolic computation during the search. This tends to be an advantage for harder polynomial systems especially together with small finite field orders. When the finite field order increases, this advantage seems to vanish. For the crafted polynomial benchmarks, Yices2 tends to be faster. We believe that this is due to the fact that the polynomials tend to be large (in the number of monomials), but rather easy to solve. Generating a full Gröbner basis upfront might add significant overhead.

For the second benchmark set, many instances can be solved by both solvers immediately (c.f. Fig. 3 right). We believe that those instances can be solved without extensive finite field reasoning, as the benchmark set contains Boolean structure. This enables both solvers to successfully solve benchmarks even with vast field orders. However, once extensive algebraic reasoning is required in finite fields of vast order (the majority of the benchmarks), the purely symbolic approach of cvc5 in proving unsatisfiability seems to be advantageous. An MCSat approach requires to pick actual values in a gigantic search space, thus especially strong lemmas need to be learned in order to prune the search space efficiently. Improving the explanation procedure is part of our future work.

Fig. 3.
figure 3

Instances solved over time (timeout 300s) by Yices2 and cvc5 from [9] (left) and [17] (right).

6 Summary and Outlook

In this system description we have presented the first implementation of an MCSat-based decision procedure for non-linear finite field polynomials. We have shown that MCSat is a feasible way of solving SMT instances over finite fields and it compares well with SMT approaches using Gröbner bases for many instances.

The presented tool implementation is well suited for future experiments and the rapid development of more advanced explanation procedures that will eliminate the current bottlenecks with regard to large finite fields.