
1 Introduction

As the growth of big data analysis has raised many concerns about the security and privacy of data, research on secure computation has attracted considerable attention in the cryptographic community. Homomorphic Encryption (HE) is a cryptosystem that allows an arbitrary circuit to be evaluated on encrypted data without decryption. It is one of the most promising solutions for outsourcing computation and securely aggregating sensitive information of individuals. After the first construction of fully homomorphic encryption by Gentry [20], several works [7, 11, 16,17,18] have improved the efficiency of HE schemes.

There are a few software implementations of HE schemes based on the Ring Learning with Errors (RLWE) problem such as \(\texttt {HElib}\) [25] of the BGV scheme [7] and \(\texttt {SEAL}\) [32] of the BFV scheme [6, 18]. These HE schemes are constructed over the residue ring of a cyclotomic ring (with a huge characteristic), so they perform modular operations on high-degree polynomials, which degrades performance. For an efficient implementation of polynomial arithmetic, Gentry et al. [21] suggested a representation of cyclotomic polynomials, called the double-CRT representation, based on the Chinese Remainder Theorem (CRT). The first CRT layer uses the Residue Number System (RNS) to decompose a polynomial into a tuple of polynomials with smaller moduli. The second layer converts each of the small polynomials into a vector of modular integers via the Number Theoretic Transform (NTT). In the double-CRT representation, an arbitrary polynomial is identified with a matrix of small integers, which enables efficient polynomial arithmetic through component-wise modular operations. This technique became one of the core optimizations used in implementations of HE schemes [1, 25, 32].

Cheon et al. [11] recently suggested an HE scheme for arithmetic of approximate numbers, called HEAAN. The main idea of their construction is to consider an RLWE error as a part of an error occurring during approximate computations. Besides homomorphic addition and multiplication, it supports an approximate rounding operation of significant digits on packed ciphertexts. This approximate HE scheme shows remarkable performance in real-world applications that require arithmetic over the real numbers [27, 7, 21]. The NTT conversion can be done efficiently when the approximate bases \(q_\ell \)’s are prime numbers satisfying \(q_\ell \equiv 1 \pmod {2N}\). We give a list of candidate bases to show that there are sufficiently many distinct primes satisfying both conditions for the double-CRT representation.

The homomorphic multiplication algorithm of \(\texttt {HEAAN}\) includes modulus switching procedures that convert an element of \({R}_{Q}\) into \({R}_{P\cdot Q}\) for a sufficiently large integer P and switch back to the original modulus Q. These non-arithmetic operations are difficult to perform in the RNS, so one would have to recover the coefficient representation of an input polynomial. As an optimization, we adapt an idea of Bajard et al. [3] to suggest approximate modulus switching algorithms with small errors. Instead of the exact computation in the original scheme, our approximate modulus raising algorithm finds an element \(\tilde{a} \in {R}_{P\cdot Q}\) satisfying \(\tilde{a}\equiv a \pmod {Q}\) and \(\Vert {\tilde{a}}\Vert \ll P\cdot Q\) for a given polynomial \(a\in {R}_{Q}\). Conversely, the approximate modulus reduction algorithm returns an element \(b\in {R}_{Q}\) such that \(P\cdot b\approx \tilde{b}\) for an input polynomial \(\tilde{b}\in {R}_{P\cdot Q}\). These procedures impose relaxed conditions on the output polynomials, and in return they can be performed directly on the RNS representation. In addition, we show that the correctness of the HE system is still guaranteed with some small additional error.

Related Works. There have been several studies [5, \(\ell \) ciphertext modulus as \(Q_\ell =\prod _{i=0}^\ell q_i\), so that the ciphertext moduli in the consecutive levels have almost the same ratio \(Q_\ell /Q_{\ell -1}=q_\ell \approx q\). The rescaling algorithm with a factor of \(q_\ell \) converts an encryption of m at level \(\ell \) into an encryption of \(q_\ell ^{-1}\cdot m\) at level \((\ell -1)\). This operation has an additional error from the approximation of q, but we can manage the size of an error not to destroy the significant digits of a plaintext. An approximation error is bounded by

$$\begin{aligned} \left| q_\ell ^{-1}\cdot m-q^{-1}\cdot m\right|&=\left| 1-q_\ell ^{-1}\cdot {q}\right| \cdot \left| q^{-1}\cdot m\right| \le 2^{-\eta }\cdot \left| q^{-1}\cdot m\right| ,\end{aligned}$$

so it does not destroy the significant digits of an encrypted plaintext when \(\eta \) is sufficiently larger than the bit precision of an encrypted plaintext.
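The bound above can be checked numerically. The following is a minimal sketch, assuming toy parameters \(q=2^{30}\) and \(N=2^{10}\) that are not taken from the paper; the helper names `approximate_basis` and `precision_bits` are hypothetical. It searches for primes \(q_\ell \equiv 1 \pmod {2N}\) near q, derives the largest \(\eta \) with \(|1-q_\ell ^{-1}\cdot q|\le 2^{-\eta }\), and evaluates the rescaling error for one sample value m with exact rational arithmetic.

```python
from fractions import Fraction

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_prime(n: int) -> bool:
    """Miller-Rabin with fixed bases; deterministic for every n below 2**64."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for a in SMALL_PRIMES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False      # the base a witnesses that n is composite
    return True

def approximate_basis(q: int, N: int, count: int) -> list:
    """Scan outwards from q for primes p with p = 1 (mod 2N)."""
    step = 2 * N
    base = q - (q % step) + 1
    found, t = [], 0
    while len(found) < count:
        candidates = (base,) if t == 0 else (base + t * step, base - t * step)
        for cand in candidates:
            if len(found) < count and cand > 1 and is_prime(cand):
                found.append(cand)
        t += 1
    return found

def precision_bits(q: int, primes: list) -> int:
    """Largest integer eta with |1 - q/p| <= 2**-eta for every p in the basis."""
    worst = max(abs(1 - Fraction(q, p)) for p in primes)
    if worst == 0:
        raise ValueError("every basis element equals q, so eta is unbounded")
    eta = 0
    while Fraction(1, 2 ** (eta + 1)) >= worst:
        eta += 1
    return eta

q, N = 2 ** 30, 2 ** 10          # hypothetical toy parameters
basis = approximate_basis(q, N, 5)
eta = precision_bits(q, basis)
m = 123_456_789_012_345          # sample plaintext value before rescaling
errors = [abs(Fraction(m, p) - Fraction(m, q)) for p in basis]
```

Each entry of `errors` is at most \(2^{-\eta }\cdot |q^{-1}\cdot m|\) for the computed `eta`, which is the inequality stated above.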

3.2 Approximate Modulus Switching

The use of an approximate basis enables an implementation of the HEAAN scheme using the RNS representation. However, HEAAN includes some non-arithmetic operations that cannot be directly implemented on the RNS components. Specifically, homomorphic multiplication and rescaling procedure require an exact modulus switching algorithm, and the key-switching technique for rotation and conjugation also contains the same operation (see [9, 11] for details).

We remark that the goal of modulus switching algorithms in [11] can be reduced to a problem that finds a ciphertext with a small error while keeping the correctness of the HE scheme. In this section, we propose an idea to approximately perform the modulus switching algorithms on the RNS representation. A full RNS variant of HEAAN will be described in the next section based on this method. Throughout this paper, we will denote by \({\mathcal D}=\{p_0,\dots ,p_{k-1},q_0,\dots ,q_{\ell -1}\}\), \({\mathcal B}=\{p_0,\dots ,p_{k-1}\}\), and \({\mathcal C}=\{q_0,\dots ,q_{\ell -1}\}\) an RNS basis and its subbases, respectively, with \(P=\prod _{i=0}^{k-1} p_i\) and \(Q=\prod _{j=0}^{\ell -1} q_j\).

Approximate Modulus Raising. Suppose that we are given the RNS representation \([a]_{\mathcal C}\) of an integer \(a\in {\mathbb Z}_Q\). The purpose of the approximate modulus raising algorithm, denoted by \(\mathtt {ModUp}\), is to find the RNS representation of an integer \(\tilde{a}\in {\mathbb Z}_{PQ}\) with respect to the basis \({\mathcal D}\) satisfying two conditions \(\tilde{a} \equiv a \pmod {Q}\) and \(|\tilde{a}| \ll P \cdot Q\). From the first condition \([\tilde{a}]_{\mathcal C}=[a]_{\mathcal C}\), we only need to generate the RNS representation of \(\tilde{a}\) with the basis \({\mathcal B}\) and it can be done by applying the fast conversion algorithm. See Algorithm 1 for a description of the approximate modulus raising.

Algorithm 1. Approximate modulus raising \(\mathtt {ModUp}_{{\mathcal C}\rightarrow {\mathcal D}}\)

As described in Sect. 2.3, the fast conversion algorithm in Algorithm 1 returns \([a + Q\cdot e]_{\mathcal B}\in \prod _{i=0}^{k-1}{\mathbb Z}_{p_i}\) for some integer e with \(|e| \le \ell /2\). Therefore, the output of \(\mathtt {ModUp}\) algorithm is the RNS representation of \(\tilde{a}:=a+Q\cdot e\) with respect to the basis \({\mathcal D}={\mathcal B}\cup {\mathcal C}\), as desired.
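The behaviour of \(\mathtt {ModUp}\) on a single integer can be illustrated with a short sketch. It assumes small toy moduli and centered representatives in \((-q_j/2, q_j/2]\) for the inner residues, which is one way to obtain \(|e|\le \ell /2\); the function names are hypothetical, and the lifted big integer is returned only so that the example can be checked.

```python
from math import prod

def centered(x: int, q: int) -> int:
    """Representative of x mod q in the interval (-q/2, q/2]."""
    r = x % q
    return r - q if r > q // 2 else r

def fast_conv(res_C, C, B):
    """Fast basis conversion Conv_{C->B}; also returns the lifted integer a + Q*e."""
    Q = prod(C)
    total = 0
    for a_j, q_j in zip(res_C, C):
        q_hat = Q // q_j
        total += centered(a_j * pow(q_hat, -1, q_j), q_j) * q_hat
    return [total % p for p in B], total

def mod_up(res_C, C, B):
    """ModUp_{C->D}: prepend the converted residues to the unchanged input residues."""
    conv, lifted = fast_conv(res_C, C, B)
    return conv + list(res_C), lifted

C, B = [97, 101, 103], [107, 109]      # toy bases, pairwise coprime
a = -123456                            # an element of Z_Q in centered form
out, lifted = mod_up([a % q for q in C], C, B)
e = (lifted - a) // prod(C)            # exact, since lifted = a (mod Q)
```

Here `out` is the residue vector of `lifted` with respect to \({\mathcal D}\), and `e` is the small multiple of Q introduced by the conversion.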

Approximate Modulus Reduction. Contrary to the modulus raising algorithm, the approximate modulus reduction algorithm, denoted by \(\mathtt {ModDown}\), takes an RNS representation \([\tilde{b}]_{\mathcal D}\) of an integer \(\tilde{b}\in {\mathbb Z}_{P\cdot Q}\) as an input and aims to compute \([b]_{\mathcal C}\) for some integer \(b\in {\mathbb Z}_Q\) satisfying \(b\approx P^{-1}\cdot \tilde{b}\).

We point out that the goal of approximate modulus reduction is reduced to a problem of finding small \(\tilde{a}=\tilde{b}-P\cdot b\) satisfying \(\tilde{a}\equiv \tilde{b} \pmod P\). The RNS representation \([\tilde{b}]_{\mathcal D}\) is the concatenation of \([\tilde{b}]_{\mathcal B}\) and \([\tilde{b}]_{\mathcal C}\). We first take the first component \([\tilde{b}]_{\mathcal B}=(\tilde{b}^{(0)},\dots ,\tilde{b}^{(k-1)})\), which is the same as \([a]_{\mathcal B}\) for \(a=[\tilde{b}]_P\in {\mathbb Z}_P\). Then we apply the fast conversion algorithm to compute the RNS representation \([\tilde{a}]_{\mathcal C}\) of \(\tilde{a}=a+P\cdot e\) for some small e. Note that \(\tilde{a}\equiv \tilde{b} \pmod P\) and \(|\tilde{a}|\ll P\cdot Q\) from the property of \(\mathtt {Conv}_{{\mathcal B}\rightarrow {\mathcal C}}(\cdot )\). Finally, we derive the RNS representation of \(b=P^{-1}\cdot (\tilde{b}-\tilde{a})\) with respect to the basis \({\mathcal C}\) by computing \( \left( \prod _{i=0}^{k-1} p_i\right) ^{-1}\cdot \left( [\tilde{b}]_{\mathcal C}-[\tilde{a}]_{\mathcal C}\right) \in \prod _{j=0}^{\ell -1} {\mathbb Z}_{q_j}. \) See Algorithm 2 for a description.

Algorithm 2. Approximate modulus reduction \(\mathtt {ModDown}_{{\mathcal D}\rightarrow {\mathcal C}}\)
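The three steps of the approximate modulus reduction can be sketched on a single integer as follows. This is an illustration under the same assumptions as before (toy moduli, centered inner residues, hypothetical function names); the helper `crt_centered` is used only to verify the result and is not part of the algorithm.

```python
from math import prod

def centered(x: int, q: int) -> int:
    r = x % q
    return r - q if r > q // 2 else r

def fast_conv(residues, src, dst):
    """Fast basis conversion from the basis src to the basis dst."""
    S = prod(src)
    total = 0
    for r, s in zip(residues, src):
        s_hat = S // s
        total += centered(r * pow(s_hat, -1, s), s) * s_hat
    return [total % d for d in dst]

def mod_down(res_D, B, C):
    """ModDown_{D->C}: residues of b = (b~ - a~) / P with a~ = [b~]_P + P*e."""
    k = len(B)
    res_B, res_C = res_D[:k], res_D[k:]
    a_tilde_C = fast_conv(res_B, B, C)           # [a~]_C
    P = prod(B)                                  # only [P^{-1}]_{q_j} is needed below
    return [pow(P, -1, q) * (b - a) % q for b, a, q in zip(res_C, a_tilde_C, C)]

def crt_centered(residues, moduli):
    """Verification helper: centered CRT reconstruction."""
    M = prod(moduli)
    x = sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(residues, moduli)) % M
    return x - M if x > M // 2 else x

B, C = [107, 109], [97, 101, 103]                # toy bases
b_tilde = 3_141_592_653                          # an element of Z_{PQ}
res_D = [b_tilde % m for m in B + C]
b = crt_centered(mod_down(res_D, B, C), C)
```

The reconstructed `b` satisfies \(|P\cdot b-\tilde{b}|\le k\cdot P/2\), so it differs from \(P^{-1}\cdot \tilde{b}\) by at most k/2.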

Word Operations. In the rest of the paper, the arithmetic operations (e.g. addition and multiplication) modulo a “word-size” integer will be called the word operations. Now suppose that \(p_i\)’s and \(q_j\)’s are word-size integers. As mentioned before, the fast conversion algorithm \(\mathtt {Conv}_{{\mathcal C}\rightarrow {\mathcal B}}([a]_{\mathcal C})\) outputs the tuple \(\left( \sum _{j=0}^{\ell -1}[a^{(j)}\cdot \hat{q}_{j}^{-1}]_{q_j}\cdot \hat{q}_j \pmod {p_i} \right) _{0\le i< k}\) for \(\hat{q}_j=\prod _{j'\ne j} q_{j'}\). Each component can be computed using the values \([\hat{q}_j^{-1}]_{q_j} = \prod _{j'\ne j} q_{j'}^{-1} \pmod {q_j}\) and \([\hat{q}_j]_{p_i} = \prod _{j'\ne j} q_{j'} \pmod {p_i}\) while avoiding the computation of big integers \(\hat{q}_j\). In addition, if we pre-compute and store these values, which depend only on the bases \({\mathcal B}\) and \({\mathcal C}\), then the computation cost of \(\mathtt {Conv}_{{\mathcal C}\rightarrow {\mathcal B}}(\cdot )\) algorithm can be reduced down to \(O(k\cdot \ell )\) word operations.
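The role of the pre-computed constants can be made concrete with a sketch that never forms \(\hat{q}_j\) as a big integer. The moduli below are Mersenne primes chosen only for convenience, and the function names are hypothetical; `conv_bigint` is a reference implementation included for comparison.

```python
from math import prod

def precompute(C, B):
    """Tables [q_hat_j^{-1}]_{q_j} and [q_hat_j]_{p_i}, built from word-size products."""
    inv_hat, hat_mod = [], []
    for j, q_j in enumerate(C):
        inv = 1
        for jp, q in enumerate(C):
            if jp != j:
                inv = inv * pow(q, -1, q_j) % q_j
        inv_hat.append(inv)
        row = []
        for p in B:
            h = 1
            for jp, q in enumerate(C):
                if jp != j:
                    h = h * (q % p) % p
            row.append(h)
        hat_mod.append(row)
    return inv_hat, hat_mod

def conv_words(res_C, C, B, inv_hat, hat_mod):
    """Conv_{C->B} using l + k*l modular multiplications on word-size values."""
    ys = []
    for a_j, q_j, inv in zip(res_C, C, inv_hat):
        y = a_j * inv % q_j
        ys.append(y - q_j if y > q_j // 2 else y)   # centered representative
    out = []
    for i, p in enumerate(B):
        acc = 0
        for j, y in enumerate(ys):
            acc = (acc + y * hat_mod[j][i]) % p
        out.append(acc)
    return out

def conv_bigint(res_C, C, B):
    """Reference: the same sum formed with the big integers q_hat_j."""
    Q = prod(C)
    total = 0
    for a_j, q_j in zip(res_C, C):
        q_hat = Q // q_j
        y = a_j * pow(q_hat, -1, q_j) % q_j
        total += (y - q_j if y > q_j // 2 else y) * q_hat
    return [total % p for p in B]

C = [2**31 - 1, 2**61 - 1, 2**19 - 1]   # toy moduli (Mersenne primes)
B = [2**17 - 1, 2**13 - 1]
a = 0xDEADBEEFCAFEF00D
res_C = [a % q for q in C]
tables = precompute(C, B)
```

Calling `conv_words(res_C, C, B, *tables)` gives the same residues as `conv_bigint(res_C, C, B)`, while the tables depend only on the two bases and can be reused for every input.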

Complexity of Approximate Modulus Switching. Our modulus switching algorithms have an advantage, in that they can be computed only using word operations. For example, \(\mathtt {ModUp}_{{\mathcal C}\rightarrow {\mathcal D}}([a]_{\mathcal C})\) requires exactly the same computation as \(\mathtt {Conv}_{{\mathcal C}\rightarrow {\mathcal B}}([a]_{\mathcal C})\), so its total complexity is bounded by \(O(k\cdot \ell )\) word operations. The approximate modulus reduction algorithm needs to compute \(b^{(j)}=P^{-1}\cdot (\tilde{b}^{(k+j)}-\tilde{a}^{(j)}) \pmod {q_j}\) for \(0\le j< \ell \) as well as the fast conversion algorithm. The computation of \(b^{(j)}\)’s can be done in \(O(\ell )\) word operations using the pre-computable constants \([P^{-1}]_{q_j}=\left( \prod _{i=0}^{k-1} p_i\right) ^{-1} \pmod {q_j}\). Therefore, the total complexity of \(\mathtt {ModDown}\) is bounded by \(O(k\cdot \ell +\ell )=O(k\cdot \ell )\) word operations.

The approximate modulus switching algorithms can be extended to algorithms over the polynomial rings as

$$\begin{aligned} \mathtt {ModUp}_{{\mathcal C}\rightarrow {\mathcal D}}(\cdot ):&\prod _{j=0}^{\ell -1}{R}_{q_j}\rightarrow \prod _{i=0}^{k-1}{R}_{p_i}\times \prod _{j=0}^{\ell -1}{R}_{q_j},\\ \mathtt {ModDown}_{{\mathcal D}\rightarrow {\mathcal C}}(\cdot ):&\prod _{i=0}^{k-1}{R}_{p_i}\times \prod _{j=0}^{\ell -1}{R}_{q_j}\rightarrow \prod _{j=0}^{\ell -1}{R}_{q_j} \end{aligned}$$

by applying them coefficient-wise. These operations require \(O(k\cdot \ell \cdot N)\) word operations, where N is the degree of the power-of-two cyclotomic ring.

4 A Full RNS Variant of the Approximate HE

In this section, we propose a variant of \(\texttt {HEAAN}\) based on the full RNS representation. For simplicity, we choose a power-of-two integer N and consider the (2N)-th cyclotomic field \(K={\mathbb Q}[X]/(X^N+1)\) and its ring of integers \({R}={\mathbb Z}[X]/(X^N+1)\). An arbitrary element of K is expressed as a polynomial with rational coefficients of degree strictly less than N, and identified with the vector of its coefficients in \({\mathbb Q}^N\). The rounding operation on K and the modulo operation on \({R}\) will be defined by the coefficient-wise rounding and modulo operations, respectively. In the following, we present a concrete description of a full RNS variant of \(\texttt {HEAAN}\).

\(\underline{\texttt {Setup}(q,L,\eta ;1^\lambda )}\). A base integer q, the number of levels L, and the bit precision \(\eta \) are given as inputs with the security parameter \(\lambda \).

  • Choose a basis \({\mathcal D}=\{p_0,\dots ,p_{k-1},q_0,q_1,\dots ,q_L\}\) such that \( {q_j}/{q}\in (1-2^{-\eta },1+2^{-\eta })\) for \(1\le j\le L\). We write \({\mathcal B}=\{p_0,\dots ,p_{k-1}\}\), \({\mathcal C}_\ell =\{q_0,\dots ,q_\ell \}\), and \({\mathcal D}_\ell = {\mathcal B}\cup {\mathcal C}_\ell = \{p_0,\dots ,p_{k-1},q_0,\dots ,q_\ell \}\) for \(0\le \ell \le L\). Let \(P=\prod _{i=0}^{k-1}p_i\) and \(Q=\prod _{j=0}^Lq_j\).

  • Choose a power-of-two integer N.

  • Choose a secret key distribution \(\chi _\mathsf {key}\), an encryption key distribution \(\chi _\mathsf {enc}\), and an error distribution \(\chi _\mathsf {err}\) over \({R}\).

  • Let \(\hat{p}_i=\prod _{0\le i'< k,i'\ne i} p_{i'}\) for \(0\le i< k\). Compute the constants \([\hat{p}_i]_{q_j}\) and \([\hat{p}_i^{-1}]_{p_i}\) for \(0\le i< k\), \(0\le j\le L\).

  • Compute the constants \([P^{-1}]_{q_j}=\left( \prod _{i=0}^{k-1} p_i\right) ^{-1} \pmod {q_j}\) for \(0\le j\le L\).

  • Let \(\hat{q}_{\ell ,j}=\prod _{0\le j'\le \ell , j'\ne j} q_{j'}\) for \(0\le j\le \ell \le L\). Compute the constants \([\hat{q}_{\ell ,j}]_{p_i}\) and \([\hat{q}_{\ell ,j}^{-1}]_{q_j}\) for \(0\le i< k\), \(0\le j\le \ell \le L\).

The constants \([\hat{p}_i]_{q_j}\) and \([\hat{p}_i^{-1}]_{p_i}\) are necessary to compute the conversion \(\mathtt {Conv}_{{\mathcal B}\rightarrow {\mathcal C}_\ell }(\cdot )\) in the \(\mathtt {ModDown}_{{\mathcal D}_\ell \rightarrow {\mathcal C}_\ell }(\cdot )\) algorithm. The constants \([P^{-1}]_{q_j}\) are also used in the algorithm. On the other hand, the constants \([\hat{q}_{\ell ,j}]_{p_i}\) and \([\hat{q}_{\ell ,j}^{-1}]_{q_j}\) are used to compute \(\mathtt {Conv}_{{\mathcal C}_\ell \rightarrow {\mathcal B}}(\cdot )\) for the \(\mathtt {ModUp}_{{\mathcal C}_\ell \rightarrow {\mathcal D}_\ell }(\cdot )\) algorithm.

We choose an RNS basis \({\mathcal D}\) consisting of word-size integers, so that every homomorphic arithmetic can be expressed using word operations (e.g. \(\texttt {uint64\_t}\)). The elements of \({\mathcal B}\) are called the special primes and used in the key-switching procedure. They do not have to be close to q, but their product P should be large enough to get a small key-switching error. The zero-level ciphertext modulus \(Q_0=q_0\) is not necessarily approximate to the base integer q, but it should be larger than the modulus of the encrypted plaintext for the correctness of decryption.

\(\underline{\texttt {KSGen}(s_1,s_2)}\). For given secret polynomials \(s_1, s_2\in {R}\), sample uniform elements \((a'^{(0)},\dots ,\) \( a'^{(k+L)})\leftarrow U\left( \prod _{i=0}^{k-1}{R}_{p_i}\times \prod _{j=0}^L{R}_{q_j}\right) \) and an error \(e'\leftarrow \chi _\mathsf {err}\). Output the switching key \(\mathsf {swk}\) as

$$ \left( \mathsf {swk}^{(0)}=(b'^{(0)},a'^{(0)}),\dots ,\mathsf {swk}^{(k+L)}=(b'^{(k+L)}, a'^{(k+L)})\right) \in \prod _{i=0}^{k-1}{R}_{p_i}^2\times \prod _{j=0}^L{R}_{q_j}^2 $$

where \(b'^{(i)}\leftarrow - a'^{(i)}\cdot s_2+e'\pmod {p_i}\) for \(0\le i< k\) and \(b'^{(k+j)}\leftarrow -a'^{(k+j)}\cdot s_2+[P]_{q_j}\cdot s_1+e'\pmod {q_j}\) for \(0\le j\le L\).

This procedure generates a switching key to convert a ciphertext with the secret key \(s_1\) into a ciphertext encrypting the same message with the secret key \(s_2\). If \(a'\) is the element of \({R}_{P\cdot Q}\) such that \([a']_{\mathcal D}=(a'^{(0)},\dots ,a'^{(k+L)})\), then the switching key \(\mathsf {swk}\) can be seen as the RNS representation of \((b',a')\in {R}_{P\cdot Q}\) in the basis \({\mathcal D}\) for \(b'=-a'\cdot s_2+P\cdot s_1+e' \pmod {P\cdot Q}\).

\(\underline{\texttt {KeyGen}}\).

  • Sample \(s\leftarrow \chi _\mathsf {key}\) and set the secret key as \(\mathsf {sk}\leftarrow (1,s)\).

  • Sample \((a^{(0)},\dots ,a^{(L)})\leftarrow U\left( \prod _{j=0}^L {R}_{q_j}\right) \) and \(e\leftarrow \chi _\mathsf {err}\). Set the public key as

    $$\begin{aligned} \mathsf {pk}\leftarrow \left( \mathsf {pk}^{(j)}=(b^{(j)},a^{(j)})\in {R}_{q_j}^2\right) _{0\le j\le L} \end{aligned}$$

    where \(b^{(j)}\leftarrow -a^{(j)}\cdot s+e \pmod {q_j}\) for \(0\le j\le L\).

  • Set the evaluation key as \(\mathsf {evk}\leftarrow \texttt {KSGen}(s^2,s)\).

The encryption key is the RNS representation of an RLWE sample \((b=-a\cdot s+e, a)\in {R}_{Q_L}^2\) in the basis \({\mathcal C}_L\). The evaluation key \(\mathsf {evk}\) can be used to perform the relinearization operation during homomorphic multiplication. One can generate additional public keys for more functionalities. For example, we need to publish a rotation key (resp. conjugation key) to compute the permutation (resp. conjugation) on plaintext slots as described in [11].

\(\underline{\texttt {Enc}_\mathsf {pk}(m)}\). For \(m\in {R}\), sample \(v\leftarrow \chi _\mathsf {enc}\) and \(e_0,e_1\leftarrow \chi _\mathsf {err}\). Output the ciphertext \(\mathsf {ct}=\left( \mathsf {ct}^{(j)}\right) _{0\le j\le L}\in \prod _{j=0}^L{R}_{q_j}^2\) where \(\mathsf {ct}^{(j)}\leftarrow v\cdot \mathsf {pk}^{(j)}+(m+e_0,e_1) \pmod {q_j}\) for \(0\le j\le L\).

\(\underline{\texttt {Dec}_\mathsf {sk}(\mathsf {ct})}\). For \(\mathsf {ct}= \left( \mathsf {ct}^{(j)}\right) _{0\le j\le \ell }\), output \(\langle {\mathsf {ct}^{(0)},\mathsf {sk}}\rangle \pmod {q_0}.\)

The encryption algorithm generates the RNS representation of a ciphertext \(\mathsf {ct}\) satisfying \([\langle {\mathsf {ct},\mathsf {sk}}\rangle ]_{Q_L}\approx m\). Thus its decryption returns an approximate value of the input plaintext. The encrypted plaintext should satisfy \(\Vert {m}\Vert _\infty \le q_0/2\) in order to recover a correct value.
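A toy instance of key generation, encryption, and decryption in the RNS representation may help to fix the notation. The sketch below makes several simplifying assumptions that are not from the paper: \(N=8\), a two-element basis of Mersenne primes, schoolbook negacyclic multiplication instead of the NTT, and uniform ternary polynomials in place of \(\chi _\mathsf {key}\), \(\chi _\mathsf {enc}\), and \(\chi _\mathsf {err}\). It offers no security and only illustrates the data flow.

```python
import random

N = 8
C = [2**31 - 1, 2**19 - 1]          # toy basis {q_0, q_1}

def mul(a, b, q):
    """Schoolbook product in Z_q[X]/(X^N + 1)."""
    out = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < N:
                out[i + j] = (out[i + j] + x * y) % q
            else:
                out[i + j - N] = (out[i + j - N] - x * y) % q
    return out

def add(a, b, q):
    return [(x + y) % q for x, y in zip(a, b)]

def small(rng):
    """Toy stand-in for the key, encryption, and error distributions."""
    return [rng.choice((-1, 0, 1)) for _ in range(N)]

def keygen(rng):
    s, e = small(rng), small(rng)
    pk = []
    for q in C:
        a = [rng.randrange(q) for _ in range(N)]
        b = add([-x % q for x in mul(a, s, q)], e, q)    # b = -a*s + e (mod q_j)
        pk.append((b, a))
    return s, pk

def encrypt(pk, m, rng):
    v, e0, e1 = small(rng), small(rng), small(rng)
    ct = []
    for (b, a), q in zip(pk, C):
        c0 = add(add(mul(v, b, q), m, q), e0, q)         # v*b + m + e0
        c1 = add(mul(v, a, q), e1, q)                    # v*a + e1
        ct.append((c0, c1))
    return ct

def decrypt(s, ct):
    q0 = C[0]
    c0, c1 = ct[0]                                       # only the q_0 component is used
    raw = add(c0, mul(c1, s, q0), q0)
    return [x - q0 if x > q0 // 2 else x for x in raw]

rng = random.Random(0)
s, pk = keygen(rng)
m = [1000 * (i + 1) for i in range(N)]
ct = encrypt(pk, m, rng)
dec = decrypt(s, ct)
```

With these toy distributions the decryption equals \(m+v\cdot e+e_0+e_1\cdot s\), so every coefficient of `dec` differs from the corresponding coefficient of `m` by at most \(2N+1\).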

\(\underline{\texttt {Add}(\mathsf {ct},\mathsf {ct}')}\). Given two ciphertexts \(\mathsf {ct}=\left( \mathsf {ct}^{(0)},\dots ,\mathsf {ct}^{(\ell )}\right) ,\mathsf {ct}'=\left( \mathsf {ct}'^{(0)},\dots ,\mathsf {ct}'^{(\ell )}\right) \) \(\in \prod _{j=0}^\ell {R}_{q_j}^2\), output a ciphertext \(\mathsf {ct}_\mathsf {add}=\left( \mathsf {ct}_\mathsf {add}^{(j)}\right) _{0\le j\le \ell }\) where \(\mathsf {ct}_\mathsf {add}^{(j)}\leftarrow \mathsf {ct}^{(j)}+\mathsf {ct}'^{(j)} \pmod {q_j}\) for \(0\le j\le \ell \).

\(\underline{\texttt {Mult}_\mathsf {evk}(\mathsf {ct},\mathsf {ct}')}\). Given two ciphertexts \(\mathsf {ct}=\left( \mathsf {ct}^{(j)}=(c_{0}^{(j)},c_{1}^{(j)})\right) _{0\le j\le \ell }\) and \(\mathsf {ct}'=\left( \mathsf {ct}'^{(j)}=(c_{0}'^{(j)},c_{1}'^{(j)})\right) _{0\le j\le \ell }\), perform the following procedures and return a ciphertext \(\mathsf {ct}_\mathsf {mult}\in \prod _{j=0}^\ell {R}_{q_j}^2\).

  1. For \(0\le j\le \ell \), compute

    $$\begin{aligned} d_0^{(j)}\leftarrow&~c_{0}^{(j)}c_{0}'^{(j)} \pmod {q_j},\\ d_1^{(j)}\leftarrow&~c_{0}^{(j)}c_{1}'^{(j)}+c_{1}^{(j)}c_{0}'^{(j)} \pmod {q_j},\\ d_2^{(j)}\leftarrow&~c_{1}^{(j)}c_{1}'^{(j)} \pmod {q_j}. \end{aligned}$$
  2. Compute \(\mathtt {ModUp}_{{\mathcal C}_\ell \rightarrow {\mathcal D}_\ell }(d_2^{(0)}, \dots ,d_2^{(\ell )})= (\tilde{d}_2^{(0)},\dots , \tilde{d}_2^{(k-1)},d_2^{(0)},\dots ,d_2^{(\ell )})\).

  3. Compute

    $$\begin{aligned} \tilde{\mathsf {ct}} = (\tilde{\mathsf {ct}}^{(0)}=(\tilde{c}_0^{(0)},\tilde{c}_1^{(0)}),\dots , \tilde{\mathsf {ct}}^{(k+\ell )}=(\tilde{c}_0^{(k+\ell )},\tilde{c}_1^{(k+\ell )})) \in \prod _{i=0}^{k-1} {R}_{p_i}^2\times \prod _{j=0}^\ell {R}_{q_j}^2 \end{aligned}$$

    where \(\tilde{\mathsf {ct}}^{(i)}=\tilde{d}_2^{(i)}\cdot \mathsf {evk}^{(i)} \pmod {p_i}\) and \(\tilde{\mathsf {ct}}^{(k+j)}=d_2^{(j)}\cdot \mathsf {evk}^{(k+j)} \pmod {q_j}\) for \(0\le i< k\), \(0\le j\le \ell \).

  4. Compute

    $$\begin{aligned} \left( \hat{c}_0^{(0)},\dots ,\hat{c}_0^{(\ell )}\right)&\leftarrow \mathtt {ModDown}_{{\mathcal D}_\ell \rightarrow {\mathcal C}_\ell }\left( \tilde{c}_0^{(0)},\dots ,\tilde{c}_0^{(k+\ell )}\right) ,\\ \left( \hat{c}_1^{(0)},\dots ,\hat{c}_1^{(\ell )}\right)&\leftarrow \mathtt {ModDown}_{{\mathcal D}_\ell \rightarrow {\mathcal C}_\ell }\left( \tilde{c}_1^{(0)},\dots ,\tilde{c}_1^{(k+\ell )}\right) . \end{aligned}$$
  5. Output the ciphertext \(\mathsf {ct}_\mathsf {mult}=(\mathsf {ct}_\mathsf {mult}^{(j)})_{0\le j\le \ell }\) where \(\mathsf {ct}_\mathsf {mult}^{(j)}\leftarrow (\hat{c}_0^{(j)}+d_0^{(j)}, \hat{c}_1^{(j)}+d_1^{(j)})\pmod {q_j}\) for \(0\le j\le \ell \).

We first generate an extended ciphertext \((d_0,d_1,d_2)\) that decrypts to the product of the input plaintexts under the extended secret key \((1,s,s^2)\). As mentioned before, we use the evaluation key to transform \(d_2\) into a normal ciphertext. Our homomorphic multiplication algorithm is somewhat more complicated compared to the ordinary \(\texttt {HEAAN}\) because we switch the ciphertext moduli approximately using our approximate algorithms.
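The structure of the multiplication can be illustrated with a scalar analogue in which ring elements are replaced by integers, so that the secret is a degree-0 "polynomial". The sketch is not the RNS algorithm above: it uses a single composite modulus Q and exact rounding division by P in place of \(\mathtt {ModUp}\) and \(\mathtt {ModDown}\), that is, the exact operations that the approximate algorithms replace. All parameters and names are hypothetical.

```python
import random

def centered(x: int, q: int) -> int:
    r = x % q
    return r - q if r > q // 2 else r

def round_div(t: int, p: int) -> int:
    """Nearest integer to t / p for an integer t and an odd p > 0."""
    return (2 * t + p) // (2 * p)

Q = (2**31 - 1) * (2**19 - 1)    # toy ciphertext modulus
P = 2**61 - 1                    # toy special modulus, larger than Q
rng = random.Random(1)
s = -1                           # toy secret

def encrypt(m):
    c1 = rng.randrange(Q)
    err = rng.choice((-1, 0, 1))
    return ((m + err - c1 * s) % Q, c1), m + err   # ciphertext and its noisy plaintext

# Evaluation key (b', a') with b' = -a'*s + P*s^2 + e' (mod P*Q).
a_evk = rng.randrange(P * Q)
e_evk = rng.choice((-1, 0, 1))
b_evk = (-a_evk * s + P * s * s + e_evk) % (P * Q)

def mult(ct, ct2):
    (c0, c1), (c0p, c1p) = ct, ct2
    d0 = c0 * c0p % Q
    d1 = (c0 * c1p + c1 * c0p) % Q
    d2 = centered(c1 * c1p, Q)
    # Multiply d2 by the evaluation key modulo P*Q, then divide by P with rounding.
    t0 = centered(d2 * b_evk, P * Q)
    t1 = centered(d2 * a_evk, P * Q)
    return ((d0 + round_div(t0, P)) % Q, (d1 + round_div(t1, P)) % Q)

def decrypt(ct):
    return centered(ct[0] + ct[1] * s, Q)

ct_a, noisy_a = encrypt(1_000_003)
ct_b, noisy_b = encrypt(-999_983)
product = decrypt(mult(ct_a, ct_b))
```

The extended ciphertext \((d_0,d_1,d_2)\) decrypts exactly to `noisy_a * noisy_b` under \((1,s,s^2)\); the key-switching step adds the term \(d_2\cdot e'/P\) plus rounding, which in this toy setting changes the result by at most 1 because P exceeds Q.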

\(\underline{\texttt {RS}(\mathsf {ct})}\). For a level-\(\ell \) ciphertext \(\mathsf {ct}=\left( \mathsf {ct}^{(j)}=(c_0^{(j)},c_1^{(j)})\right) _{0\le j\le \ell }\in \prod _{j=0}^\ell {R}_{q_j}^{2}\), compute \(c_i'^{(j)}\leftarrow q_\ell ^{-1}\cdot \left( c_i^{(j)}-c_i^{(\ell )}\right) \pmod {q_{j}}\) for \(i=0,1\) and \(0\le j<\ell \). Output the ciphertext \(\mathsf {ct}'\leftarrow \left( \mathsf {ct}'^{(j)}=(c_0'^{(j)},c_1'^{(j)})\right) _{0\le j\le \ell -1}\in \prod _{j=0}^{\ell -1}{R}_{q_j}^2\).

For a ciphertext \(\mathsf {ct}\) encrypting a plaintext m, the rescaling algorithm returns an encryption of \(q_{\ell }^{-1}\cdot m\approx q^{-1}\cdot m\) at level \((\ell -1)\). The output ciphertext contains an additional error from the approximation of q to \(q_\ell \) and the rounding of the input ciphertext. The correctness of our scheme will be shown in Appendix A with noise analysis.
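On a single coefficient, the rescaling formula amounts to an exact division by \(q_\ell \) after subtracting the last residue. The sketch below assumes that \(c^{(\ell )}\) is taken as a centered representative, in which case the result is the nearest integer to \(c/q_\ell \); the moduli are toy values and `crt` is only a verification helper.

```python
from math import prod

def rescale(res, C):
    """Drop the last modulus q_l: residues of (c - [c]_{q_l}) / q_l modulo q_0..q_{l-1}."""
    q_l = C[-1]
    last = res[-1] % q_l
    if last > q_l // 2:              # centered representative, so the result is a rounding
        last -= q_l
    return [pow(q_l, -1, q) * (r - last) % q for r, q in zip(res[:-1], C[:-1])]

def crt(residues, moduli):
    """Verification helper: CRT reconstruction in [0, M)."""
    M = prod(moduli)
    return sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(residues, moduli)) % M

C = [2**31 - 1, 2**19 - 1, 2**17 - 1]      # toy basis {q_0, q_1, q_2}
c = 1_234_567_890_123_456                  # one coefficient of a level-2 ciphertext
res = [c % q for q in C]
scaled = crt(rescale(res, C), C[:-1])
expected = (2 * c + C[-1]) // (2 * C[-1])  # nearest integer to c / q_2
```

Only \(\ell \) subtractions and multiplications by the pre-computable constants \([q_\ell ^{-1}]_{q_j}\) are needed per coefficient.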

5 Software Implementation

In this section, we provide experimental results with parameter sets. In our implementation, every number is stored as an unsigned 64-bit integer, a standard type on modern computer systems. All homomorphic operations provided in our scheme are expressed as word-size operations on this standard variable type, so our HE library does not depend on any multi-precision numerical library. Our implementation was run on a machine with an Intel Core i5 processor running at 2.9 GHz in single-threaded mode, and its source code is publicly available at https://github.com/HanKyoohyung/FullRNS-HEAAN.

We adapt the discrete Fourier transformation to transform a polynomial represented by its coefficient vector into the vector of evaluations at primitive roots of unity modulo a prime. The modulus switching algorithms require the coefficient representation, but we can manipulate the NTT representation for arithmetic operations. Consequently, the complexity of homomorphic operations mainly depends on this transformation between two representations. We implemented the NTT conversion and its inverse based on the butterfly techniques of Cooley-Tukey [12] and Gentleman-Sande [19], respectively. We also optimized these algorithms using Montgomery modular multiplication and butterfly algorithms [26] and Barrett reduction algorithm [4].
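The relation between the two representations can be illustrated with a naive transform. The sketch below is not the butterfly implementation described above: it evaluates a polynomial at the odd powers of a primitive 2N-th root of unity \(\psi \) in \(O(N^2)\) operations, with the toy parameters \(q=257\) and \(N=8\), and checks that component-wise multiplication corresponds to multiplication modulo \(X^N+1\). The function names are hypothetical.

```python
def find_psi(q: int, N: int) -> int:
    """Element of order exactly 2N modulo the prime q (N a power of two, q = 1 mod 2N)."""
    assert (q - 1) % (2 * N) == 0
    for g in range(2, q):
        psi = pow(g, (q - 1) // (2 * N), q)
        if pow(psi, N, q) == q - 1:      # psi^N = -1 forces the order to be 2N
            return psi
    raise ValueError("no primitive 2N-th root of unity found")

def ntt(a, q, psi):
    """Evaluations of a(X) at psi^(2i+1), the roots of X^N + 1 modulo q."""
    N = len(a)
    return [sum(c * pow(psi, (2 * i + 1) * j, q) for j, c in enumerate(a)) % q
            for i in range(N)]

def intt(v, q, psi):
    """Inverse of ntt: recover the coefficient vector from the evaluations."""
    N = len(v)
    n_inv, psi_inv = pow(N, -1, q), pow(psi, -1, q)
    return [n_inv * sum(x * pow(psi_inv, (2 * i + 1) * j, q) for i, x in enumerate(v)) % q
            for j in range(N)]

def negacyclic_mul(a, b, q):
    """Schoolbook product in Z_q[X]/(X^N + 1), used as a reference."""
    N = len(a)
    out = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            k, sign = (i + j, 1) if i + j < N else (i + j - N, -1)
            out[k] = (out[k] + sign * x * y) % q
    return out

q, N = 257, 8
psi = find_psi(q, N)
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [8, 7, 6, 5, 4, 3, 2, 1]
pointwise = [x * y % q for x, y in zip(ntt(a, q, psi), ntt(b, q, psi))]
```

Applying `intt` to `pointwise` returns the same coefficients as `negacyclic_mul(a, b, q)`, which is why arithmetic can stay in the NTT representation between modulus switching steps.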

5.1 Parameter Sets and Benchmark

We propose parameter sets for multiplicative depths L from 5 to 15 in Table 1. It also shows experimental results for encryption, decryption, addition, scalar-multiplication, and multiplication (together with the rescaling operation) of the original implementation \(\texttt {HEAAN}\) and our RNS variant.

The smallest ciphertext modulus \(q_0\) should be larger than an encrypted plaintext for the correctness of the decryption circuit. We use \(\log {q_0} \approx 61\) and \(\log {q_i} \approx 55\) for \(i = 1,\dots ,L\). We present a list of primes in Appendix B. For a fair comparison, we choose a power-of-two integer \(Q_L\) of the same bit size as the implementation of the original \(\texttt {HEAAN}\). The coefficients of error polynomials are sampled from the discrete Gaussian distribution of standard deviation \(\sigma = 3.2\) and a secret key is chosen randomly from the set of signed binary polynomials with the Hamming weight \(h = 64\). We used the estimator of Albrecht et al. [2] to guarantee that the proposed parameter sets achieve at least 80-bit security level against the known attacks against the LWE problem.

Our implementation of the RNS variant improved the performance of basic operations by approximately ten times compared to the original \(\texttt {HEAAN}\) [10, 11]. For example, encryption, decryption, addition, and multiplication are sped up by factors of \( 9.1\), \(17.3\), \(7.4\), and \(8.3\), respectively, when evaluating a circuit of depth \(L=10\).

Table 1. Comparison of experimental results of \(\texttt {HEAAN}\) and our RNS variant

In Appendix A, we analyze the growth of errors and provide theoretical upper bounds on the growth during homomorphic operations. Figure 1 depicts the bit precisions of an encrypted plaintext during an evaluation of homomorphic multiplications for \(L=10\) with the parameter set in Table 1. We also provide an empirical result on the precision loss.

Fig. 1. Bit precision of encrypted plaintext

Our scheme exploits the approximate rounding operation, which introduces an additional error. We observed that the precision of an output value is reduced by about three bits compared to the original \(\texttt {HEAAN}\) scheme. However, this small gap is not a critical issue in most applications, where an approximate result is sufficient. In addition, we can easily increase the precision by setting a larger basis while still keeping the advantages in efficiency.

5.2 Homomorphic Evaluation of Statistical and Analytic Functions

The \(\texttt {HEAAN}\) scheme can evaluate an arbitrary analytic function by exploiting its polynomial approximation. Table 2 shows a parameter set and evaluation timings for the multiplicative inverse, the exponential function, and the sigmoid function \(\sigma (x)=(1+\exp (-x))^{-1}\). We adapt the approximation method for multiplicative inverse of [11, Algorithm 2] and evaluate the approximate polynomial of degree 15. For the exponential and sigmoid functions, we use the Taylor expansions up to degree 7. The output ciphertexts have at least 32 bits of precision. These computations can be performed over multiple slots simultaneously, yielding a better amortized performance per slot.
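The polynomial approximations can be examined in the clear before they are evaluated homomorphically. The following plaintext sketch assumes that the cited inverse algorithm is the product formula \(x^{-1}\approx \prod _{i<r}(1+(1-x)^{2^i})\), which has degree 15 for \(r=4\), and uses the standard Taylor polynomials of degree 7 at the origin; the function names are hypothetical and no encryption is involved.

```python
import math

def inverse_approx(x: float, r: int = 4) -> float:
    """Product formula for 1/x, valid for |1 - x| < 1; degree 2**r - 1 in (1 - x)."""
    y = 1.0 - x
    result = 1.0 + y
    for _ in range(r - 1):
        y = y * y
        result *= 1.0 + y
    return result

def exp_taylor(x: float, degree: int = 7) -> float:
    """Taylor polynomial of exp at 0."""
    return sum(x ** k / math.factorial(k) for k in range(degree + 1))

def sigmoid_taylor(x: float) -> float:
    """Degree-7 Taylor polynomial of 1 / (1 + exp(-x)) at 0."""
    return 0.5 + x / 4 - x ** 3 / 48 + x ** 5 / 480 - 17 * x ** 7 / 80640

inv_err = abs(inverse_approx(0.75) - 1 / 0.75)
exp_err = abs(exp_taylor(0.5) - math.exp(0.5))
sig_err = abs(sigmoid_taylor(0.5) - 1 / (1 + math.exp(-0.5)))
```

The approximation errors shrink quickly near the expansion point and grow with the distance from it, so the input range has to be fixed before choosing the degree.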

Table 2. Homomorphic evaluation of analytic functions

We also evaluated mean and variance functions that are the most common quantities in statistical analysis. There have been a few attempts to evaluate these measurements on an HE system. For example, Lauter et al. [30] took about six seconds to obtain the square sum of 100 integers without division by the number of elements.

For computation of the mean and variance of n numbers, we encrypt all the numbers in a single ciphertext and compute their summation by applying the partial sum algorithm [9, Algorithm 2]. It repeatedly rotates the encrypted plaintext vector and adds the result to the current ciphertext. The resulting ciphertext encrypts the sum of all entries in every plaintext slot. The following example describes the partial sum algorithm when \(n=4\).

$$\begin{aligned} (m_1,m_2,m_3,m_4)&\mapsto ~(m_1,m_2,m_3,m_4)+(m_3,m_4,m_1,m_2)\\&= \ (m_1+m_3,m_2+m_4,m_1+m_3,m_2+m_4)\\&\mapsto ~\left( \sum _{i=1}^4m_i,\sum _{i=1}^4m_i,\sum _{i=1}^4m_i,\sum _{i=1}^4m_i\right) \end{aligned}$$

Contrary to previous work, the approximate HE scheme can perform a division by n by multiplying the constant \(\lfloor {q/n}\rceil \) and rescaling by one level. In the case of the variance function, we first square an input ciphertext and apply the same procedure to get a ciphertext encrypting the mean square in its plaintext slots. Then the variance of input data can be computed by subtracting the square of the encrypted mean value. We summarize the parameter and experimental results for homomorphic evaluation of statistical functions on \(n=2^{13}\) numbers in Table 3.
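The rotate-and-add pattern and the derivation of the variance can be sketched on plaintext slot vectors. The code below works on ordinary Python lists and performs the division by n directly; in the encrypted setting the rotation would be a homomorphic slot rotation and the division would be the constant multiplication followed by rescaling described above. The function names are hypothetical.

```python
def rotate(v, r):
    """Cyclic left rotation of the slot vector by r positions."""
    return v[r:] + v[:r]

def total_sum(v):
    """After log2(n) rotate-and-add rounds every slot holds the sum of all entries."""
    n = len(v)
    assert n & (n - 1) == 0, "n must be a power of two"
    r = n // 2
    while r >= 1:
        v = [x + y for x, y in zip(v, rotate(v, r))]
        r //= 2
    return v

def mean_and_variance(v):
    n = len(v)
    mean = [x / n for x in total_sum(v)]
    mean_sq = [x / n for x in total_sum([x * x for x in v])]
    variance = [a - b * b for a, b in zip(mean_sq, mean)]   # E[x^2] - E[x]^2
    return mean, variance

data = [1.0, 2.0, 3.0, 6.0]
sums = total_sum(data)
mean, variance = mean_and_variance(data)
```

For \(n=2^{13}\) inputs this pattern needs 13 rotations and additions per summation.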

5.3 Homomorphic Training of Logistic Regression Model

Security and privacy issues arise in machine learning because training a model requires a large database of sensitive information, while the prediction phase is based on private information of individuals. HE is a promising solution to address these issues by aggregating encrypted personal data and building a model without information leakage. ML Confidential [23] and CryptoNets [22] are notable examples of leveraging HE for secure outsourcing of machine learning applications.

Table 3. Homomorphic evaluation of statistical functions

In particular, \(\texttt {HEAAN}\) [9, 11] is a strong candidate for machine learning tasks since most training and prediction algorithms involve arithmetic over the real numbers. For example, the iDASH Security and Privacy Competition in 2017 announced a task which aims to build a logistic regression model from homomorphically encrypted genomic data. To be precise, for a given dataset consisting of n samples \(({\varvec{x}}_i,y_i)\in {\mathbb R}^d\times \{\pm 1\}\) of d features and a binary class, the goal was to find a weight vector \(\varvec{\beta }\in {\mathbb R}^{d+1}\) which minimizes the loss function

$$\begin{aligned} J(\varvec{\beta })=\sum _{i=1}^n \log (1+\exp (-\varvec{\beta }^T {\varvec{z}}_i)) \end{aligned}$$

where \({\varvec{z}}_i=y_i\cdot (1,{\varvec{x}}_i)\) for \(1\le i\le n\). The best solution [27] adapted the \(\texttt {HEAAN}\) library [10] to evaluate Nesterov’s accelerated gradient descent method [31].
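The loss function and its gradient can be written down in the clear as follows. This plaintext sketch uses a tiny made-up dataset, the exact sigmoid, and plain gradient descent with a hypothetical step size of 0.1; it is not Nesterov's accelerated method and involves no encryption, so it only illustrates the quantities that the homomorphic training has to compute.

```python
import math

def loss(beta, Z):
    """J(beta) = sum_i log(1 + exp(-beta^T z_i))."""
    return sum(math.log1p(math.exp(-sum(b * z for b, z in zip(beta, zi)))) for zi in Z)

def gradient(beta, Z):
    """grad J(beta) = -sum_i sigmoid(-beta^T z_i) * z_i."""
    grad = [0.0] * len(beta)
    for zi in Z:
        t = sum(b * z for b, z in zip(beta, zi))
        weight = 1.0 / (1.0 + math.exp(t))      # sigmoid(-t)
        for k, z in enumerate(zi):
            grad[k] -= weight * z
    return grad

# Made-up samples (x_i, y_i) with d = 1 feature; z_i = y_i * (1, x_i).
X = [[0.5], [-1.0], [1.5], [-0.5]]
y = [1, -1, 1, 1]
Z = [[yi * v for v in [1.0] + xi] for xi, yi in zip(X, y)]

beta, step = [0.0, 0.0], 0.1
history = [loss(beta, Z)]
for _ in range(50):
    beta = [b - step * g for b, g in zip(beta, gradient(beta, Z))]
    history.append(loss(beta, Z))
```

Each update needs one sigmoid evaluation per sample, which is the non-polynomial step that a homomorphic implementation has to replace by a polynomial approximation.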

We implemented the same algorithm based on our RNS variant to show its versatility and efficiency. For a fair comparison, we adopt the previous encoding and evaluation strategies: the whole database is encrypted in a single ciphertext and the sigmoid function of the gradient descent algorithm is replaced by its least squares approximation. Our implementation took about 1.8 min to train a model based on Low Birth Weight Study (lbw) [29] and Umaru Impact Study (uis) [33] datasets using a single core processor, compared to 3.5 min of the previous best solution [27] using four cores, while maintaining the accuracy and area under the ROC curve (AUC) of the resulting classifier (Table 4).

Table 4. Homomorphic training of logistic regression model

6 Conclusions and Future Work

In this article, we present a variant of \(\texttt {HEAAN}\) based on the RNS representation of polynomials. In the previous implementation, ciphertext moduli were selected as powers of a fixed base for the correctness of the rescaling process. We resolve the issue by taking an RNS basis consisting of primes close to the base integer. In addition, we propose variants of modulus switching algorithms which can be computed without any RNS conversion or multi-precision arithmetic.

One disadvantage of our method is that it makes a trade-off between performance and accuracy. Because of the approximation error of an RNS basis, our scheme may have less accuracy compared to the original scheme when using the same parameter. Recently, \(\texttt {SEAL}\) version 3.0 [32] has been released. It supports a full RNS variant of \(\texttt {HEAAN}\), which is slightly different from our scheme. The main difference is that a ciphertext of \(\texttt {SEAL}\) contains a scaling factor which can be changed during computation. In other words, it continuously tracks the computation and updates the scaling factor information. This method does not have the above accuracy issue, but it is less intuitive and causes new problems related to the management of scaling factors. For example, the addition (resp. multiplication) of ciphertexts of different scaling factors (resp. levels) requires pre (resp. post) processing. It would be an interesting future work to combine the two methods to design a new scheme with enhanced functionality and flexibility.