1 Introduction

First, we recall the minimization problem for convex functions:

$$ \text{Find }\bar{x}\in \operatorname{arg}\min\bigl\{ f(x): x\in H\bigr\} , $$
(MP1)

where H is a real Hilbert space and \(f: H\rightarrow(-\infty, \infty]\) is a proper, lower semicontinuous, and convex function. This is a classical convex minimization problem with many applications. To study this problem, Martinet [11] introduced the proximal point algorithm

$$ x_{n+1}=\operatorname{arg}\min_{y\in H} \biggl\{ f(y)+ \frac{1}{2\beta_{n}} \Vert y-x_{n} \Vert ^{2} \biggr\} ,\quad n\in\mathbb{N}, $$
(PPA)

and showed that \(\{x_{n}\}_{n\in\mathbb{N}}\) converges weakly to a minimizer of f under suitable conditions. However, this algorithm applies only to convex problems, since it is based on the monotonicity of the subdifferential operators of convex functions. It is therefore important to consider the relation between nonconvex problems and proximal point algorithms.
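
For illustration (an elementary one-dimensional example, not from [11]), for \(f(x)=\vert x\vert \) on \(H=\mathbb{R}\) the proximal step in (PPA) is the soft-thresholding operation \(x_{n+1}=\operatorname{sign}(x_{n})\max\{\vert x_{n}\vert -\beta_{n},0\}\), so the whole algorithm fits in a few lines of Python; the starting point and step sizes below are arbitrary choices.

```python
# Proximal point algorithm (PPA) for f(x) = |x| on H = R (illustrative).
# The proximal step argmin_y { |y| + (y - x)**2 / (2*beta) } is soft-thresholding.

def soft_threshold(x: float, beta: float) -> float:
    """Closed-form proximal map of |.| with parameter beta."""
    if x > beta:
        return x - beta
    if x < -beta:
        return x + beta
    return 0.0

x = 5.0                      # arbitrary starting point x_1
betas = [1.0] * 20           # illustrative step sizes beta_n
for beta in betas:
    x = soft_threshold(x, beta)
print(x)                     # reaches the minimizer 0 of |x|
```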

The following is a well-known nonconvex problem, the DC (difference of convex functions) program:

$$ \text{Find }\bar{x}\in \operatorname{arg}\min_{x\in\mathbb{R}^{n}}\bigl\{ f(x)=g(x)-h(x) \bigr\} , $$
(DCP)

where \(g,h:\mathbb{R}^{n}\rightarrow \mathbb{R}\) are proper, lower semicontinuous, and convex functions. Here, the function f is called a DC function, and the functions g and h are called DC components of f. In the DC program, the convention \((+\infty)-(+\infty)=+\infty\) is adopted to avoid the ambiguous expression \((+\infty)-(+\infty)\), which is of no interest. It is well known that a necessary condition for \(x\in\operatorname{dom}(f):=\{x\in\mathbb{R}^{n}: f(x)<\infty\}\) to be a local minimizer of f is \(\partial h(x)\subseteq \partial g(x)\), where \(\partial g(x)\) and \(\partial h(x)\) are the subdifferentials of g and h, respectively (see Definition 2.4). However, this condition is difficult to verify in practice, so many researchers focus their attention on finding points x with \(\partial h(x)\cap\partial g(x) \neq\emptyset\); such a point x is called a critical point of f [8]. For instance, for \(f(x)=\vert x\vert -x^{2}\) with DC components \(g(x)=\vert x\vert \) and \(h(x)=x^{2}\), the points \(x=0\) and \(x=\pm\frac{1}{2}\) are critical points of f, but only \(x=0\) is a local minimizer.

It is worth mentioning the richness of the class of DC functions, which is a vector space containing the class of lower-\(\mathcal{C}^{2}\) functions. In particular, \(\mathcal{DC}(\mathbb{R}^{n})\) contains the space \(\mathcal{C}^{1,1}\) of functions whose gradient is locally Lipschitz continuous. Further, \(\mathcal{DC}(\mathbb{R}^{n})\) is closed under the operations usually considered in optimization: for example, a linear combination, a finite supremum, or the product of two DC functions is again DC. It is also known that the set of DC functions defined on a compact convex subset of \(\mathbb{R}^{n}\) is dense in the set of continuous functions on this set.

Interest in the theory of DC functions has also increased considerably in recent years. Some interesting optimality conditions and duality theorems related to the DC program have been given (see, for example, [6, 7, 14]), and several algorithms for the DC program have been proposed to analyze and solve a variety of highly structured and practical problems (see, for example, [13]).

In 2003, Sun, Sampaio, and Candido [16] gave the following algorithm to study problem (DCP).

Algorithm 1.1

(Proximal point algorithm for (DCP) [16])

Let \(\{\beta_{n}\} _{n\in\mathbb{N}}\) be a sequence in \((0,\infty)\), and let \(g,h:\mathbb{R}^{n}\rightarrow \mathbb{R}\) be proper, lower semicontinuous, and convex functions. Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by

$$ \textstyle\begin{cases} x_{1}\in\mathbb{R}^{n}\quad \text{is chosen arbitrarily}, \\ \text{compute }w_{n}\in\partial h(x_{n})\text{ and set }y_{n}=x_{n}+ \beta_{n}w_{n}, \\ x_{n+1}:=(I+\beta_{n}\partial g)^{-1}(y_{n}), \quad n\in\mathbb{N}, \\ \text{stopping criterion: } x_{n+1}=x_{n}. \end{cases} $$

In 2016, Souza, Oliveira, and Soubeyran [15] gave the following algorithm to study the DC program.

Algorithm 1.2

(Proximal linearized algorithm for (DCP) [15])

Let \(\{\beta_{n}\} _{n\in\mathbb{N}}\) be a sequence in \((0,\infty)\), and let \(g,h:\mathbb{R}^{n}\rightarrow \mathbb{R}\) be proper, lower semicontinuous, and convex functions. Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by

$$ \textstyle\begin{cases} x_{1}\in\mathbb{R}^{n}\quad \text{is chosen arbitrarily}, \\ \text{compute }w_{n}\in\partial h(x_{n}), \\ x_{n+1}:=\operatorname{arg}\min_{u\in\mathbb{R}^{n}}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-x_{n} \Vert ^{2}- \langle w_{n},u-x_{n}\rangle\},\quad n\in\mathbb{N}, \\ \text{stopping criterion: } x_{n+1}=x_{n}. \end{cases} $$

In fact, if h is differentiable, then Algorithm 1.2 reduces to the following:

$$ \textstyle\begin{cases} x_{1}\in\mathbb{R}^{n}\quad \text{is chosen arbitrarily}, \\ x_{n+1}:=\operatorname{arg}\min_{u\in\mathbb{R}^{n}}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-x_{n} \Vert ^{2}- \langle\nabla h(x_{n}),u-x_{n}\rangle\},\quad n\in\mathbb{N}, \\ \text{stopping criterion: } x_{n+1}=x_{n}. \end{cases} $$
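
For illustration (the quadratic g, the linear h, and all data below are assumptions, not taken from [15]), if \(g(u)=\frac{1}{2}\langle Qu,u\rangle\) with Q symmetric positive definite and \(h(u)=\langle c,u\rangle\), then the subproblem defining \(x_{n+1}\) has the closed-form solution given by its first-order optimality condition \((Q+\beta_{n}^{-1}I)u=\beta_{n}^{-1}x_{n}+\nabla h(x_{n})\), and the iteration becomes a short loop:

```python
import numpy as np

# Proximal linearized algorithm (Algorithm 1.2) sketched for the illustrative
# DC components g(u) = 0.5 * u @ Q @ u and h(u) = c @ u (so grad h(x) = c).
# Each subproblem reduces to the linear system (Q + I/beta) u = x/beta + grad h(x).

Q = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite
c = np.array([5.0, 4.0])
I = np.eye(2)

def grad_h(x):
    return c

x = np.zeros(2)                          # x_1 chosen arbitrarily
beta = 1.0                               # constant step size beta_n
for _ in range(50):
    x_next = np.linalg.solve(Q + I / beta, x / beta + grad_h(x))
    if np.allclose(x_next, x):           # stopping criterion x_{n+1} = x_n
        break
    x = x_next

print(x, np.linalg.solve(Q, c))          # both approximate the critical point
```

Since the critical-point condition \(\nabla h(x)\in\partial g(x)\) reads \(Qx=c\) here, the iterates approach \(Q^{-1}c\).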

Further, Souza, Oliveira, and Soubeyran [15] gave the following convergence theorem for problem (DCP).

Theorem 1.1

([15, Theorem 3])

Let \(g,h:\mathbb{R}^{n}\rightarrow \mathbb{R}\cup\{+\infty\}\) be proper, lower semicontinuous, and convex functions such that \(g-h\) is bounded from below. Suppose that g is ρ-strongly convex, h is differentiable, and \(\nabla h\) is L-Lipschitz continuous. Let \(\{\beta_{n}\}_{n\in \mathbb{N}}\) be a bounded sequence with \(\liminf_{n\rightarrow \infty}\beta _{n}>0\). Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 1.2. If \(\rho>2L\), then \(\{x_{n}\}_{n\in\mathbb {N}}\) converges linearly to a critical point of problem (DCP).

In this paper, we want to study the split DC program:

$$ \text{Find } \bar{x}\in H_{1} \text{ such that }\bar{x}\in \operatorname{arg}\min _{x\in H_{1}}f_{1}(x) \text{ and } A\bar{x}\in \operatorname{arg}\min _{y\in H _{2}}f_{2}(y), $$
(SDCP)

where \(H_{1}\) and \(H_{2}\) are real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) is a nonzero bounded linear mapping with adjoint operator \(A^{*}\), \(g_{1},h_{1}:H_{1}\rightarrow \mathbb{R}\) and \(g_{2},h_{2}:H_{2}\rightarrow \mathbb{R}\) are proper, lower semicontinuous, and convex functions, \(f_{1}(x)=g_{1}(x)-h_{1}(x)\) for all \(x\in H_{1}\), and \(f_{2}(y)=g_{2}(y)-h_{2}(y)\) for all \(y\in H_{2}\).

Clearly, (SDCP) is a generalization of problem (DCP). Indeed, if \(H_{1}=H_{2}=\mathbb{R}^{n}\), \(A:\mathbb{R}^{n}\rightarrow \mathbb{R}^{n}\) is the identity mapping, \(g_{1}=g_{2}\), and \(h_{1}=h_{2}\), then problem (SDCP) reduces to problem (DCP).

If \(h_{1}(x)=0\) and \(h_{2}(y)=0\) for all \(x\in H_{1}\) and \(y\in H_{2}\), then (SDCP) reduces to the split minimization problem (SMP) for convex functions:

$$\begin{aligned}& \mbox{Find }\bar{x}\in H_{1}\mbox{ such that }g_{1}( \bar{x})= \min_{u\in H_{1}}g_{1}(u)\mbox{ and }g_{2}(A\bar{x})=\min_{v\in H_{2}}g _{2}(v), \end{aligned}$$
(SMP)

where \(H_{1}\) and \(H_{2}\) are real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) is a bounded linear mapping with adjoint \(A^{*}\), and \(g_{1}:H_{1}\rightarrow \mathbb{R}\) and \(g_{2}:H_{2}\rightarrow \mathbb{R}\) are proper, lower semicontinuous, and convex functions. See, for example, [4] and the references therein.

If \(H_{1}=H_{2}=H\) and \(A:H\rightarrow H\) is the identity mapping, then problem (SMP) reduces to the common minimization problem for convex functions:

$$\begin{aligned}& \mbox{Find }\bar{x}\in H\mbox{ such that }g_{1}(\bar{x})= \min _{u\in H}g_{1}(u)\mbox{ and }g_{2}(\bar{x})= \min_{v\in H}g_{2}(v), \end{aligned}$$
(CMP)

where H is a real Hilbert space, \(g_{1},g_{2}:H\rightarrow \mathbb {R}\) are proper, lower semicontinuous, and convex functions. Further, if the solution set of problem (CMP) is nonempty, then problem (CMP) is equivalent to the following problem:

$$\begin{aligned}& \mbox{Find }\bar{x}\in H\mbox{ such that }g_{1}(\bar{x})+g _{2}(\bar{x})=\min_{u\in H}g_{1}(u)+g_{2}(u), \end{aligned}$$
(MP2)

where H is a real Hilbert space and \(g_{1},g_{2}:H\rightarrow \mathbb {R}\) are proper, lower semicontinuous, and convex functions. This problem is well known and has many important applications, including multiresolution sparse regularization, Fourier regularization, hard-constrained inconsistent feasibility, and alternating projection signal synthesis problems. See, for example, [5, 9] and the references therein.

On the other hand, Moudafi [12] introduced the split variational inclusion problem, which is a generalization of problem (SMP):

$$\begin{aligned}& \mbox{Find }\bar{x}\in H_{1}\mbox{ such that }0_{H_{1}}\in B_{1}( \bar{x})\mbox{ and }0_{H_{2}}\in B_{2}(A \bar{x}), \end{aligned}$$
(SVIP)

where \(H_{1}\) and \(H_{2}\) are real Hilbert spaces, \(B_{1}:H_{1}\multimap H _{1}\) and \(B_{2}:H_{2}\multimap H_{2}\) are set-valued maximal monotone mappings, \(A:H_{1}\rightarrow H_{2}\) is a bounded linear operator, and \(A^{*}\) is the adjoint of A. Here, \(0_{H_{1}}\) and \(0_{H_{2}}\) are the zero elements of the real Hilbert spaces \(H_{1}\) and \(H_{2}\), respectively. To study problem (SVIP), Byrne et al. [3] gave the following algorithm and related convergence theorem in infinite dimensional Hilbert spaces.

Theorem 1.2

([3])

Let \(H_{1}\) and \(H_{2}\) be real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) be a nonzero bounded linear operator, and \(A^{*}\) denote the adjoint operator of A. Let \(B_{1}:H_{1}\multimap H_{1}\) and \(B_{2}:H_{2}\multimap H_{2}\) be set-valued maximal monotone mappings, \(\beta>0\), and \(\gamma\in(0,\frac {2}{\Vert A\Vert ^{2}})\). Let Ω be the solution set of (SVIP), and suppose that \(\varOmega\neq\emptyset\). Let \(\{x_{n}\} _{n\in\mathbb{N}}\) be defined by

$$ x_{n+1}:=J_{\beta}^{B_{1}}\bigl[x_{n}-\gamma A^{*}\bigl(I-J_{\beta}^{B_{2}}\bigr)Ax_{n} \bigr],\quad n\in\mathbb{N}. $$

Then \(\{x_{n}\}\) converges weakly to an element \(\bar{x}\in\varOmega\).

If \(B_{1}=\partial g_{1}\) and \(B_{2}=\partial g_{2}\) (the subdifferentials of \(g_{1}\) and \(g_{2}\)), then the algorithm in Theorem 1.2 reduces to the following algorithm:

$$ \textstyle\begin{cases} y_{n}=\operatorname{arg}\min_{z\in H_{2}}\{g_{2}(z)+\frac{1}{2\beta} \Vert z-Ax_{n} \Vert ^{2} \}, \\ z_{n}=x_{n}-\gamma A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}=\operatorname{arg}\min_{y\in H_{1}}\{g_{1}(y)+\frac{1}{2\beta} \Vert y-z_{n} \Vert ^{2} \},\quad n\in\mathbb{N}. \end{cases} $$

In this paper, motivated by the above works on DC programs and related problems, we want to study problem (SDCP) by using the split proximal linearized algorithm:

$$ \textstyle\begin{cases} x_{1}\in H_{1}\quad \text{is chosen arbitrarily}, \\ y_{n}:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta_{n}} \Vert v-Ax _{n} \Vert ^{2}-\langle\nabla h_{2}(Ax_{n}),v-Ax_{n}\rangle\}, \\ z_{n}:=x_{n}-r_{n} A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta_{n}} \Vert u-z _{n} \Vert ^{2}-\langle\nabla h_{1}(z_{n}),u-z_{n}\rangle\},\quad n\in \mathbb{N}, \end{cases} $$

where \(H_{1}\) and \(H_{2}\) are real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) is a bounded linear mapping with adjoint \(A^{*}\), \(g_{1},h_{1}:H_{1}\rightarrow \mathbb{R}\) and \(g_{2},h_{2}:H_{2}\rightarrow \mathbb{R}\) are proper, lower semicontinuous, and convex functions, \(g_{1}\) and \(g_{2}\) are strongly convex, \(h_{1}\) and \(h_{2}\) are Fréchet differentiable, \(\nabla h_{1}\) and \(\nabla h_{2}\) are L-Lipschitz continuous, \(f_{1}(x)=g_{1}(x)-h_{1}(x)\) for all \(x\in H_{1}\), and \(f_{2}(y)=g _{2}(y)-h_{2}(y)\) for all \(y\in H_{2}\). Further, linear convergence theorems for the proposed algorithms are established under suitable conditions. Finally, we give numerical results for the proposed convergence theorems.
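
To make the structure of this iteration concrete, the following is a minimal Python sketch of the scheme; the subproblem solvers prox_step1 and prox_step2 (assumed to return the minimizers of the two linearized subproblems), the parameter sequences, and all names are illustrative.

```python
import numpy as np

def split_proximal_linearized(x1, A, prox_step1, prox_step2, betas, rs):
    """Sketch of the split proximal linearized algorithm.

    prox_step2(p, beta) should return
        argmin_v { g2(v) + ||v - p||^2/(2*beta) - <grad h2(p), v - p> },
    and prox_step1(p, beta) the analogous minimizer for g1 and h1;
    betas and rs supply the parameters beta_n and r_n.
    """
    x = np.asarray(x1, dtype=float)
    for beta, r in zip(betas, rs):
        y = prox_step2(A @ x, beta)        # linearized proximal step in H2 at Ax_n
        z = x - r * A.T @ (A @ x - y)      # correction step back in H1
        x = prox_step1(z, beta)            # linearized proximal step in H1 at z_n
    return x
```

When \(g_{1}\) and \(g_{2}\) are strongly convex quadratics, the two subproblems have closed-form solutions; concrete instances of this kind are sketched in Sect. 4.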

2 Preliminaries

Let H be a (real) Hilbert space with the inner product \(\langle \cdot,\cdot\rangle\) and the norm \(\Vert \cdot\Vert \). We denote the strong and weak convergence of \(\{x_{n}\}_{n\in\mathbb{N}}\) to \(x\in H\) by \(x_{n}\rightarrow x\) and \(x_{n}\rightharpoonup x\), respectively. For each \(x,y,u,v\in H\) and \(\lambda\in\mathbb{R}\), we have

$$\begin{aligned}& \Vert x+y \Vert ^{2}= \Vert x \Vert ^{2}+2\langle x,y\rangle+ \Vert y \Vert ^{2}, \end{aligned}$$
(2.1)
$$\begin{aligned}& \bigl\Vert \lambda x+(1-\lambda)y \bigr\Vert ^{2}= \lambda \Vert x \Vert ^{2}+(1-\lambda) \Vert y \Vert ^{2}- \lambda(1-\lambda) \Vert x-y \Vert ^{2}, \end{aligned}$$
(2.2)
$$\begin{aligned}& 2\langle x-y,u-v\rangle= \Vert x-v \Vert ^{2}+ \Vert y-u \Vert ^{2}- \Vert x-u \Vert ^{2}- \Vert y-v \Vert ^{2}. \end{aligned}$$
(2.3)

Definition 2.1

Let H be a real Hilbert space, \(B:H\rightarrow H\) be a mapping, and \(\rho>0\). Then:

  1. (i)

    B is monotone if \(\langle x-y,Bx-By\rangle\geq0\) for all \(x,y\in H\).

  2. (ii)

    B is ρ-strongly monotone if \(\langle x-y,Bx-By \rangle\geq\rho\Vert x-y\Vert ^{2}\) for all \(x,y\in H\).

Definition 2.2

Let H be a real Hilbert space and \(B:H\multimap H\) be a set-valued mapping with domain \(\mathcal{D}(B):=\{x\in H:B(x)\neq\emptyset\}\). Then:

  1. (i)

    B is called monotone if \(\langle u-v,x-y\rangle\geq0\) for any \(u\in B(x)\) and \(v\in B(y)\).

  2. (ii)

    B is maximal monotone if its graph \(\{(x,y):x\in \mathcal{D}(B), y\in B(x)\}\) is not properly contained in the graph of any other monotone mapping.

  3. (iii)

    B is ρ-strongly monotone if \(\langle x-y,u-v \rangle\geq\rho\Vert x-y\Vert ^{2}\) for all \(x,y\in H\), \(u\in B(x)\), and \(v\in B(y)\).

Definition 2.3

Let H be a real Hilbert space and \(f:H\rightarrow\mathbb{R}\) be a function. Then:

  1. (i)

    f is proper if \(\operatorname{dom}(f):=\{x\in H: f(x)<\infty\}\neq \emptyset\).

  2. (ii)

    f is lower semicontinuous if \(\{x\in H:f(x)\leq r\}\) is closed for each \(r\in\mathbb{R}\).

  3. (iii)

    f is convex if \(f(tx+(1-t)y)\leq t f(x)+(1-t)f(y)\) for every \(x,y\in H\) and \(t\in[0,1]\).

  4. (iv)

    f is ρ-strongly convex (\(\rho>0\)) if

    $$ f\bigl(tx+(1-t)y\bigr)+\frac{\rho}{2}\cdot t(1-t) \Vert x-y \Vert ^{2}\leq tf(x)+(1-t)f(y) $$

    for all \(x,y\in H\) and \(t\in(0,1)\).

  5. (v)

    f is Gâteaux differentiable at \(x\in H\) if there is \(\nabla f(x)\in H\) such that

    $$ \lim_{t\rightarrow0}\frac{f(x+ty)-f(x)}{t}=\bigl\langle y, \nabla f(x) \bigr\rangle $$

    for each \(y\in H\).

  6. (vi)

    f is Fréchet differentiable at x if there is \(\nabla f(x)\) such that

    $$ \lim_{y\rightarrow0} \frac{f(x+y)-f(x)-\langle\nabla f(x),y\rangle}{ \Vert y \Vert }=0. $$

Remark 2.1

Let H be a real Hilbert space. Then \(f(x):=\Vert x\Vert ^{2}\) is a 2-strongly convex function. Moreover, \(g:H\rightarrow \mathbb {R}\) is ρ-strongly convex if and only if \(g-\frac{\rho}{2}\Vert \cdot \Vert ^{2}\) is convex [1, Proposition 10.6].

Example 2.1

Let \(g(x):=\frac{1}{2}\langle Qx,x\rangle-\langle x,b\rangle\), where \(Q\in\mathbb{R}^{n\times n}\) is a real symmetric positive definite matrix and \(b\in\mathbb{R}^{n}\). Then g is a strongly convex function.
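
As a quick numerical check (with illustrative data), the largest admissible ρ for this g is the smallest eigenvalue of Q: by Remark 2.1, g is ρ-strongly convex if and only if \(g-\frac{\rho}{2}\Vert \cdot\Vert ^{2}\) is convex, and the latter function has Hessian \(Q-\rho I\).

```python
import numpy as np

# g(x) = 0.5 * x @ Q @ x - b @ x is rho-strongly convex exactly when
# Q - rho*I is positive semidefinite, i.e. rho <= lambda_min(Q).
Q = np.array([[4.0, 1.0], [1.0, 3.0]])                     # illustrative choice
rho = np.linalg.eigvalsh(Q).min()                          # largest admissible rho
print(rho, np.linalg.eigvalsh(Q - rho * np.eye(2)).min() >= -1e-12)
```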

Definition 2.4

Let \(f:H\rightarrow(-\infty,\infty]\) be a proper lower semicontinuous and convex function. Then the subdifferential ∂f of f is defined by

$$ \partial f(x):=\bigl\{ x^{*}\in H: f(x)+\bigl\langle y-x,x^{*} \bigr\rangle \leq f(y)\text{ for each }y\in H\bigr\} $$

for each \(x\in H\).

Lemma 2.1

Let \(f:H\rightarrow(-\infty,\infty]\) be a proper lower semicontinuous and convex function. Then the following are satisfied:

  1. (i)

    ∂f is a set-valued maximal monotone mapping.

  2. (ii)

    f is Gâteaux differentiable at \(x\in \operatorname{int}(\operatorname{dom}(f))\) if and only if \(\partial f(x)\) consists of a single element, that is, \(\partial f(x)=\{\nabla f(x)\}\) [2, Proposition 1.1.10].

  3. (iii)

    Suppose that f is Fréchet differentiable. Then f is convex if and only if \(\nabla f\) is a monotone mapping.

Lemma 2.2

([1, Example 22.3(iv)])

Let \(\rho>0\), let H be a real Hilbert space, and let \(f:H\rightarrow \mathbb {R}\) be a proper, lower semicontinuous, and convex function. If f is ρ-strongly convex, then ∂f is ρ-strongly monotone.

Lemma 2.3

([1, Proposition 16.26])

Let H be a real Hilbert space and \(f:H\rightarrow (-\infty,\infty]\) be a proper, lower semicontinuous, and convex function. If \(\{u_{n}\}_{n\in \mathbb{N}}\) and \(\{x_{n}\}_{n\in\mathbb{N}}\) are sequences in H with \(u_{n}\in\partial f(x_{n})\) for all \(n\in\mathbb{N}\), \(x_{n}\rightharpoonup x\), and \(u_{n}\rightarrow u\), then \(u\in\partial f(x)\).

Lemma 2.4

([17, p. 114])

Let \(\{a_{n}\}_{n\in \mathbb{N}}\) and \(\{b_{n}\}_{n\in\mathbb{N}}\) be sequences of nonnegative real numbers. If \(\sum_{n=1}^{\infty}a_{n}=\infty\) and \(\sum_{n=1}^{\infty}a_{n} b_{n}<\infty\), then \(\liminf_{n\rightarrow \infty }b_{n}=0\).

Lemma 2.5

([10])

Let H be a real Hilbert space, \(B:H\multimap H\) be a set-valued maximal monotone mapping, \(\beta >0\), and \(J_{\beta}^{B}\) be defined by \(J_{\beta}^{B}(x):=(I+\beta B)^{-1}(x)\) for each \(x\in H\). Then \(J_{\beta}^{B}\) is a single-valued mapping.
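
For example (an illustrative computation with assumed data), if \(g(x)=\frac{1}{2}\langle Qx,x\rangle-\langle x,b\rangle\) as in Example 2.1, then \(\partial g(x)=\{Qx-b\}\) and \(J_{\beta}^{\partial g}(x)\) is the unique solution u of \((I+\beta Q)u=x+\beta b\):

```python
import numpy as np

# Resolvent J_beta of B = dg for g(x) = 0.5 * x @ Q @ x - b @ x:
# (I + beta*B)(u) = u + beta*(Q u - b) = x  <=>  (I + beta*Q) u = x + beta*b.
Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 4.0])

def resolvent(x, beta):
    return np.linalg.solve(np.eye(2) + beta * Q, x + beta * b)

print(resolvent(np.array([10.0, -7.0]), 0.5))   # single-valued, as Lemma 2.5 states
```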

3 Split proximal linearized algorithm

Throughout this section, we use the following notations and assumptions. Let \(\rho\geq L>0\). Let \(H_{1}\) and \(H_{2}\) be finite dimensional real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) be a nonzero bounded linear mapping, \(A^{*}\) be the adjoint of A, \(g_{1},h_{1}:H_{1}\rightarrow \mathbb{R}\) and \(g_{2},h_{2}:H_{2}\rightarrow \mathbb{R}\) be proper, lower semicontinuous, and convex functions, \(f_{1}(x)=g_{1}(x)-h_{1}(x)\) for all \(x\in H_{1}\), and \(f_{2}(y)=g_{2}(y)-h_{2}(y)\) for all \(y\in H_{2}\). Further, we assume that \(f_{1}\) and \(f_{2}\) are bounded from below, \(h_{1}\) and \(h_{2}\) are Fréchet differentiable, \(\nabla h_{1}\) and \(\nabla h _{2}\) are L-Lipschitz continuous, and \(g_{1}\) and \(g_{2}\) are ρ-strongly convex. Let \(\beta\in(0,\infty)\), and let \(\{\beta_{n}\}_{n\in\mathbb{N}}\) be a sequence in \([a,b]\subseteq(0, \infty)\). Let \(r\in(0,\frac{1}{\Vert A\Vert ^{2}})\) and \(\{r_{n}\}_{n\in \mathbb{N}}\) be a sequence in \((0,\frac{1}{\Vert A\Vert ^{2}})\). Let \(\varOmega_{\text{SDCP}}\) be defined by

$$\varOmega_{\text{SDCP}}:=\bigl\{ x\in H_{1}: \nabla h_{1}(x)\in\partial g _{1}(x), \nabla h_{2}(Ax) \in\partial g_{2}(Ax)\bigr\} , $$

and assume that \(\varOmega_{\text{SDCP}}\neq\emptyset\).

Proposition 3.1

If \(\rho>L\) and \(\varOmega_{\mathrm{SDCP}}\neq\emptyset\), then the set \(\varOmega_{\mathrm{SDCP}}\) is a singleton.

Proof

If \(x,y\in\varOmega_{\mathrm{SDCP}}\), then

$$\begin{aligned}& \nabla h_{1}(x)\in\partial g_{1}(x),\qquad \nabla h_{1}(y)\in\partial g_{1}(y), \\& \nabla h_{2}(Ax)\in\partial g_{2}(Ax),\qquad \nabla h_{2}(Ay) \in\partial g_{2}(Ay). \end{aligned}$$

Since \(g_{1}\) is ρ-strongly convex, we know \(\partial g_{1}\) is ρ-strongly monotone. Thus,

$$ \rho \Vert x-y \Vert ^{2}\leq\bigl\langle x-y,\nabla h_{1}(x)-\nabla h_{1}(y) \bigr\rangle \leq \Vert x-y \Vert \cdot \bigl\Vert \nabla h_{1}(x)-\nabla h_{1}(y) \bigr\Vert . $$

Since \(\nabla h_{1}\) is L-Lipschitz continuous, we have

$$ \rho \Vert x-y \Vert ^{2}\leq \Vert x-y \Vert \cdot \bigl\Vert \nabla h_{1}(x)-\nabla h_{1}(y) \bigr\Vert \leq L \Vert x-y \Vert ^{2}. $$

Since \(\rho>L\), we have \(x=y\). The proof is completed. □

In this section, we study the split DC program by the following algorithm.

Algorithm 3.1

(Split proximal linearized algorithm)

$$ \textstyle\begin{cases} x_{1}\in H_{1}\quad \text{is chosen arbitrarily}, \\ y_{n}:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta_{n}} \Vert v-Ax _{n} \Vert ^{2}-\langle\nabla h_{2}(Ax_{n}),v-Ax_{n}\rangle\}, \\ z_{n}:=x_{n}-r_{n} A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta_{n}} \Vert u-z _{n} \Vert ^{2}-\langle\nabla h_{1}(z_{n}),u-z_{n}\rangle\},\quad n\in \mathbb{N}. \end{cases} $$

Theorem 3.1

Let \(\{r_{n}\}_{n\in\mathbb{N}}\) be a sequence in \((0,\frac {1}{\Vert A\Vert ^{2}})\) with \(0<\liminf_{n\rightarrow \infty}r_{n} \leq \limsup_{n\rightarrow \infty}r_{n}<\frac{1}{\Vert A\Vert ^{2}}\). Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 3.1. Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges to an element \(\bar{x}\in\varOmega_{\mathrm{SDCP}}\).

Proof

Take any \(w\in\varOmega_{\text{SDCP}}\) and \(n\in\mathbb{N}\), and let w and n be fixed. First, from the second line of Algorithm 3.1, we get

$$ 0\in\partial g_{2}(y_{n})+ \frac{1}{\beta_{n}}(y_{n}-Ax_{n})-\nabla h _{2}(Ax_{n}). $$
(3.1)

By (3.1), there exists \(u_{n}\in\partial g _{2}(y_{n})\) such that

$$ \nabla h_{2}(Ax_{n})=u_{n}+ \frac{1}{\beta_{n}}(y_{n}-Ax_{n}). $$
(3.2)

Since \(w\in\varOmega_{\text{SDCP}}\), we know that \(\nabla h_{2}(Aw) \in\partial g_{2}(Aw)\). By Lemma 2.2, \(\partial g_{2}\) is ρ-strongly monotone, and then

$$ 0\leq\bigl\langle y_{n}-Aw,u_{n}-\nabla h_{2}(Aw)\bigr\rangle -\rho \Vert y_{n}-Aw \Vert ^{2}. $$
(3.3)

By (3.2) and (3.3),

$$ 0\leq\biggl\langle y_{n}-Aw, \nabla h_{2}(Ax_{n})+\frac{1}{\beta_{n}}(Ax _{n}-y_{n})- \nabla h_{2}(Aw)\biggr\rangle -\rho \Vert y_{n}-Aw \Vert ^{2}. $$
(3.4)

Hence, by (3.4), we have

$$\begin{aligned} 0 \leq&2\beta_{n}\bigl\langle y_{n}-Aw,\nabla h_{2}(Ax_{n})-\nabla h_{2}(Aw) \bigr\rangle +2 \langle y_{n}-Aw,Ax_{n}-y_{n}\rangle \\ & {}-2\beta_{n}\rho \Vert y_{n}-Aw \Vert ^{2} \\ \leq&2\beta_{n}L \Vert y_{n}-Aw \Vert \cdot \Vert Ax_{n}-Aw \Vert -2\beta_{n}\rho \Vert y _{n}-Aw \Vert ^{2} \\ &{}+ \Vert Ax_{n}-Aw \Vert ^{2}- \Vert y_{n}-Ax_{n} \Vert ^{2}- \Vert y_{n}-Aw \Vert ^{2} \\ \leq& \beta_{n} L \Vert y_{n}-Aw \Vert ^{2}+ \beta_{n} L \Vert Ax_{n}-Aw \Vert ^{2}-2 \beta_{n}\rho \Vert y_{n}-Aw \Vert ^{2} \\ &{}+ \Vert Ax_{n}-Aw \Vert ^{2}- \Vert y_{n}-Ax_{n} \Vert ^{2}- \Vert y_{n}-Aw \Vert ^{2}. \end{aligned}$$
(3.5)

By (3.5), we obtain

$$\begin{aligned} \Vert y_{n}-Aw \Vert ^{2} \leq& \frac{\beta_{n}L+1}{1+2\beta_{n}\rho-\beta _{n}L} \Vert Ax_{n}-Aw \Vert ^{2}- \frac{ \Vert y _{n}-Ax_{n} \Vert ^{2}}{1+2\beta_{n}\rho-\beta_{n}L} \\ \leq& \Vert Ax_{n}-Aw \Vert ^{2}-\frac{ \Vert y_{n}-Ax_{n} \Vert ^{2}}{1+2\beta_{n} \rho-\beta_{n}L}. \end{aligned}$$
(3.6)

In the same way, one obtains

$$ \Vert x_{n+1}-w \Vert ^{2}\leq \Vert z_{n}-w \Vert ^{2} -\frac{1}{1+2\beta_{n}\rho- \beta_{n}L} \Vert x_{n+1}-z_{n} \Vert ^{2}\leq \Vert z_{n}-w \Vert ^{2}. $$
(3.7)

Next, we have

$$\begin{aligned} 2 \Vert z_{n}-w \Vert ^{2} =& 2\bigl\langle z_{n}-w,x_{n}-r_{n} A^{*}(Ax_{n}-y_{n})-w \bigr\rangle \\ =& 2\langle z_{n}-w,x_{n}-w\rangle-2r_{n} \bigl\langle z_{n}-w,A^{*}(Ax _{n}-y_{n})\bigr\rangle \\ =& 2\langle z_{n}-w,x_{n}-w\rangle-2r_{n} \langle Az_{n}-Aw,Ax_{n}-y _{n}\rangle \\ =& \Vert z_{n}-w \Vert ^{2}+ \Vert x_{n}-w \Vert ^{2}- \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y _{n} \Vert ^{2} \\ &{}-r_{n} \Vert Ax_{n}-Aw \Vert ^{2}+r_{n} \Vert Az_{n}-Ax_{n} \Vert ^{2}+r_{n} \Vert y_{n}-Aw \Vert ^{2}. \end{aligned}$$
(3.8)

By (3.6), (3.7), and (3.8),

$$\begin{aligned}& \Vert x_{n+1}-w \Vert ^{2} \\& \quad \leq \Vert z_{n}-w \Vert ^{2} \\& \quad = \Vert x_{n}-w \Vert ^{2}- \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y_{n} \Vert ^{2} \\& \qquad {}-r_{n} \Vert Ax_{n}-Aw \Vert ^{2}+r_{n} \Vert Az_{n}-Ax_{n} \Vert ^{2}+r_{n} \Vert y_{n}-Aw \Vert ^{2} \\& \quad \leq \Vert x_{n}-w \Vert ^{2}- \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y_{n} \Vert ^{2}-r _{n} \Vert Ax_{n}-Aw \Vert ^{2} \\& \qquad {}+r_{n} \Vert A \Vert ^{2}\cdot \Vert z_{n}-x_{n} \Vert ^{2}+r_{n}\cdot \frac{\beta_{n}L+1}{1+2 \beta_{n}\rho-\beta_{n}L} \Vert Ax_{n}-Aw \Vert ^{2} \\& \quad = \Vert x_{n}-w \Vert ^{2}-\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az _{n}-y_{n} \Vert ^{2} \\& \qquad {}-r_{n}\biggl(1-\frac{\beta_{n}L+1}{1+2\beta_{n}\rho-\beta _{n}L}\biggr) \Vert Ax_{n}-Aw \Vert ^{2} \\& \quad = \Vert x_{n}-w \Vert ^{2}-\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az _{n}-y_{n} \Vert ^{2} \\& \qquad {}-r_{n}\biggl(\frac{2\beta_{n}(\rho-L)}{1+2\beta_{n}\rho-\beta_{n}L}\biggr) \Vert Ax _{n}-Aw \Vert ^{2} \\& \quad \leq \Vert x_{n}-w \Vert ^{2}. \end{aligned}$$
(3.9)

By (3.9), \(\lim_{n\rightarrow \infty}\Vert x_{n}-w\Vert \) exists and \(\{x_{n}\}_{n\in\mathbb{N}}\) is a bounded sequence. Further, \(\{Ax_{n}\}_{n\in\mathbb{N}}\), \(\{y_{n}\}_{n\in\mathbb{N}}\), \(\{z_{n}\}_{n\in\mathbb{N}}\) are bounded sequences. By (3.9) again, we know that

$$ \lim_{n\rightarrow \infty} \Vert x_{n}-w \Vert = \lim_{n\rightarrow \infty } \Vert z_{n}-w \Vert , $$
(3.10)

and

$$ \lim_{n\rightarrow \infty}\frac{ \Vert x_{n+1}-z_{n} \Vert ^{2}}{1+2\beta _{n}\rho- \beta_{n}L}=\lim _{n\rightarrow \infty}r_{n} \Vert Az_{n}-y_{n} \Vert ^{2} = \lim_{n\rightarrow \infty}\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z_{n} \Vert ^{2}=0. $$
(3.11)

It follows from \(\{\beta_{n}\}_{n\in\mathbb{N}}\subseteq [a,b]\), \(0<\liminf_{n\rightarrow \infty}r_{n}\leq\limsup_{n\rightarrow \infty}r_{n}< \frac{1}{\Vert A\Vert ^{2}}\), and (3.11) that

$$ \lim_{n\rightarrow \infty} \Vert x_{n+1}-z_{n} \Vert =\lim_{n\rightarrow \infty} \Vert Az_{n}-y _{n} \Vert =\lim_{n\rightarrow \infty} \Vert x_{n}-z_{n} \Vert =0. $$
(3.12)

Since \(\{x_{n}\}_{n\in\mathbb{N}}\) is bounded, there exists a subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) of \(\{x_{n}\}_{n\in \mathbb{N}}\) such that \(x_{n_{k}}\rightarrow \bar{x}\in H_{1}\). Clearly, \(Ax_{n_{k}}\rightarrow A\bar{x}\), \(z_{n_{k}}\rightarrow \bar{x}\), \(Az_{n_{k}}\rightarrow A \bar{x}\), \(y_{n_{k}}\rightarrow A\bar{x}\), and \(x_{n_{k}+1}\rightarrow \bar{x}\). Further, by (3.2), we obtain

$$ \nabla h_{2}(Ax_{n_{k}})\in\partial g_{2}(y_{n_{k}})+\frac{1}{ \beta_{n_{k}}}(y_{n_{k}}-Ax_{n_{k}}). $$
(3.13)

By (3.12), (3.13), Lemma 2.3, and \(\{\beta_{n}\}_{n\in \mathbb{N}}\subseteq [a,b]\), we determine that

$$ \nabla h_{2}(A\bar{x})\in\partial g_{2}(A \bar{x}). $$
(3.14)

Similarly, we have

$$ \nabla h_{1}(\bar{x})\in\partial g_{1}( \bar{x}). $$
(3.15)

By (3.14) and (3.15), we know that \(\bar{x}\in\varOmega_{\text{SDCP}}\). Further, \(\lim_{n\rightarrow \infty}\Vert x_{n}-\bar{x}\Vert =\lim_{k\rightarrow \infty }\Vert x_{n_{k}}-\bar{x}\Vert =0\). Therefore, the proof is completed. □

Remark 3.1

  1. (i)

    In Algorithm 3.1, if \(y_{n}=Ax _{n}\) and \(x_{n+1}=z_{n}\), then \(x_{n}=z_{n}\), and this implies that \(\nabla h_{1}(x_{n})\in\partial g_{1}(x_{n})\) and \(\nabla h_{2}(Ax _{n})\in\partial g_{2}(Ax_{n})\). Thus, \(x_{n}\in\varOmega_{\text{SDCP}}\).

  2. (ii)

    In Algorithm 3.1, if \(x_{n+1} \neq z_{n}\), then \(f_{1}(x_{n+1})< f_{1}(z_{n})\). Indeed, it follows from the definition of \(x_{n+1}\) and the convexity of \(h_{1}\) (which gives \(\langle\nabla h_{1}(z_{n}),x_{n+1}-z_{n}\rangle\leq h_{1}(x_{n+1})-h_{1}(z_{n})\), since \(\partial h_{1}(z_{n})=\{\nabla h_{1}(z_{n})\}\)) that

    $$ g_{1}(x_{n+1})-h_{1}(x_{n+1})+ \frac{1}{2\beta_{n}} \Vert x_{n+1}-z_{n} \Vert ^{2} \leq g_{1}(z_{n})-h_{1}(z_{n}). $$

    So, if \(x_{n+1}\neq z_{n}\), then \(f_{1}(x_{n+1})< f_{1}(z_{n})\).

  3. (iii)

    In Algorithm 3.1, if \(y_{n} \neq Ax_{n}\), then \(f_{2}(y_{n})< f_{2}(Ax_{n})\). Indeed, it follows from the definition of \(y_{n}\) and the convexity of \(h_{2}\) (which gives \(\langle\nabla h_{2}(Ax_{n}),y_{n}-Ax_{n}\rangle\leq h_{2}(y_{n})-h_{2}(Ax_{n})\), since \(\partial h_{2}(Ax_{n})=\{\nabla h_{2}(Ax_{n})\}\)) that

    $$ g_{2}(y_{n})-h_{2}(y_{n})+ \frac{1}{2\beta_{n}} \Vert y_{n}-Ax_{n} \Vert ^{2} \leq g_{2}(Ax_{n})-h_{2}(Ax_{n}). $$

    So, if \(y_{n}\neq Ax_{n}\), then \(f_{2}(y_{n})< f_{2}(Ax_{n})\).

  4. (iv)

    If \(\rho> L\), then it follows from (3.7) that (3.9) can be rewritten as

    $$ \Vert x_{n+1}-w \Vert ^{2}\leq k_{n} \Vert z_{n}-w \Vert ^{2}\leq k_{n} \Vert x_{n}-w \Vert ^{2}, $$

    where

    $$ k_{n}:=\frac{1+\beta_{n}L}{1+2\beta_{n}\rho-\beta_{n}L}\in(0,1). $$

    Hence, \(\{x_{n}\}_{n\in\mathbb{N}}\) converges linearly to \(\bar{x}\), where \(\varOmega_{\text{SDCP}}=\{\bar{x}\}\) by Proposition 3.1; a worked instance of this rate is given below.
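
For instance, with the illustrative values \(\beta_{n}=1\), \(\rho=4\), and \(L=1\) (so that \(\rho>L\)), the contraction factor is

$$ k_{n}=\frac{1+\beta_{n}L}{1+2\beta_{n}\rho-\beta_{n}L}=\frac{1+1}{1+8-1}=\frac{1}{4}, $$

and hence \(\Vert x_{n+1}-\bar{x}\Vert \leq\frac{1}{2}\Vert x_{n}-\bar{x}\Vert \) for each \(n\in\mathbb{N}\).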

Remark 3.2

From the proof of Theorem 3.1, we know that

$$ \nabla h_{2}(Ax_{n})+\frac{1}{\beta_{n}}(Ax_{n}-y_{n}) \in\partial g _{2}(y_{n}), $$
(3.16)

and this implies that

$$ Ax_{n}+\beta_{n}\nabla h_{2}(Ax_{n}) \in y_{n}+\beta_{n}\partial g_{2}(y _{n})=(I_{H_{2}}+\beta_{n}\partial g_{2}) (y_{n}), $$
(3.17)

where \(I_{H_{2}}\) is the identity mapping on \(H_{2}\). Since \(g_{2}\) is proper, lower semicontinuous, and convex, we know that \(\partial g_{2}\) is maximal monotone. So, by Lemma 2.5, we determine that

$$ y_{n}=(I_{H_{2}}+\beta_{n}\partial g_{2})^{-1}\bigl(Ax_{n}+\beta_{n}\nabla h_{2}(Ax_{n})\bigr). $$
(3.18)

Similarly, we have

$$ x_{n+1}=(I_{H_{1}}+\beta_{n}\partial g_{1})^{-1}\bigl(z_{n}+\beta_{n} \nabla h_{1}(z_{n})\bigr), $$
(3.19)

where \(I_{H_{1}}\) is the identity mapping on \(H_{1}\). Therefore, Algorithm 3.1 can be rewritten as the following algorithm:

$$ \textstyle\begin{cases} y_{n}:=(I_{H_{2}}+\beta_{n}\partial g_{2})^{-1}(Ax_{n}+\beta_{n} \nabla h_{2}(Ax_{n})), \\ z_{n}:=x_{n}-r_{n} A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}:=(I_{H_{1}}+\beta_{n}\partial g_{1})^{-1}(z_{n}+\beta_{n} \nabla h_{1}(z_{n})), \quad n\in\mathbb{N}. \end{cases} $$
(Algorithm 3.2)

In fact, we observe that the idea of Algorithm 3.2 is the same as that of the algorithm proposed by Sun, Sampaio, and Candido [16, Algorithm 2.3]. Hence, a modified algorithm and related convergence theorems could be presented by using the idea of [16, Algorithm 5.3].

Remark 3.3

Under the assumptions in this section, consider the following:

$$ \textstyle\begin{cases} y:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta} \Vert v-Ax \Vert ^{2}- \langle\nabla h_{2}(Ax),v-Ax\rangle\}, \\ z:=x-rA^{*}(Ax-y), \\ w:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta} \Vert u-z \Vert ^{2}- \langle\nabla h_{1}(z),u-z\rangle\}, \end{cases} $$
(3.20)

that is,

$$ \textstyle\begin{cases} y:=(I_{H_{2}}+\beta\partial g_{2})^{-1}(Ax+\beta\nabla h_{2}(Ax)), \\ z:=x-r A^{*}(Ax-y), \\ w:=(I_{H_{1}}+\beta\partial g_{1})^{-1}(z+\beta\nabla h_{1}(z)), \end{cases} $$
(3.21)

we know that

$$ Ax=y\quad \text{and}\quad z=w \quad \Leftrightarrow\quad x=z\in \varOmega_{\text{SDCP}}. $$
(3.22)

Proof

For this equivalence relation, we only need to show that \(x=z\in \varOmega_{\text{SDCP}}\) implies that \(Ax=y\) and \(z=w\). Indeed, since \(x=z\in\varOmega_{\text{SDCP}}\), we know that \(\nabla h_{1}(z)\in \partial g_{1}(z)\) and \(\nabla h_{2}(Ax)\in\partial g_{2}(Ax)\). By Lemma 2.5,

$$ \textstyle\begin{cases} (I_{H_{2}}+\beta\partial g_{2})^{-1}(Ax+\beta\nabla h_{2}(Ax))=Ax, \\ (I_{H_{1}}+\beta\partial g_{1})^{-1}(z+\beta\nabla h_{1}(z))=z. \end{cases} $$
(3.23)

By (3.21) and (3.23), we know that \(Ax=y\) and \(z=w\). □

Remark 3.4

In Algorithm 3.1, if \(\beta_{n}=\beta\) and \(r_{n}=r\) for each \(n\in\mathbb{N}\), and \(x_{N+1}=x_{N}\) for some \(N\in\mathbb{N}\), then \(x_{n}=x_{N}\), \(y_{n}=y_{N}\), and \(z_{n}=z _{N}\) for each \(n\in\mathbb{N}\) with \(n\geq N\). By Theorem 3.1, we know that \(\lim_{n\rightarrow \infty}x_{n}=x _{N}\in\varOmega_{\text{SDCP}}\). So, \(x_{n+1}=x_{n}\) can be used as a stopping criterion in Algorithm 3.1. Further, from (3.21), we have

$$\begin{aligned} x=w\quad \Rightarrow& \quad x\in\varOmega_{\text{SDCP}} \\ \Rightarrow&\quad x\in\varOmega_{\text{SDCP}}\quad \text{and}\quad y=Ax \\ \Rightarrow&\quad x=z\in\varOmega_{\text{SDCP}}\quad \text{and}\quad y=Ax \\ \Rightarrow&\quad x=z=w\in\varOmega_{\text{SDCP}}\quad \text{and}\quad y=Ax \\ \Rightarrow&\quad x=w. \end{aligned}$$

This equivalence relation is important for the split DC program.

By Remark 3.4, we give the following result.

Proposition 3.2

Under the assumptions in this section, let

$$ \textstyle\begin{cases} y:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta} \Vert v-Ax \Vert ^{2}- \langle\nabla h_{2}(Ax),v-Ax\rangle\}, \\ z:=x-rA^{*}(Ax-y), \\ w:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta} \Vert u-z \Vert ^{2}- \langle\nabla h_{1}(z),u-z\rangle\}. \end{cases} $$
(3.24)

Then \(x\in\varOmega_{\mathrm{SDCP}}\) if and only if \(x=w\).

Next, we give another convergence theorem for the split proximal linearized algorithm under different assumptions on \(\{r_{n}\}_{n \in\mathbb{N}}\).

Theorem 3.2

Let \(\{r_{n}\}_{n\in\mathbb{N}}\) be a sequence in \((0,\frac {1}{\Vert A\Vert ^{2}})\) with \(\lim_{n\rightarrow \infty}r_{n}=0\) and \(\sum_{n=1}^{\infty }r_{n}=\infty\). Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 3.1. Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges to an element \(\bar{x}\in\varOmega_{\mathrm{SDCP}}\).

Proof

Take any \(w\in\varOmega_{\mathrm{SDCP}}\) and \(n\in\mathbb{N}\), and let w and n be fixed. From the proof of Theorem 3.1, we have

$$\begin{aligned}& \nabla h_{2}(Ax_{n})-\frac{1}{\beta_{n}}(y_{n}-Ax_{n}) \in\partial g _{2}(y_{n}), \end{aligned}$$
(3.25)
$$\begin{aligned}& \nabla h_{1}(z_{n})-\frac{1}{\beta_{n}}(x_{n+1}-z_{n}) \in\partial g _{1}(x_{n+1}), \end{aligned}$$
(3.26)

and

$$\begin{aligned} \Vert x_{n+1}-w \Vert ^{2} \leq& \Vert x_{n}-w \Vert ^{2}-\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z _{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y_{n} \Vert ^{2} \\ &{}-r_{n}\biggl(\frac{2\beta_{n}(\rho-L)}{1+2\beta_{n}\rho-\beta_{n}L}\biggr) \Vert Ax _{n}-Aw \Vert ^{2}-\frac{1}{1+2\beta_{n}\rho-\beta_{n}L} \Vert x_{n+1}-z_{n} \Vert ^{2} \\ \leq& \Vert x_{n}-w \Vert ^{2}. \end{aligned}$$
(3.27)

Further, the following are satisfied:

$$ \textstyle\begin{cases} \lim_{n\rightarrow \infty} \Vert x_{n}-w \Vert \text{ exists,} \\ \{x_{n}\}_{n\in\mathbb{N}}, \{Ax_{n}\}_{n\in\mathbb{N}}, \{y_{n}\} _{n\in\mathbb{N}}, \{z_{n}\}_{n\in\mathbb{N}} \text{ are bounded sequences,} \\ \lim_{n\rightarrow \infty} \Vert x_{n+1}-z_{n} \Vert =0, \\ \lim_{n\rightarrow \infty}(1-r_{n} \Vert A \Vert ^{2}) \Vert x_{n}-z_{n} \Vert ^{2}=0. \end{cases} $$

Since \(\lim_{n\rightarrow \infty}r_{n}=0\), we know that

$$ \lim_{n\rightarrow \infty} \Vert x_{n}-z_{n} \Vert =0. $$
(3.28)

By (3.27), we have

$$ \sum_{n=1}^{\infty}r_{n} \Vert Az_{n}-y_{n} \Vert ^{2}\leq\sum _{n=1}^{\infty }\bigl( \Vert x_{n}-w \Vert ^{2}- \Vert x_{n+1}-w \Vert ^{2}\bigr)< \infty. $$
(3.29)

By (3.29) and Lemma 2.4, we determine that

$$ \liminf_{n\rightarrow \infty} \Vert Az_{n}-y_{n} \Vert ^{2}=0. $$
(3.30)

Then there exist a subsequence \(\{y_{n_{k}}\}_{k\in\mathbb{N}}\) of \(\{y_{n}\}_{n\in\mathbb{N}}\), a subsequence \(\{z_{n_{k}}\}_{k\in \mathbb{N}}\) of \(\{z_{n}\}_{n\in\mathbb{N}}\), and \(\bar{x}\in H_{1}\) such that \(z_{n_{k}}\rightarrow \bar{x}\), \(y_{n_{k}}\rightarrow A\bar{x}\), and

$$ \liminf_{n\rightarrow \infty} \Vert Az_{n}-y_{n} \Vert ^{2}=\lim_{k\rightarrow \infty} \Vert Az _{n_{k}}-y_{n_{k}} \Vert ^{2}=0. $$
(3.31)

Clearly, \(x_{n_{k}}\rightarrow \bar{x}\), \(x_{n_{k}+1}\rightarrow \bar{x}\), and \(Ax_{n_{k}}\rightarrow A\bar{x}\). By (3.25), (3.26), and Lemma 2.3, we know that \(\bar{x}\in \varOmega_{\text{SDCP}}\). Since \(\lim_{n\rightarrow \infty}\Vert x_{n}-\bar{x}\Vert \) exists, we know \(\lim_{n\rightarrow \infty}\Vert x _{n}-\bar{x}\Vert =\lim_{k\rightarrow \infty}\Vert x_{n_{k}}-\bar{x}\Vert =0\). Therefore, the proof is completed. □

4 Application to the DC program and numerical results

Let \(\rho>L\geq0\). Let H be a finite dimensional real Hilbert space, and let \(g,h:H\rightarrow \mathbb{R}\) be proper, lower semicontinuous, and convex functions. We also assume that h is Fréchet differentiable, ∇h is L-Lipschitz continuous, and g is ρ-strongly convex. Let \(\{\beta_{n}\}_{n\in\mathbb{N}}\) be a sequence in \((a,b)\subseteq(0,\infty)\). Let \(\{r_{n}\}_{n\in \mathbb{N}}\) be a sequence in \((0,1)\) with \(0<\liminf_{n\rightarrow \infty}r _{n}\leq \limsup_{n\rightarrow \infty}r_{n}<1\).

Now, we recall the DC program:

$$ \text{Find }\bar{x}\in \operatorname{arg}\min_{x\in H}\bigl\{ f(x)=g(x)-h(x)\bigr\} . $$
(DCP)

Let \(\varOmega_{\text{DCP}}\) be defined by

$$\varOmega_{\text{DCP}}:=\bigl\{ x\in H: \nabla h(x)\in\partial g(x)\bigr\} , $$

and assume that \(\varOmega_{\text{DCP}}\neq\emptyset\). If \(H_{1}=H_{2}=H\), A is the identity mapping on H, \(g_{1}=g_{2}=g\), and \(h_{1}=h_{2}=h\), then we get the following algorithm and convergence theorem from Algorithm 3.1 and Theorem 3.1, respectively.

Algorithm 4.1

$$ \textstyle\begin{cases} x_{1}\in H\quad \text{is chosen arbitrarily}, \\ y_{n}:=\operatorname{arg}\min_{v\in H}\{g(v)+\frac{1}{2\beta_{n}} \Vert v-x_{n} \Vert ^{2}- \langle\nabla h(x_{n}),v-x_{n}\rangle\}, \\ z_{n}:=(1-r_{n})x_{n}+r_{n}y_{n}, \\ x_{n+1}:=\operatorname{arg}\min_{u\in H}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-z_{n} \Vert ^{2}- \langle\nabla h(z_{n}),u-z_{n}\rangle\},\quad n\in\mathbb{N}. \end{cases} $$

Theorem 4.1

Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 4.1. Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges to an element \(\bar {x}\in\varOmega_{\mathrm{DCP}}\).

In fact, from Algorithm 4.1 and Theorem 4.1 we can also obtain the following algorithm and convergence theorem, respectively.

Algorithm 4.2

$$ \textstyle\begin{cases} x_{1}\in H\quad \text{is given}, \\ z_{n}:=\operatorname{arg}\min_{u\in H}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-x_{n} \Vert ^{2}- \langle\nabla h(x_{n}),u-x_{n}\rangle\}, \\ y_{n}:=\operatorname{arg}\min_{v\in H}\{g(v)+\frac{1}{2\beta_{n}} \Vert v-z_{n} \Vert ^{2}- \langle\nabla h(z_{n}),v-z_{n}\rangle\}, \\ x_{n+1}:=(1-r_{n})z_{n}+r_{n}y_{n}, \quad n\in\mathbb{N}. \end{cases} $$

Theorem 4.2

Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 4.2. Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges to an element \(\bar {x}\in\varOmega_{\mathrm{DCP}}\).

Example 4.1

Let \(g,h:\mathbb{R}^{3}\rightarrow \mathbb{R}\) be defined by \(g(x_{1},x_{2},x_{3}):=2x_{1}^{2}+2x_{2}^{2}+2x_{3}^{2}\) and \(h(x_{1},x_{2},x_{3}):=4x_{1}+8x_{2}+12x_{3}\) for all \((x_{1},x_{2},x_{3})\in\mathbb{R}^{3}\). Then \(\varOmega_{\text{DCP}}:=\{x\in H: \nabla h(x)\in\partial g(x)\}=\{ (1,2,3)\}\).
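
For illustration, the following is a minimal Python sketch of Algorithm 4.2 applied to Example 4.1: each subproblem has the closed-form solution \(u=(p+\beta c)/(1+4\beta)\) obtained from its first-order optimality condition \(4u+(u-p)/\beta-c=0\). The starting point, the choice \(r_{n}=0.5\), and the stopping tolerance are illustrative assumptions (\(\beta_{n}=500\) is one of the settings discussed with Table 1).

```python
import numpy as np

# Algorithm 4.2 for Example 4.1: g(x) = 2*||x||^2, h(x) = c @ x, grad h = c.
# Each subproblem argmin_u { g(u) + ||u - p||^2/(2*beta) - <c, u - p> }
# has the closed-form solution u = (p + beta*c) / (1 + 4*beta).
c = np.array([4.0, 8.0, 12.0])

def prox_step(p, beta):
    return (p + beta * c) / (1.0 + 4.0 * beta)

x = np.zeros(3)               # x_1 chosen arbitrarily
beta, r = 500.0, 0.5          # illustrative parameters
for n in range(100):
    z = prox_step(x, beta)
    y = prox_step(z, beta)
    x_next = (1.0 - r) * z + r * y
    done = np.linalg.norm(x_next - x) < 1e-9
    x = x_next
    if done:
        break

print(n + 1, x)               # a few iterations suffice; x is close to (1, 2, 3)
```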

Example 4.2

Let \(g_{1},h_{1}:\mathbb{R}^{3}\rightarrow \mathbb{R}\) be defined by \(g_{1}(x_{1},x_{2},x_{3}):=2x_{1}^{2}+2x_{2}^{2}+2x_{3}^{2}\) and \(h_{1}(x_{1},x_{2},x_{3}):=4x_{1}+8x_{2}+12x_{3}\) for all \((x_{1},x_{2},x_{3})\in\mathbb{R}^{3}\). Let \(g_{2},h_{2}:\mathbb{R}^{2}\rightarrow \mathbb{R}\) be defined by \(g_{2}(y_{1},y_{2}):=y_{1}^{2}+y_{2}^{2}\) and \(h_{2}(y_{1},y_{2}):=28y_{1}+64y_{2}\) for all \((y_{1},y_{2})\in\mathbb {R}^{2}\). Let \(A:\mathbb{R}^{3}\rightarrow \mathbb{R}^{2}\) be defined by \(A(x_{1},x_{2},x_{3})=(x_{1}+2x_{2}+3x_{3},4x_{1}+5x_{2}+6x_{3})\) for all \((x_{1},x_{2},x_{3})\in\mathbb{R}^{3}\). Here, \(\Vert A\Vert \approx9.508\) (so \(1/\Vert A\Vert \approx0.10517\)) and \(\varOmega_{\text{SDCP}}=\{(1,2,3)\}\).
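
Similarly, a minimal Python sketch of Algorithm 3.1 applied to Example 4.2 is given below; the closed forms of the two subproblems follow from their first-order optimality conditions, the starting point and the number of iterations are illustrative, and \(\beta_{n}=1\), \(r_{n}=0.09\) is one of the parameter settings discussed with Tables 3 and 4.

```python
import numpy as np

# Algorithm 3.1 for Example 4.2.  The subproblems have closed forms:
#   y_n     = (A x_n + beta*c2) / (1 + 2*beta)   since g2(v) = ||v||^2,
#   x_{n+1} = (z_n  + beta*c1) / (1 + 4*beta)    since g1(u) = 2*||u||^2.
A  = np.array([[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0]])
c1 = np.array([4.0, 8.0, 12.0])     # grad h1
c2 = np.array([28.0, 64.0])         # grad h2

x = np.zeros(3)                     # x_1 chosen arbitrarily
beta, r = 1.0, 0.09                 # illustrative parameter setting
for _ in range(1000):
    y = (A @ x + beta * c2) / (1.0 + 2.0 * beta)
    z = x - r * A.T @ (A @ x - y)
    x = (z + beta * c1) / (1.0 + 4.0 * beta)

print(x)                            # approaches the unique point (1, 2, 3) in Omega_SDCP
```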

From Table 1, we see that Algorithm 4.2 reaches the required error tolerance in only six iterations when \(\beta_{n}=500\) for all \(n\in \mathbb{N}\), but needs 73 iterations when \(\beta_{n}=0.1\) for all \(n\in\mathbb{N}\).

Table 1 Numerical results for Example 4.1.

From Table 2, we see that Algorithm 4.2 reaches the required error tolerance in only seven iterations when \(\beta_{n}=100\) for all \(n\in\mathbb{N}\), but needs 72 iterations when \(\beta_{n}=0.1\) for all \(n\in\mathbb{N}\).

Table 2 Numerical results for Example 4.1.

From Table 3, we see that Algorithm 3.1 reaches the required error tolerance in only seven iterations when \(\beta_{n}=700\) for all \(n\in\mathbb{N}\), but needs 99 iterations when \(\beta_{n}=0.1\) for all \(n\in\mathbb{N}\).

Table 3 Numerical results for Example 4.2.

From Tables 3 and 4, we see that Algorithm 3.1 needs 283 iterations to reach the required error tolerance when \(\beta_{n}=1\) and \(r_{n}=0.05\) for all \(n\in\mathbb{N}\), but only 39 iterations when \(\beta_{n}=1\) and \(r_{n}=0.09\) for all \(n\in\mathbb{N}\). On the other hand, for the other settings of \(\beta_{n}\), the numerical results in Tables 3 and 4 show no significant differences between the choices of \(\{r_{n}\}_{n\in\mathbb{N}}\).

Table 4 Numerical results for Example 4.2.

However, Table 5 shows that, when \(\beta_{n}=1\) for all \(n\in\mathbb{N}\), the numerical results differ considerably depending on the choice of \(\{r_{n}\}_{n\in\mathbb{N}}\).

Table 5 Numerical results for Example 4.2.