
A closed-form bound on the asymptotic linear convergence of iterative methods via fixed point analysis

Original Paper · Optimization Letters

Abstract

In many iterative optimization methods, fixed-point theory enables the analysis of the convergence rate via the contraction factor associated with the linear approximation of the fixed-point operator. While this factor characterizes the asymptotic linear rate of convergence, it does not explain the non-linear behavior of these algorithms in the non-asymptotic regime. In this letter, we take into account the effect of the first-order approximation error and present a closed-form bound on the convergence in terms of the number of iterations required for the distance between the iterate and the limit point to reach an arbitrarily small fraction of the initial distance. Our bound includes two terms: one corresponds to the number of iterations required for the linearized version of the fixed-point operator and the other corresponds to the overhead associated with the approximation error. With a focus on the convergence in the scalar case, the tightness of the proposed bound is proven for positively quadratic first-order difference equations.


Notes

  1. \(\Vert \cdot \Vert\) denotes the Euclidean norm.

  2. A tighter version of the upper bound \(\overline{K}_2(\epsilon )\) is given in Appendix A, cf., (A8) and (A10).


Author information


Corresponding author

Correspondence to Trung Vu.

Ethics declarations

Conflict of interest

The authors have no competing interests that are relevant to the content of this article. Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Proof of Theorem 1

First, we establish a sandwich inequality on \(K(\epsilon )\) in the following lemma:

Lemma 1

For any \(0< \epsilon < 1\), let \(K(\epsilon )\) be the smallest integer such that for all \(k \ge K(\epsilon )\), we have \(a_k \le \epsilon a_0\). Then,

$$\begin{aligned} \underline{K}(\epsilon ) \triangleq F\bigl (\log (1/\epsilon )\bigr ) \le K(\epsilon ) \le F\bigl (\log (1/\epsilon )\bigr ) + b(\rho ,\tau ) \triangleq \overline{K}(\epsilon ) , \end{aligned}$$
(A1)

where \(b(\rho ,\tau )\) is defined in (7) and

$$\begin{aligned} F(x) = \int _0^x f(t) dt \quad \text { with } \quad f(x)&= \frac{1}{-\log \bigl (\rho + \tau (1-\rho ) e^{-x} \bigr )} . \end{aligned}$$
(A2)

The lemma provides an upper bound on \(K(\epsilon )\). Moreover, the bound is tight in the sense that the gap between the lower bound \(\underline{K}(\epsilon )\) and the upper bound \(\overline{K}(\epsilon )\) is independent of \(\epsilon\); consequently, the ratio \(K(\epsilon )/\overline{K}(\epsilon )\) approaches 1 as \(\epsilon \rightarrow 0\). Next, we proceed to obtain a tight closed-form upper bound on \(\overline{K}(\epsilon )\) by upper-bounding \(F(\log (1/\epsilon ))\).
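
As a numerical sanity check (not part of the original argument), the following Python sketch iterates the scalar quadratic recursion \(a_{k+1} = \rho a_k + q a_k^2\) assumed to underlie (3) (consistent with the substitution in (A11)) and compares the exact \(K(\epsilon )\) with the two sides of (A1); the constant \(b(\rho ,\tau )\) is taken here in the form in which it enters via (A28), i.e., \(1 + \frac{1}{2\rho }\log \bigl (\frac{\log \rho }{\log (\rho +\tau (1-\rho ))}\bigr )\).

```python
import numpy as np
from scipy.integrate import quad

rho, tau, a0 = 0.9, 0.5, 1.0        # contraction factor, normalized perturbation, initial distance
q = tau * (1 - rho) / a0            # so that tau = a0*q/(1-rho), cf. (A11)

def K_exact(eps):
    """Smallest K with a_K <= eps*a0; valid since {a_k} is monotonically decreasing."""
    a, k = a0, 0
    while a > eps * a0:
        a = rho * a + q * a * a     # the scalar recursion assumed for (3)
        k += 1
    return k

def F(x):                           # F(x) = int_0^x f(t) dt, cf. (A2)
    f = lambda t: 1.0 / -np.log(rho + tau * (1 - rho) * np.exp(-t))
    return quad(f, 0.0, x)[0]

b = 1 + np.log(np.log(rho) / np.log(rho + tau * (1 - rho))) / (2 * rho)  # b(rho,tau) via (A28)

for eps in [1e-2, 1e-4, 1e-8]:
    lower = F(np.log(1 / eps))
    print(f"eps={eps:.0e}: {lower:7.2f} <= K(eps) = {K_exact(eps)} <= {lower + b:7.2f}")
```

As Lemma 1 predicts, the exact \(K(\epsilon )\) stays within a band of constant width \(b(\rho ,\tau )\) above \(F(\log (1/\epsilon ))\).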

Lemma 2

Consider the function \(F(\cdot )\) given in (A2). For \(0<\epsilon <1\), we have

$$F(\log (1/\epsilon ))\le \frac{\log (1/\epsilon )}{\log (1/\rho )} + \frac{\Delta E_1 (\log \frac{1}{\rho +\tau (1-\rho )} , \log \frac{1}{\rho +\epsilon \tau (1-\rho )})}{\rho \log (1/\rho )} \triangleq \overline{F}_1 (\log (1/\epsilon ) )$$
(A3)
$$\le \frac{\log (1/\epsilon )}{\log (1/\rho )} + \frac{\Delta E_1 \Bigl (\log \frac{1}{\rho +\tau (1-\rho )} , \log \frac{1}{\rho }\Bigr )}{\rho \log (1/\rho )} \triangleq \overline{F}_2 \bigl (\log (1/\epsilon )\bigr )$$
(A4)

and

$$\begin{aligned} F\bigl (\log (1/\epsilon )\bigr )&\ge \overline{F}_1\bigl (\log (1/\epsilon )\bigr ) - A(\epsilon ) \triangleq \underline{F}_1 \bigl (\log (1/\epsilon )\bigr ) , \end{aligned}$$
(A5)

where

$$\begin{aligned} A(\epsilon )&\triangleq \frac{\Delta E_1\bigl (2\log \frac{1}{\rho +\tau (1-\rho )} , 2\log \frac{1}{\rho +\tau (1-\rho )\epsilon }\bigr ) - \rho \Delta E_1\bigl (\log \frac{1}{\rho +\tau (1-\rho )} , \log \frac{1}{\rho +\tau (1-\rho )\epsilon }\bigr )}{2\rho ^2 \log (1/\rho )} . \end{aligned}$$
(A6)

Lemma 2 offers two upper bounds on \(F(\log (1/\epsilon ))\) and one lower bound. The first bound \(\overline{F}_1(\log (1/\epsilon ))\) approximates the behavior of \(F(\log (1/\epsilon ))\) well for both small and large values of \(\log (1/\epsilon )\). The second bound \(\overline{F}_2 (\log (1/\epsilon ))\) is linear in \(\log (1/\epsilon )\). Moreover, the gap between \(F(\log (1/\epsilon ))\) and \(\underline{F}_1(\log (1/\epsilon ))\) is at most \(A(\epsilon )\), which can be upper-bounded by \(A(0)\) since \(A(\cdot )\) is monotonically decreasing on \([0,1)\). While \(F(\cdot )\) asymptotically increases like \(\log (1/\epsilon )/ \log (1/\rho )\), this gap approaches a constant independent of \(\epsilon\). Replacing \(F(\log (1/\epsilon ))\) on the RHS of (A1) by either of the upper bounds in Lemma 2, we obtain two corresponding bounds on \(K(\epsilon )\):

$$\begin{aligned} \overline{K}_1(\epsilon ) \triangleq \overline{F}_1\bigl (\log (1/\epsilon )\bigr ) + b(\rho ,\tau ) \le \overline{F}_2\bigl (\log (1/\epsilon )\bigr ) + b(\rho ,\tau ) \triangleq \overline{K}_2(\epsilon ) , \end{aligned}$$
(A7)

where we note that \(\overline{K}_2(\epsilon )\) has the same expression as in (5). Moreover, the tightness of these two upper bounds can be shown as follows. First, using the first inequality in (A1) and then the lower bound on \(F(\log (1/\epsilon ))\) in (A5), the gap between \(\overline{K}_1(\epsilon )\) and \(K(\epsilon )\) can be bounded by

$$\begin{aligned} \overline{K}_1(\epsilon ) - K(\epsilon )&\le \overline{K}_1(\epsilon ) - F\bigl (\log (1/\epsilon )\bigr ) \nonumber \\&\le \overline{K}_1(\epsilon ) - \Bigl ( \overline{F}_1\bigl (\log (1/\epsilon )\bigr ) - A(\epsilon ) \Bigr ) \nonumber \\&= \Bigl ( \overline{F}_1\bigl (\log (1/\epsilon )\bigr ) + b(\rho ,\tau ) \Bigr ) - \Bigl ( \overline{F}_1\bigl (\log (1/\epsilon )\bigr ) - A(\epsilon ) \Bigr ) \nonumber \\&= A(\epsilon ) + b(\rho ,\tau ) \nonumber \\&\le A(0) + b(\rho ,\tau ) , \end{aligned}$$
(A8)

where the last inequality stems from the monotonicity of \(A(\cdot )\) on [0, 1). Note that the bound in (A8) holds uniformly in \(\epsilon\), implying that \(\overline{K}_1(\epsilon )\) is a tight bound on \(K(\epsilon )\). Second, using (A7), the gap between \(\overline{K}_2(\epsilon )\) and \(K(\epsilon )\) can be represented as

$$\begin{aligned} \overline{K}_2(\epsilon ) - K(\epsilon )&= \bigl ( \overline{K}_2(\epsilon ) - \overline{K}_1(\epsilon ) \bigr ) + \bigl ( \overline{K}_1(\epsilon ) - K(\epsilon ) \bigr ) \nonumber \\&= \bigl ( \overline{F}_2\bigl (\log (1/\epsilon )\bigr ) - \overline{F}_1\bigl (\log (1/\epsilon )\bigr ) \bigr ) + \bigl ( \overline{K}_1(\epsilon ) - K(\epsilon ) \bigr ) \nonumber \\&\le \bigl ( \overline{F}_2\bigl (\log (1/\epsilon )\bigr ) - \overline{F}_1\bigl (\log (1/\epsilon )\bigr ) \bigr ) + \bigl ( A(0) + b(\rho ,\tau ) \bigr ) , \end{aligned}$$
(A9)

where the last inequality stems from (A8). Furthermore, using the definition of \(\overline{F}_1(\log (1/\epsilon ))\) and \(\overline{F}_2(\log (1/\epsilon ))\) in (A3) and (A4), respectively, we have \(\lim _{\epsilon \rightarrow 0} (\overline{F}_2(\log (1/\epsilon )) - \overline{F}_1(\log (1/\epsilon ))) = 0\). Thus, taking the limit \(\epsilon \rightarrow 0\) on both sides of (A9), we obtain

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \bigl ( \overline{K}_2(\epsilon ) - K(\epsilon ) \bigr )&\le A(0) + b(\rho ,\tau ) . \end{aligned}$$
(A10)

We note that \(\overline{K}_2(\epsilon )\) is a simple bound that is linear in \(\log (1/\epsilon )\) and approaches the upper bound \(\overline{K}_1(\epsilon )\) in the asymptotic regime (\(\epsilon \rightarrow 0\)). Evaluating A(0) from (A6) and substituting it back into (A10) yields (8), which completes our proof of Theorem 1. Figure 1 (right) depicts the aforementioned bounds on \(K(\epsilon )\). It can be seen from the plot that all four bounds match the asymptotic rate of increase of \(K(\epsilon )\) (for large values of \(1/\epsilon\)). The three bounds \(\underline{K}(\epsilon )\) (red), \(\overline{K}(\epsilon )\) (yellow), and \(\overline{K}_1(\epsilon )\) (purple) closely follow \(K(\epsilon )\) (blue), indicating that the integral function \(F(\cdot )\) effectively estimates the minimum number of iterations required to achieve \(a_k \le \epsilon a_0\) in this setting. The upper bound \(\overline{K}_2(\epsilon )\) (green) becomes tangent to \(\overline{K}_1(\epsilon )\) as \(1/\epsilon \rightarrow \infty\) (i.e., as \(\epsilon \rightarrow 0\)).
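
These comparisons can be reproduced numerically. The sketch below assumes \(\Delta E_1(x,y) = E_1(x) - E_1(y)\), with \(E_1\) the exponential integral (consistent with how these terms arise from (A30)), and evaluates \(F(\log (1/\epsilon ))\) by quadrature alongside the closed forms in (A3) and (A4).

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import exp1                      # E_1(x) = int_x^inf e^{-z}/z dz

rho, tau = 0.9, 0.5
dE1 = lambda x, y: exp1(x) - exp1(y)                # Delta E_1(x, y), as assumed here

def F(x):                                           # direct quadrature of (A2)
    f = lambda t: 1.0 / -np.log(rho + tau * (1 - rho) * np.exp(-t))
    return quad(f, 0.0, x)[0]

def F1_bar(eps):                                    # closed form (A3)
    z_lo = np.log(1.0 / (rho + tau * (1 - rho)))
    z_hi = np.log(1.0 / (rho + eps * tau * (1 - rho)))
    return np.log(1 / eps) / np.log(1 / rho) + dE1(z_lo, z_hi) / (rho * np.log(1 / rho))

def F2_bar(eps):                                    # closed form (A4)
    z_lo = np.log(1.0 / (rho + tau * (1 - rho)))
    return np.log(1 / eps) / np.log(1 / rho) + dE1(z_lo, np.log(1 / rho)) / (rho * np.log(1 / rho))

for eps in [1e-1, 1e-3, 1e-6]:
    x = np.log(1 / eps)
    print(f"eps={eps:.0e}: F={F(x):8.3f}  F1_bar={F1_bar(eps):8.3f}  F2_bar={F2_bar(eps):8.3f}")
```

The printed gap between \(\overline{F}_2\) and \(\overline{F}_1\) shrinks as \(\epsilon \rightarrow 0\), consistent with the tangency observed in Figure 1 (right).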

Appendix A.1: Proof of Lemma 1

Let \(d_k=\log ({a_0}/{a_k})\) for each \(k \in {\mathbb N}\). Substituting \(a_k = a_0 e^{-d_k}\) into (3), we obtain the surrogate sequence \(\{d_k\}_{k=0}^\infty\):

$$\begin{aligned} d_{k+1} = d_k - \log \bigl (\rho + \tau (1-\rho ) e^{-d_k} \bigr ) , \end{aligned}$$
(A11)

where \(d_0=0\) and \(\tau = a_0 q/(1-\rho ) \in (0,1)\). Since \(\{a_k\}_{k=0}^\infty\) is monotonically decreasing to 0 and \(d_k\) is monotonically decreasing as a function of \(a_k\), \(\{d_k\}_{k=0}^\infty\) is a monotonically increasing sequence. Our key steps in this proof are first to tightly bound the index \(K \in \mathbb {N}\) using \(F(d_K)\)

$$\begin{aligned} F(d_K) \le K \le F(d_K) + \frac{1}{2\rho } \log \biggl ( \frac{\log \rho }{\log \bigl (\rho +\tau (1-\rho )\bigr )} \biggr ) \end{aligned}$$
(A12)

and then to obtain (A1) from (A12) using the monotonicity of the sequence \(\{d_k\}_{k=0}^\infty\) and of the function \(F(\cdot )\). We proceed with the details of each of the steps in the following.
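
Before detailing the two steps, here is a direct numerical illustration of the sandwich (A12), under the same assumed scalar recursion as above: iterate (A11) and check that each index K is trapped between \(F(d_K)\) and \(F(d_K)\) plus the additive constant.

```python
import numpy as np
from scipy.integrate import quad

rho, tau = 0.9, 0.5
f = lambda t: 1.0 / -np.log(rho + tau * (1 - rho) * np.exp(-t))
F = lambda x: quad(f, 0.0, x)[0]
const = np.log(np.log(rho) / np.log(rho + tau * (1 - rho))) / (2 * rho)

d = 0.0                                              # d_0 = 0
for K in range(1, 201):
    d -= np.log(rho + tau * (1 - rho) * np.exp(-d))  # d_K from d_{K-1}, cf. (A11)
    FK = F(d)
    assert FK - 1e-9 <= K <= FK + const + 1e-9       # the sandwich (A12)
print("sandwich (A12) verified for K = 1, ..., 200")
```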

Step 1: We prove (A12) by first showing the lower bound on K and then the upper bound. Using (A2), we can rewrite (A11) as \(d_{k+1}=d_k+1/f(d_k)\). Rearranging this equation yields

$$\begin{aligned} f(d_k) (d_{k+1} - d_k) = 1 . \end{aligned}$$
(A13)

Since f(x) is monotonically decreasing, we obtain the lower bound on K in (A12) by

$$\begin{aligned} F(d_K)&= \int _0^{d_K} f(x) dx = \sum _{k=0}^{K-1} \int _{d_k}^{d_{k+1}} f(x) dx \nonumber \\&\le \sum _{k=0}^{K-1} \int _{d_k}^{d_{k+1}} f(d_k) dx = \sum _{k=0}^{K-1} f(d_k) (d_{k+1}-d_k) = K , \end{aligned}$$
(A14)

where the last equality stems from (A13). For the upper bound on K in (A12), we use the convexity of \(f(\cdot )\) to lower-bound \(F(d_K)\) as follows

$$\begin{aligned} F(d_K)&= \sum _{k=0}^{K-1} \int _{d_k}^{d_{k+1}} f(x) dx \ge \sum _{k=0}^{K-1} \int _{d_k}^{d_{k+1}} \bigl ( f(d_k) + f'(d_k) (x-d_k) \bigr ) dx \nonumber \\&= \sum _{k=0}^{K-1} \Bigl ( f(d_k) (d_{k+1}-d_k) + \frac{1}{2} f'(d_k) (d_{k+1}-d_k)^2 \Bigr ) . \end{aligned}$$
(A15)

Using (A13) and substituting \(f'(x) = -\bigl (f(x)\bigr )^2 \frac{\tau (1-\rho ) e^{-x}}{\rho +\tau (1-\rho ) e^{-x}}\) into the RHS of (A15), we obtain

$$\begin{aligned} F(d_K)&\ge K - \frac{1}{2} \sum _{k=0}^{K-1} \frac{\tau (1-\rho ) e^{-d_k}}{\rho +\tau (1-\rho ) e^{-d_k}} . \end{aligned}$$
(A16)

Note that (A16) already offers an upper bound on K in terms of \(F(d_K)\). To obtain the upper bound on K in (A12) from (A16), it suffices to show that

$$\begin{aligned} \sum _{k=0}^{K-1} \frac{\tau (1-\rho ) e^{-d_k}}{\rho +\tau (1-\rho ) e^{-d_k}}&\le \frac{1}{\rho } \log \biggl ( \frac{\log \rho }{\log \bigl (\rho +\tau (1-\rho )\bigr )} \biggr ) . \end{aligned}$$
(A17)

In the following, we prove (A17) by introducing the functions

$$\begin{aligned} g(x) = \frac{\tau (1-\rho ) e^{-x}}{\rho +\tau (1-\rho ) e^{-x}} \frac{1}{-\log \bigl (\rho + \tau (1-\rho ) e^{-x} \bigr )} \end{aligned}$$
(A18)

and

$$\begin{aligned} G(x) = \int _0^{x} g(t) dt =\log \Bigl (\frac{\log (\rho +\tau (1-\rho )e^{-x})}{\log (\rho +\tau (1-\rho ))}\Bigr ) . \end{aligned}$$
(A19)

Note that \(g(\cdot )\) is monotonically decreasing (a product of two decreasing functions) while \(G(\cdot )\) is monotonically increasing (an integral of a non-negative function) on \([0,\infty )\). We have

$$\begin{aligned} G(d_K)&= \int _0^{d_K} g(x) dx = \sum _{k=0}^{K-1} \int _{d_k}^{d_{k+1}} g(x) dx \ge \sum _{k=0}^{K-1} \int _{d_k}^{d_{k+1}} g(d_{k+1}) dx \nonumber \\&= \sum _{k=0}^{K-1} g(d_{k+1}) (d_{k+1}-d_k) = \sum _{k=0}^{K-1} \frac{g(d_{k+1})}{g(d_k)} g(d_k) (d_{k+1}-d_k) . \end{aligned}$$
(A20)

Lemma 3

For any \(k \in \mathbb {N}\), we have \({g(d_{k+1})}/{g(d_k)} \ge \rho\).

Proof

For \(k\in \mathbb {N}\), let \(t_k = \rho + \tau (1-\rho ) e^{-d_k} \in (\rho ,1)\). From (A11), we have \(t_k = e^{-(d_{k+1}-d_k)}\) and \(t_{k+1} = \rho + \tau (1-\rho ) e^{-d_{k+1}} = \rho + \tau (1-\rho ) e^{-d_{k}} e^{-(d_{k+1}-d_k)} = \rho + (t_k-\rho )t_k\). Substituting \(d_k\) for x in g(x) from (A18) and replacing \(\rho + \tau (1-\rho ) e^{-d_k}\) with \(t_k\) yield \(g(d_k) = \frac{\tau (1-\rho ) e^{-d_{k}}}{t_k} \frac{1}{-\log (t_k)}\). Repeating the same process to obtain \(g(d_{k+1})\) and taking the ratio between \(g(d_{k+1})\) and \(g(d_k)\), we obtain

$$\begin{aligned} \frac{g(d_{k+1})}{g(d_k)} = e^{-(d_{k+1}-d_k)} \frac{t_k}{t_{k+1}} \frac{\log (t_k)}{\log (t_{k+1})} . \end{aligned}$$
(A21)

Substituting \(e^{-(d_{k+1}-d_k)} = t_k\) and \(t_{k+1} = \rho + (t_k-\rho )t_k\) into (A21) yields

$$\begin{aligned} \frac{g(d_{k+1})}{g(d_k)} = \frac{t_k^2 \log (t_k)}{(\rho +(t_k-\rho )t_k)\log (\rho +(t_k-\rho )t_k)} . \end{aligned}$$
(A22)

We now bound the ratio \({g(d_{k+1})}/{g(d_k)}\) by bounding the RHS of (A22). Since \(t_k-\rho > 0\) and \(t_k<1\), we have \(t_k-\rho > (t_k-\rho ) t_k\) and hence \({t_k}/{(\rho +(t_k-\rho )t_k)} > 1\). Moreover, both \(\log (t_k)\) and \(\log (\rho +(t_k-\rho )t_k)\) are negative, so the second factor of the RHS of (A22) is positive. Thus, in order to prove \(\frac{g(d_{k+1})}{g(d_k)} \ge \rho\), it remains to show that

$$\begin{aligned} \frac{t_k \log (t_k)}{\log \bigl (\rho +(t_k-\rho )t_k \bigr )} \ge \rho . \end{aligned}$$
(A23)

By the concavity of \(\log (\cdot )\), it holds that \(\log \bigl (\frac{\rho }{t_k} \cdot 1+\frac{t_k-\rho }{t_k}\, t_k \bigr ) \ge \frac{\rho }{t_k} \log (1) + \frac{t_k-\rho }{t_k} \log (t_k) = \bigl (1-\frac{\rho }{t_k}\bigr ) \log (t_k)\). Adding \(\log (t_k)\) to both sides of the last inequality yields \(\log \bigl (\rho +(t_k-\rho )t_k \bigr ) \ge \bigl (2-\frac{\rho }{t_k}\bigr ) \log (t_k)\). Now using the fact that \(\bigl (\sqrt{\rho /t_k} - \sqrt{t_k/\rho }\bigr )^2 \ge 0\), we have \(2-\rho /t_k \le t_k/\rho\). By this inequality and the negativity of \(\log (t_k)\), we have \(\log \bigl (\rho +(t_k-\rho )t_k \bigr ) \ge \frac{t_k}{\rho } \log (t_k)\). Multiplying both sides by the negative ratio \(\rho /\log \bigl (\rho +(t_k-\rho )t_k \bigr )\) and reversing the direction of the inequality yields (A23), which completes our proof of the lemma.
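
Since the proof reduces Lemma 3 to the scalar inequality (A23) on \(t_k \in (\rho , 1)\), both (A23) and the bound on the RHS of (A22) are easy to check numerically on a grid; a minimal sketch:

```python
import numpy as np

rho = 0.9
t = np.linspace(rho + 1e-6, 1 - 1e-6, 100_000)       # t_k ranges over (rho, 1)
t_next = rho + (t - rho) * t                          # t_{k+1} = rho + (t_k - rho) t_k

assert np.all(t * np.log(t) / np.log(t_next) >= rho)                 # (A23)
assert np.all(t**2 * np.log(t) / (t_next * np.log(t_next)) >= rho)   # RHS of (A22) >= rho
print("(A22) and (A23) hold on the grid, confirming Lemma 3")
```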

Returning to the proof of Lemma 1, applying Lemma 3 to (A20) and substituting \(d_{k+1}-d_k = -\log (\rho + \tau (1-\rho ) e^{-d_k})\) from (A11) and \(g(d_k)\) from (A18), we have

$$\begin{aligned} G(d_K)&\ge \sum _{k=0}^{K-1} \rho g(d_k) (d_{k+1}-d_k) = \rho \sum _{k=0}^{K-1} \frac{\tau (1-\rho ) e^{-d_k}}{\rho +\tau (1-\rho ) e^{-d_k}} . \end{aligned}$$
(A24)

Using the monotonicity of \(G(\cdot )\), we upper-bound \(G(d_K)\) by

$$\begin{aligned} G(d_K) \le G(\infty ) = \log \Bigl ( \frac{\log \rho }{\log (\rho +\tau (1-\rho ))} \Bigr ) . \end{aligned}$$
(A25)

Thus, the RHS of (A24) is upper-bounded by the RHS of (A25). Dividing the result by \(\rho\), we obtain (A17). This completes the proof of the upper bound on K in (A12), and thereby the first step.

Step 2: With both the lower and the upper bound on K in (A12) established, we proceed to show (A1). By the definition of \(K(\epsilon )\), \(a_{K(\epsilon )} \le \epsilon a_0 < a_{K(\epsilon )-1}\). Since \(d_k=\log ({a_0}/{a_k})\) for \(k \in {\mathbb N}\), we have \(d_{K(\epsilon )-1} \le \log (1/\epsilon ) \le d_{K(\epsilon )}\). On the one hand, using the monotonicity of \(F(\cdot )\) and substituting \(K=K(\epsilon )\) into the lower bound on K in (A12) yields

$$\begin{aligned} F\bigl (\log (1/\epsilon )\bigr ) \le F(d_{K(\epsilon )}) \le K(\epsilon ) . \end{aligned}$$
(A26)

On the other hand, substituting \(K=K(\epsilon )-1\) into the upper bound on K in (A12), we obtain

$$\begin{aligned} K(\epsilon ) - 1 \le F(d_{K(\epsilon )-1}) + \frac{1}{2\rho }\log \biggl ( \frac{\log \rho }{\log \bigl (\rho +\tau (1-\rho )\bigr )} \biggr ) . \end{aligned}$$
(A27)

Since \(F(\cdot )\) is monotonically increasing and \(d_{K(\epsilon )-1} \le \log (1/\epsilon )\), we have \(F(d_{K(\epsilon )-1}) \le F(\log (1/\epsilon ))\). Therefore, upper-bounding \(F(d_{K(\epsilon )-1})\) on the RHS of (A27) by \(F(\log (1/\epsilon ))\) yields

$$\begin{aligned} K(\epsilon ) \le F\bigl (\log (1/\epsilon )\bigr ) + \frac{1}{2\rho }\log \biggl ( \frac{\log \rho }{\log \bigl (\rho +\tau (1-\rho )\bigr )} \biggr ) + 1 . \end{aligned}$$
(A28)

The inequality (A1) follows by combining (A26) and (A28).

Appendix A.2: Proof of Lemma 2

Let \(\nu =\tau (1-\rho ) /\rho\). We represent f(x) in the interval \((0,\log (1/\epsilon ))\) as

$$\begin{aligned} f(x)&= \frac{1}{-\log \bigl (\rho + \tau (1-\rho ) e^{-x} \bigr )} \\&= \frac{1}{\log (1/\rho )} + \frac{1}{\log (1/\rho )} \frac{\log (1+\nu e^{-x})}{\log (1/\rho )-\log (1+\nu e^{-x})} . \end{aligned}$$

Then, taking the integral from 0 to \(\log (1/\epsilon )\) yields

$$\begin{aligned} F\bigl (\log (1/\epsilon )\bigr ) = \frac{1}{\log (1/\rho )} \biggl ( \log (1/\epsilon ) + \int _0^{\log (1/\epsilon )} \frac{\log (1+\nu e^{-t})}{\log (1/\rho )-\log (1+\nu e^{-t})} dt \biggr ) . \end{aligned}$$
(A29)

Using \(\alpha (1-\alpha /2)=\alpha -\alpha ^2/2 \le \log (1+\alpha ) \le \alpha\), for \(\alpha =\nu e^{-t} \ge 0\), on the numerator within the integral in (A29) and changing the integration variable t to \(z = \log (1/\rho )-\log (1+ \nu e^{-t})\), we obtain both an upper bound and a lower bound on the integral on the RHS of (A29)

$$\begin{aligned} \frac{1}{\rho } \int ^{\overline{z}}_{\underline{z}} \frac{e^{-z} - \frac{1}{2\rho }e^{-z}(e^{-z}-\rho )}{z} dz&\le \int _0^{\log (1/\epsilon )} \frac{\log (1+\nu e^{-t})}{\log (1/\rho )-\log (1+\nu e^{-t})} dt \nonumber \\&\le \frac{1}{\rho } \int ^{\overline{z}}_{\underline{z}} \frac{e^{-z}}{z} dz , \end{aligned}$$
(A30)

where \(\underline{z} = -\log (\rho +\tau (1-\rho ))\) and \(\overline{z} = -\log (\rho +\epsilon \tau (1-\rho ))\). Replacing the integral in (A29) by the upper bound and lower bound from (A30), using the definition of the exponential integral, and simplifying, we obtain the upper bound on \(F(\log (1/\epsilon ))\) given by \(\overline{F}_1(\log (1/\epsilon ))\) in (A3) and similarly the lower bound on \(F(\log (1/\epsilon ))\) given by \(\underline{F}_1(\log (1/\epsilon ))\) in (A5). Finally, we prove the second upper bound in (A4) as follows. Since \(E_1(\cdot )\) is monotonically decreasing and \(\frac{1}{\rho +\epsilon \tau (1-\rho )} \le \frac{1}{\rho }\), we have \(E_1(\log \frac{1}{\rho +\epsilon \tau (1-\rho )}) \ge E_1(\log \frac{1}{\rho })\), which implies \(\Delta E_1 (\log \frac{1}{\rho +\tau (1-\rho )} , \log \frac{1}{\rho +\epsilon \tau (1-\rho )}) \le \Delta E_1 (\log \frac{1}{\rho +\tau (1-\rho )} , \log \frac{1}{\rho })\). Combining this with the definition of \(\overline{F}_1(\log (1/\epsilon ))\) and \(\overline{F}_2(\log (1/\epsilon ))\) in (A3) and (A4), respectively, we conclude that \(\overline{F}_1(\log (1/\epsilon )) \le \overline{F}_2(\log (1/\epsilon ))\), which completes the proof of the lemma.
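
A quadrature-based check of the resulting sandwich \(\overline{F}_1(\log (1/\epsilon )) - A(\epsilon ) \le F(\log (1/\epsilon )) \le \overline{F}_1(\log (1/\epsilon ))\), again under the assumption that \(\Delta E_1(x,y) = E_1(x) - E_1(y)\):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import exp1

rho, tau = 0.9, 0.5
dE1 = lambda x, y: exp1(x) - exp1(y)                # Delta E_1, as assumed above
F = lambda x: quad(lambda t: 1.0 / -np.log(rho + tau * (1 - rho) * np.exp(-t)), 0.0, x)[0]

for eps in [1e-1, 1e-3, 1e-6]:
    z_lo = np.log(1.0 / (rho + tau * (1 - rho)))
    z_hi = np.log(1.0 / (rho + eps * tau * (1 - rho)))
    F1 = np.log(1 / eps) / np.log(1 / rho) + dE1(z_lo, z_hi) / (rho * np.log(1 / rho))      # (A3)
    A = (dE1(2 * z_lo, 2 * z_hi) - rho * dE1(z_lo, z_hi)) / (2 * rho**2 * np.log(1 / rho))  # (A6)
    val = F(np.log(1 / eps))
    assert F1 - A <= val <= F1                      # (A5) and (A3)
    print(f"eps={eps:.0e}: {F1 - A:8.3f} <= F = {val:8.3f} <= {F1:8.3f}")
```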

Appendix B: Proof of Theorem 2

Let \(\tilde{\varvec{\delta }}^{(k)} = \varvec{Q}^{-1} \varvec{\delta }^{(k)}\) be the transformed error vector. Substituting \(\mathcal {T}(\varvec{\delta }^{(k)}) = \varvec{Q} \varvec{\Lambda }\varvec{Q}^{-1} \varvec{\delta }^{(k)}\) into (2) and then left-multiplying both sides by \(\varvec{Q}^{-1}\), we obtain

$$\begin{aligned} \tilde{\varvec{\delta }}^{(k+1)} = \varvec{\Lambda }\tilde{\varvec{\delta }}^{(k)} + \tilde{\varvec{q}}(\tilde{\varvec{\delta }}^{(k)}) , \end{aligned}$$
(B1)

where \(\tilde{\varvec{q}}(\tilde{\varvec{\delta }}^{(k)}) = \varvec{Q}^{-1} \varvec{q}(\varvec{Q} \tilde{\varvec{\delta }}^{(k)})\) satisfies \(\Vert \tilde{\varvec{q}}(\tilde{\varvec{\delta }}^{(k)})\Vert \le q \Vert \varvec{Q}^{-1}\Vert _2 \Vert \varvec{Q}\Vert _2^2 \Vert \tilde{\varvec{\delta }}^{(k)}\Vert ^2\). Taking the norm of both sides of (B1) and using the triangle inequality yield

$$\begin{aligned} \Vert \tilde{\varvec{\delta }}^{(k+1)}\Vert&\le \Vert \varvec{\Lambda }\tilde{\varvec{\delta }}^{(k)}\Vert + \Vert \tilde{\varvec{q}}(\tilde{\varvec{\delta }}^{(k)})\Vert \\&\le \Vert \varvec{\Lambda }\Vert _2 \Vert \tilde{\varvec{\delta }}^{(k)}\Vert + q \Vert \varvec{Q}^{-1}\Vert _2 \Vert \varvec{Q}\Vert _2^2 \Vert \tilde{\varvec{\delta }}^{(k)}\Vert ^2 . \end{aligned}$$

Since \(\Vert \varvec{\Lambda }\Vert _2 = \rho (\mathcal {T})\), the last inequality can be rewritten compactly as

$$\begin{aligned} \Vert \tilde{\varvec{\delta }}^{(k+1)}\Vert \le \rho \Vert \tilde{\varvec{\delta }}^{(k)}\Vert + \tilde{q} \Vert \tilde{\varvec{\delta }}^{(k)}\Vert ^2 , \end{aligned}$$
(B2)

where \(\rho =\rho (\mathcal {T})\) and \(\tilde{q} = q \Vert \varvec{Q}^{-1}\Vert _2 \Vert \varvec{Q}\Vert _2^2\).
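
A small randomized check of (B2), assuming the error recursion (2) takes the form \(\varvec{\delta }^{(k+1)} = \mathcal {T}(\varvec{\delta }^{(k)}) + \varvec{q}(\varvec{\delta }^{(k)})\) with a diagonalizable linear part (consistent with (B1)); the eigenbasis and the quadratic perturbation below are hypothetical choices made only to satisfy \(\Vert \varvec{q}(\varvec{\delta })\Vert \le q \Vert \varvec{\delta }\Vert ^2\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
Q = rng.standard_normal((n, n)) + 2 * np.eye(n)     # hypothetical well-conditioned eigenbasis
Lam = np.diag(rng.uniform(0.1, 0.9, n))             # eigenvalues with |lambda_i| < 1
Qinv = np.linalg.inv(Q)
T = Q @ Lam @ Qinv

q = 0.3
pert = lambda d: q * np.linalg.norm(d) * d          # hypothetical q(.), satisfies ||q(d)|| <= q ||d||^2
rho = np.max(np.abs(np.diag(Lam)))                  # spectral radius of T
q_tilde = q * np.linalg.norm(Qinv, 2) * np.linalg.norm(Q, 2) ** 2

d = rng.standard_normal(n) * 1e-3                   # delta^(0)
for _ in range(50):
    d_next = T @ d + pert(d)                        # assumed form of the error recursion (2)
    a = np.linalg.norm(Qinv @ d)                    # ||tilde delta^(k)||
    assert np.linalg.norm(Qinv @ d_next) <= rho * a + q_tilde * a**2 + 1e-12   # (B2)
    d = d_next
print("one-step bound (B2) verified along the trajectory")
```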

To analyze the convergence of \(\{\Vert \tilde{\varvec{\delta }}^{(k)}\Vert \}_{k=0}^\infty\), let us consider a surrogate sequence \(\{a_k\}_{k=0}^\infty \subset {\mathbb R}\) defined by \(a_{k+1} = \rho a_k + \tilde{q} a_k^2\) with \(a_0=\Vert \tilde{\varvec{\delta }}^{(0)}\Vert\). We show that \(\{a_k\}_{k=0}^\infty\) upper-bounds \(\{\Vert \tilde{\varvec{\delta }}^{(k)}\Vert \}_{k=0}^\infty\), i.e.,

$$\begin{aligned} \Vert \tilde{\varvec{\delta }}^{(k)}\Vert \le a_k \quad \forall k \in \mathbb {N} . \end{aligned}$$
(B3)

The base case when \(k=0\) holds trivially as \(a_0=\Vert \tilde{\varvec{\delta }}^{(0)}\Vert\). In the induction step, given \(\Vert \tilde{\varvec{\delta }}^{(k)}\Vert \le a_k\) for some integer \(k \ge 0\), we have

$$\begin{aligned} \Vert \tilde{\varvec{\delta }}^{(k+1)}\Vert&\le \rho \Vert \tilde{\varvec{\delta }}^{(k)}\Vert + \tilde{q} \Vert \tilde{\varvec{\delta }}^{(k)}\Vert ^2 \le \rho a_k + \tilde{q} a_k^2 = a_{k+1} . \end{aligned}$$

By the principle of induction, (B3) holds for all \(k \in \mathbb {N}\). Assume for now that \(a_0 = \Vert \tilde{\varvec{\delta }}^{(0)}\Vert < (1-\rho )/\tilde{q}\); then, applying Theorem 1 with \(\tau = \tilde{q} a_0/(1-\rho )\) yields \(a_k \le \tilde{\epsilon } a_0\) for any \(\tilde{\epsilon }>0\) and any integer \(k \ge {\log (1/\tilde{\epsilon })}/{\log (1/\rho )} + c(\rho ,\tau )\). Using (B3) and setting \(\tilde{\epsilon } = \epsilon /\kappa (\varvec{Q})\), we further have \(\Vert \tilde{\varvec{\delta }}^{(k)}\Vert \le a_k \le \tilde{\epsilon } a_0 = \epsilon \Vert \tilde{\varvec{\delta }}^{(0)}\Vert / \kappa (\varvec{Q})\) for all

$$\begin{aligned} k \ge \frac{\log (1/\epsilon )+\log (\kappa (\varvec{Q}))}{\log (1/\rho )} + c\Bigl ( \rho ,\frac{\tilde{q} \Vert \tilde{\varvec{\delta }}^{(0)}\Vert }{1-\rho } \Bigr ) . \end{aligned}$$
(B4)
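
The induction (B3) can likewise be traced numerically by running the surrogate recursion alongside the actual iterates; a sketch under the same hypothetical setup as above (the initial scale is chosen small enough that the assumed condition \(a_0 < (1-\rho )/\tilde{q}\) holds for this seed):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
Q = rng.standard_normal((n, n)) + 2 * np.eye(n)     # same hypothetical setup as above
Lam = np.diag(rng.uniform(0.1, 0.9, n))
Qinv = np.linalg.inv(Q)
T = Q @ Lam @ Qinv

q = 0.3
rho = np.max(np.abs(np.diag(Lam)))
q_tilde = q * np.linalg.norm(Qinv, 2) * np.linalg.norm(Q, 2) ** 2

d = rng.standard_normal(n) * 1e-4                   # delta^(0), small for this seed
a = np.linalg.norm(Qinv @ d)                        # a_0 = ||tilde delta^(0)||
assert a < (1 - rho) / q_tilde                      # initial condition for applying Theorem 1

for k in range(100):
    d = T @ d + q * np.linalg.norm(d) * d           # delta^(k+1), quadratic perturbation as before
    a = rho * a + q_tilde * a**2                    # surrogate a_{k+1} = rho a_k + q~ a_k^2
    assert np.linalg.norm(Qinv @ d) <= a * (1 + 1e-12)   # the domination (B3)
print("surrogate sequence dominates ||tilde delta^(k)|| over 100 iterations")
```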

Now, it remains to prove that (i) the accuracy \(\Vert \tilde{\varvec{\delta }}^{(k)}\Vert \le \tilde{\epsilon } \Vert \tilde{\varvec{\delta }}^{(0)}\Vert\) on the transformed error vector is sufficient for the accuracy \(\Vert {\varvec{\delta }}^{(k)}\Vert \le \epsilon \Vert {\varvec{\delta }}^{(0)}\Vert\) on the original error vector; and (ii) the initial condition \(\Vert {\varvec{\delta }}^{(0)}\Vert < (1-\rho )/(q \kappa (\varvec{Q})^2)\) is sufficient for \(\Vert \tilde{\varvec{\delta }}^{(0)}\Vert < (1-\rho )/\tilde{q}\). To prove (i), using \(\Vert \tilde{\varvec{\delta }}^{(k)}\Vert \le \epsilon \Vert \tilde{\varvec{\delta }}^{(0)}\Vert /\kappa (\varvec{Q})\), we have

$$\begin{aligned} \Vert \varvec{\delta }^{(k)}\Vert = \Vert \varvec{Q} \tilde{\varvec{\delta }}^{(k)}\Vert&\le \Vert \varvec{Q}\Vert _2 \Vert \tilde{\varvec{\delta }}^{(k)}\Vert \\&\le \Vert \varvec{Q}\Vert _2 \frac{\epsilon }{\Vert \varvec{Q}\Vert _2 \Vert \varvec{Q}^{-1}\Vert _2} \Vert \tilde{\varvec{\delta }}^{(0)}\Vert = \frac{\epsilon }{\Vert \varvec{Q}^{-1}\Vert _2} \Vert \tilde{\varvec{\delta }}^{(0)}\Vert \le \epsilon \Vert \varvec{\delta }^{(0)}\Vert , \end{aligned}$$

where the last inequality stems from \(\Vert \tilde{\varvec{\delta }}^{(0)}\Vert = \Vert \varvec{Q}^{-1} {\varvec{\delta }}^{(0)}\Vert \le \Vert \varvec{Q}^{-1}\Vert _2 \Vert \varvec{\delta }^{(0)}\Vert\). To prove (ii), we use a similar derivation:

$$\begin{aligned} \Vert \tilde{\varvec{\delta }}^{(0)}\Vert \le \Vert \varvec{Q}^{-1}\Vert _2 \Vert \varvec{\delta }^{(0)}\Vert < \Vert \varvec{Q}^{-1}\Vert _2 \frac{1-\rho }{q \kappa (\varvec{Q})^2} = \frac{1-\rho }{\tilde{q}} . \end{aligned}$$

Finally, the case in which \(\varvec{T}\) is symmetric follows from the fact that \(\varvec{Q}\) can be chosen orthogonal, i.e., \(\varvec{Q}^{-1} = \varvec{Q}^T\) and \(\kappa (\varvec{Q})=1\). Substituting these values into (10) and using the orthogonal invariance of the Euclidean norm, we obtain the simplified version in (11). This completes our proof of Theorem 2.


About this article


Cite this article

Vu, T., Raich, R. A closed-form bound on the asymptotic linear convergence of iterative methods via fixed point analysis. Optim Lett 17, 643–656 (2023). https://doi.org/10.1007/s11590-022-01893-7
