1 Introduction

Global optimization is one of the thriving fields of mathematical programming. As can be seen from numerous applications in electronics, machine learning, engineering, optimal decision making, finance, etc. (see, e.g., [1, 2, 7,8,9,10, 20,21,22, 24, 28, 39, 40, 49,50,51] and references given therein), problems of this kind are often characterized by the presence of numerous local minima and maxima and by the necessity to find among them the absolutely best, in other words global, solution. In local optimization, univariate problems lost their importance decades ago and nowadays serve only as illustrative examples in university optimization courses. In contrast, in global optimization, where the objective functions can be strongly multiextremal, thus making local optimization methods inapplicable, univariate problems are still a very active research area. This happens due to at least the following three reasons. First, global optimization is significantly younger than local optimization and there still exists a lot of room for improvement and new ideas. Second, global optimization problems are extremely difficult even in the one-dimensional case and there exists a huge number of applications (it is sufficient to mention signal processing) where problems of this kind arise (see, e.g., [4, 5, 8, 15, 30, 31, 34, 36, 39, 40, 42,43,44,45,46]). The third reason is that one-dimensional schemes are broadly used for constructing multi-dimensional global optimization methods both for a single objective (see, e.g., [20, 22, 26, 28, 36, 42, 43, 48]) and in the multiobjective case (see, e.g., [25]). Therefore, univariate global optimization can be viewed as a good training ground for the subsequent development of multi-dimensional algorithms.

The univariate global optimization problem can be stated as follows

$$\begin{aligned} f(x) \rightarrow \min , \text { s.t. } x \in [a,b], \end{aligned}$$
(1)

where the objective function f(x) can have a large number of local minima and maxima and the evaluation of f(x) at each point is a time-consuming operation. The goal is to find a global minimizer \(x^* \in [a,b]\) and the value \(f(x^*)\) such that

$$\begin{aligned} f(x^*) \le f(x), \quad x \in [a,b]. \end{aligned}$$
(2)

Hereinafter we assume that f(x) is a continuous function over [a, b]. Thus, the point \(x^*\) in (2) always exists.

In many cases, finding the exact solution \(f(x^*)\) either analytically or by numerical algorithms is impractical (see, e.g., [7, 8, 20, 26, 28, 42, 43, 48]). Thus, numerical methods are often aimed at finding approximate solutions. In this paper, we consider algorithms that are guaranteed to find an \(\varepsilon \)-optimal solution \(x^{(\varepsilon )}\) satisfying the following condition

$$\begin{aligned} f(x^{(\varepsilon )}) \le f(x^*) + \varepsilon . \end{aligned}$$
(3)

One of the key methodologies developed to deal with global optimization problems is Lipschitz global optimization (see, e.g., [12,13,14, 16,17,18, 20, 27,28,29, 47, 48]). It uses the natural assumption that the function f(x) from (1) has bounded slopes, in other words, that it satisfies the Lipschitz condition (to be introduced formally shortly). An important subclass in Lipschitz global optimization consists of functions whose first derivative satisfies the Lipschitz condition (see [3, 5, 8, 11, 18, 19, 23, 35, 48], etc.). The importance of methods using derivatives has increased significantly after the introduction of the Infinity Computer (see [38]), which allows one to compute exact derivatives numerically (i.e., neither analytically nor symbolically) and to use them in global optimization algorithms (see [37, 41, 46]).

Starting from the 1990s, algorithms using piece-wise quadratic minorants have been proposed for solving this problem. Breiman and Cutler (see [3]) and Gergel (see [11]) introduced methods constructing non-smooth minorants that are adaptively improved during optimization. Then, exploiting the differentiability of the objective function f(x) over the search region, methods constructing smooth minorants were introduced in [23, 35, 46]; these minorants are closer to the objective function f(x) than the non-smooth ones, providing a significant acceleration in comparison with the algorithms from [3, 11].

In the present paper, a further substantial improvement is proposed. In order to explain its essence we need the following two definitions.

Definition 1

A function \(h(x), x \in [a,b],\) is called Lipschitz continuous on the interval [a, b] if there is a non-negative finite real constant L such that the property

$$\begin{aligned} |h(x_1) - h(x_2)| \le L |x_1 - x_2| \end{aligned}$$
(4)

holds for any \(x_1,x_2 \in [a,b]\). Any such L is called a Lipschitz constant.

The property (4) bounds the absolute value of the function variation. In the present paper, we use a more accurate notion of interval Lipschitz continuity that provides both lower and upper bounds for slopes of h(x).

Definition 2

A function \(h(x), x \in [a,b],\) is called interval Lipschitz continuous over the interval [a, b] if there exist real numbers \(\alpha , \beta \), \(\alpha \le \beta \), such that

$$\begin{aligned} \alpha (x_2 - x_1) \le h(x_2) - h(x_1) \le \beta (x_2 - x_1) \end{aligned}$$
(5)

for any \(x_1,x_2 \in [a,b]\), \(x_1 \le x_2\). Any interval \([\alpha , \beta ]\) satisfying (5) is called a Lipschitz interval.

Obviously, a Lipschitz continuous function with a constant L is interval Lipschitz continuous with the interval \([-L,L]\). Vice-versa, an interval Lipschitz continuous function with a Lipschitz interval \([\alpha ,\beta ]\) is Lipschitz continuous with the Lipschitz constant \(L = \max \{|\alpha |, |\beta | \}\). It is easy to show that if a function is differentiable and its derivative values belong to an interval \([\alpha , \beta ]\) then it is interval Lipschitz continuous with this interval. In what follows we suppose that the constants \(\alpha \) and \(\beta \) are known. It should be stressed that in this respect we follow the tradition existing in Lipschitz global optimization w.r.t. the knowledge of L (see, e.g., [3, 11, 16, 20, 27,28,29, 35]). In practice, both L and the bounds \(\alpha \), \(\beta \) can be estimated using, for example, interval analysis (see, e.g., [6, 32]). There exist also techniques allowing one to accelerate the search by using adaptive estimates of L (see, e.g., [18, 19, 23, 35, 46] and references therein). These aspects are beyond the scope of the present paper and will be considered in further investigations.

In this paper, we focus on the problem of constructing smooth estimators (minorants and majorants) for a differentiable univariate function f(x) assuming that its first derivative is interval Lipschitz continuous over [a, b] with the Lipschitz interval \([\alpha , \beta ]\). Formally, this can be written as follows:

$$\begin{aligned} \alpha (x_2 - x_1) \le f'(x_2) - f'(x_1) \le \beta (x_2 - x_1), \,\,\, x_1,x_2 \in [a,b],\,\,\, x_1 \le x_2. \end{aligned}$$
(6)

Clearly, construction of such estimators has its own intrinsic importance since they can be applied in different contexts. Hereinafter, it will be shown how such estimators can be constructed and used to solve global optimization problems.

In order to illustrate the interval Lipschitz continuity property, let us consider Fig. 1a, which presents an example of a piecewise linear function h(x) that is interval Lipschitz continuous with the Lipschitz interval \([-3, 1]\). Figure 1b shows a function f(x) such that \(f'(x) = h(x)\). The function in Fig. 1b satisfies the property (6) with \(\alpha = -3, \beta = 1\). In this figure, the roots of h(x) are the extremum points of f(x). These points are marked in blue on both plots.

Fig. 1

An interval Lipschitz continuous function (a) and a function with an interval Lipschitz derivative (b)

As was already mentioned, in [35], a smooth piece-wise quadratic minorant for functions with Lipschitzian first derivatives has been proposed. In the present paper, we show that this minorant can be improved significantly if the Lipschitz property for the first derivative is replaced with the interval Lipschitz property (6).

To illustrate this fact, which will be proved hereinafter, let us consider the function \(f(x) = 3 \sin (x) + x^2\), \(x \in [-4,4]\). The Lipschitz interval computed by interval arithmetic (see [32]) as the interval bounds for its second derivative is \([-1, 5]\). Thus, the respective Lipschitz constant is \(L = 5\). Figure 2a shows the lower and upper estimators obtained with the technique from [35] that relies solely on a Lipschitz constant. Figure 2b depicts the estimators constructed by the method presented in the rest of the paper that uses the Lipschitz interval. As one can see, using the Lipschitz interval yields tighter estimators than using the Lipschitz constant.

Fig. 2

Lower (blue) and upper (green) estimators obtained using Lipschitz (a) and interval Lipschitz (b) properties of the first derivative of the function shown in red. (Color figure online)

2 Preliminary notions and facts

Lemma 1

Let h(x) be an interval Lipschitz continuous function defined on an interval [a, b] with a Lipschitz interval \([\alpha , \beta ]\), see (5). If

$$\begin{aligned} h(b) - h(a) = \alpha (b - a), \end{aligned}$$
(7)

then h(x) is interval Lipschitz continuous with the Lipschitz interval \([\alpha , \alpha ]\) and \(h(x) = h(a) + \alpha (x - a)\). Similarly, if

$$\begin{aligned} h(b) - h(a) = \beta (b - a), \end{aligned}$$
(8)

then h(x) is interval Lipschitz continuous with the Lipschitz interval \([\beta , \beta ]\) and \(h(x) = h(a) + \beta (x - a)\).

Proof

Notice that according to (5), we have

$$\begin{aligned} \alpha (b - x) \le h(b) - h(x),\quad x\in [a,b]. \end{aligned}$$

Due to this fact and (7), it follows that

$$\begin{aligned} h(x) \le h(b) - \alpha (b - x) = h(a) + \alpha (b - a) - \alpha (b- x) = h(a) + \alpha (x - a), \quad x \in [a,b]. \end{aligned}$$

On the other hand, according to (5), we get

$$\begin{aligned} h(x) \ge h(a) + \alpha (x - a), \quad x \in [a,b]. \end{aligned}$$

As a result, we obtain that,

$$\begin{aligned} h(x) = h(a) + \alpha (x - a), \quad x \in [a,b]. \end{aligned}$$
(9)

Since h(x) in (9) is linear with slope \(\alpha \), h(x) is interval Lipschitz continuous with the Lipschitz interval \([\alpha , \alpha ]\). The case (8) is treated by complete analogy. \(\square \)

Lemma 2

Let h(x) be an interval Lipschitz continuous function defined on an interval [a, b] with a Lipschitz interval \([\alpha , \beta ]\), see (5). Then the following inequalities hold

$$\begin{aligned} \underline{h}(x) \le h(x) \le \overline{h}(x), \quad x \in [a,b], \end{aligned}$$
(10)

where

$$\begin{aligned} \begin{aligned} \underline{h}(x)&= \max (h(a) + \alpha (x - a), h(b) + \beta (x - b)),\\ \overline{h}(x)&= \min (h(a) + \beta (x - a), h(b) + \alpha (x - b)). \end{aligned} \end{aligned}$$

i.e., \(\underline{h}(x)\) and \(\overline{h}(x)\) are an underestimator and an overestimator for h(x).

Proof

From (5) we get

$$\begin{aligned} \alpha (x - a) \le h(x) - h(a) \le \beta (x - a), \quad x \in [a,b]. \end{aligned}$$

Rewriting these inequalities we obtain

$$\begin{aligned} h(a) + \alpha (x - a) \le h(x) \le h(a) + \beta (x - a), \quad x \in [a,b]. \end{aligned}$$
(11)

In the same way we obtain

$$\begin{aligned} \alpha (b - x) \le h(b) - h(x) \le \beta (b - x), \quad x \in [a,b]. \end{aligned}$$

and

$$\begin{aligned} h(b) + \beta (x - b) \le h(x) \le h(b) + \alpha (x - b), \quad x \in [a,b]. \end{aligned}$$
(12)

The inequality (10) is a direct consequence of (11), (12). \(\square \)

Let us introduce a more compact representation of the functions \(\underline{h}(x)\) and \(\overline{h}(x)\), provided by the following lemma.

Lemma 3

The underestimator \(\underline{h}(x)\) can be rewritten in the following equivalent form

$$\begin{aligned} \underline{h}(x) = {\left\{ \begin{array}{ll} h(a) + \alpha (x - a), \quad x \in [a, s),\\ h(b) + \beta (x - b), \quad x \in [s, b], \end{array}\right. } \end{aligned}$$
(13)

where

$$\begin{aligned} s = \frac{h(a) - h(b) + \beta b - \alpha a}{\beta - \alpha }. \end{aligned}$$
(14)

The overestimator \(\overline{h}(x)\) can be rewritten in this equivalent form

$$\begin{aligned} \overline{h}(x) = {\left\{ \begin{array}{ll} h(a) + \beta (x - a), \quad x \in [a, t),\\ h(b) + \alpha (x - b), \quad x \in [t, b], \end{array}\right. } \end{aligned}$$
(15)

where

$$\begin{aligned} t = \frac{h(b) - h(a) + \beta a - \alpha b}{\beta - \alpha }. \end{aligned}$$
(16)

Proof

The value s is the abscissa of the intersection of lines \(h(a) + \alpha (x - a)\) and \(h(b) + \beta (x - b)\). Therefore it is a root of the algebraic equation

$$\begin{aligned} h(a) + \alpha (x - a) = h(b) + \beta (x - b). \end{aligned}$$

This root is computed explicitly as follows

$$\begin{aligned} s = \frac{h(a) - h(b) + \beta b - \alpha a}{\beta - \alpha }. \end{aligned}$$

Thus, the formula (13) has been proven. The formula (15) can be proven in a similar way. \(\square \)

Lemmas 2 and 3 are illustrated in Fig. 3 for the function \(h(x) = \sin (x) + 0.5x, x \in [-\pi , \pi ]\). Observe that \(h'(x) = \cos (x) + 0.5\). Since the range of \(\cos (x)\) for \(x \in [-\pi , \pi ]\) is \([-1, 1]\), the range of \(h'(x)\) is \([-0.5, 1.5]\), i.e., \(\alpha = -0.5, \beta = 1.5\). In Fig. 3, the points s and t are also shown.
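
For readers who prefer a computational view, the following minimal sketch (Python, not part of the paper) evaluates the breakpoints s and t from (14), (16) and the piecewise linear estimators (13), (15) for the example above; the helper name linear_estimators and the grid-based check are only illustrative choices.

```python
import math

# A minimal sketch (not from the paper) of the estimators of Lemmas 2 and 3 for the
# example h(x) = sin(x) + 0.5*x on [-pi, pi] with the Lipschitz interval [-0.5, 1.5].

def linear_estimators(h, a, b, alpha, beta):
    """Return (h_lower, h_upper, s, t) according to (13)-(16)."""
    ha, hb = h(a), h(b)
    s = (ha - hb + beta * b - alpha * a) / (beta - alpha)  # breakpoint of the minorant, (14)
    t = (hb - ha + beta * a - alpha * b) / (beta - alpha)  # breakpoint of the majorant, (16)

    def h_lower(x):  # piecewise linear underestimator (13)
        return ha + alpha * (x - a) if x < s else hb + beta * (x - b)

    def h_upper(x):  # piecewise linear overestimator (15)
        return ha + beta * (x - a) if x < t else hb + alpha * (x - b)

    return h_lower, h_upper, s, t

if __name__ == "__main__":
    a, b, alpha, beta = -math.pi, math.pi, -0.5, 1.5
    h = lambda x: math.sin(x) + 0.5 * x
    lo, up, s, t = linear_estimators(h, a, b, alpha, beta)
    for i in range(101):                       # sanity check of (10) on a grid
        x = a + (b - a) * i / 100
        assert lo(x) <= h(x) + 1e-12 and h(x) <= up(x) + 1e-12
    print(f"s = {s:.4f}, t = {t:.4f}")
```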

Fig. 3

Lower (blue) and upper (green) estimators according to Lemma 2. (Color figure online)

3 The second order estimators

In what follows we construct a piecewise quadratic underestimator for a function f(x) satisfying (6) starting from a step function \(\phi (x,c,d)\) depending on an argument x and parameters c, d. A function \(\psi (x,c,d)\) of the argument x is then defined as a definite integral of \(\phi (x,c,d)\) with a variable upper limit and, in its turn, a function \(\chi (x,c,d)\) of the argument x is defined as a definite integral with a variable upper limit of \(\psi (x,c,d)\).

Below we introduce the mentioned functions formally, study their properties and show that it is always possible to choose the parameters c, d in such a way that \(\chi (x, c, d)\) becomes a differentiable piecewise quadratic underestimator for f(x).

We start by considering the following step function \(\phi (x, c, d)\) of the variable x:

$$\begin{aligned} \phi (x,c,d) = {\left\{ \begin{array}{ll} \alpha , &{} x \in [a, c),\\ \beta , &{} x \in [c, d),\\ \alpha , &{} x \in [d, b], \end{array}\right. } \end{aligned}$$
(17)

where c, d are two real numbers such that \(a \le c \le d \le b\) (the proper choice of these parameters is discussed later). Let us now define functions \(\psi (x, c, d)\) and \(\chi (x, c, d)\) as follows

$$\begin{aligned} \psi (x, c, d)= & {} f'(a) + \int _{a}^{x} \phi (t, c, d) dt, \end{aligned}$$
(18)
$$\begin{aligned} \chi (x, c, d)= & {} f(a) + \int _a^x \psi (t, c, d) dt. \end{aligned}$$
(19)

In what follows we study the properties of these functions and start by considering the following two possible cases for \(\alpha \) and \(\beta \): (i) \(\alpha = \beta \), (ii) \(\alpha < \beta \).

Lemma 4

Let \(\alpha = \beta \). Then f(x) is a quadratic function and for any \(c,d \in [a,b]\), \(c \le d\), it follows

$$\begin{aligned} f(x) = \chi (x, c, d) = f(a) + f'(a)(x - a) + \frac{\alpha }{2}(x-a)^2. \end{aligned}$$
(20)

Proof

Since \(\alpha = \beta \), \(\phi (x,c,d) = \alpha \), \(x \in [a,b]\), regardless of the choice of c and d. Therefore, it follows from (18) that

$$\begin{aligned} \psi (x, c, d) = f'(a) + \alpha (x - a), \quad x\in [a,b]. \end{aligned}$$
(21)

On the other hand, from (6) it follows that

$$\begin{aligned} f'(x) = f'(a) + \alpha (x - a), \quad x\in [a,b]. \end{aligned}$$

Thus, \(f'(x) = \psi (x, c, d)\), \(x \in [a,b]\), and by definition (19) we get

$$\begin{aligned} \chi (x, c, d) = f(a) + \int _a^x \psi (t, c, d) dt = f(a) + \int _a^x f'(t) dt = f(x). \end{aligned}$$
(22)

We conclude that in the case \(\alpha = \beta \) the choice of c, d is irrelevant and the underestimator coincides with f(x), i.e., the first equality in (20) is true. After substituting (21)–(22) and integrating we obtain the second equality in (20). This observation concludes the proof. \(\square \)

From Lemma 4 it follows that for arbitrary c and d in [a, b] the function \(\chi (x, c, d)\) coincides with f(x) and thus is a trivial underestimator. The situation is different if \(\alpha < \beta \). In what follows we show how to choose c and d within [a, b] to ensure that \(\chi (x, c, d)\) bounds f(x) from below.

Let us assume now that \(\alpha < \beta \). Let \(\delta \) be a real number defined as follows

$$\begin{aligned} \delta = \frac{f'(b) - f'(a) - \alpha (b - a)}{\beta - \alpha }. \end{aligned}$$
(23)

Notice that due to (6), the inequalities

$$\begin{aligned} \alpha (b - a) \le f'(b) - f'(a) \le \beta (b - a) \end{aligned}$$
(24)

are valid. Moreover, according to Lemma 1, if at least one of the inequalities (24) degenerates to an equality then \(f'(x)\) is interval Lipschitz continuous with an interval having equal ends. This case (\(\alpha = \beta \)) was considered above. Thus both inequalities in (24) can be assumed strict and therefore

$$\begin{aligned} 0< \delta < b - a. \end{aligned}$$
(25)

Let c be an arbitrary number in \([a, b - \delta ]\). In the rest of the paper we assume that the parameter d used in (17)–(19) has the form

$$\begin{aligned} d = c + \delta . \end{aligned}$$
(26)

From (25) it immediately follows that \(d \in [a, b]\). For \(c \in [a, b - \delta ]\) we denote \(\hat{\psi }(x,c) = \psi (x, c, c + \delta )\) hereinafter.

After substituting the value (26) for d into (18) and evaluating the integrals we obtain

$$\begin{aligned} \hat{\psi }(x, c) = {\left\{ \begin{array}{ll} f'(a) + \alpha (x - a), &{} x \in [a,c),\\ f'(a) + \alpha (c - a) + \beta (x - c), &{} x \in [c, d),\\ f'(a) -\alpha a + (\beta - \alpha ) \delta + \alpha x, &{} x \in [d, b]. \end{array}\right. } \end{aligned}$$
(27)

Then, from (23) it follows that the third line in (27) can be reformulated as follows:

$$\begin{aligned} f'(a) -\alpha a + (\beta - \alpha ) \delta + \alpha x&= f'(a) - \alpha a + f'(b) - f'(a) - \alpha (b - a) + \alpha x \\&= f'(b) + \alpha (x - b). \end{aligned}$$

Thus, (27) can be rewritten in a more compact form

$$\begin{aligned} \hat{\psi }(x, c) = {\left\{ \begin{array}{ll} f'(a) + \alpha (x - a), &{} x \in [a,c),\\ f'(a) + \alpha (c - a) + \beta (x - c), &{} x \in [c, d),\\ f'(b) + \alpha (x - b), &{} x \in [d, b]. \end{array}\right. } \end{aligned}$$
(28)

Lemma 5

The function \(\hat{\psi }(x, c)\) is continuous and piecewise linear in x, and the following equality holds

$$\begin{aligned} \hat{\psi }(b, c) = f'(b). \end{aligned}$$
(29)

Proof

From (27) it follows that the function \(\hat{\psi }(x, c)\) is piecewise linear by definition. Thus, to prove the continuity, it is sufficient to examine the values of the function \( \hat{\psi }(x, c)\) at the points c and d. After substituting \(x = c\), the 1st and 2nd expressions in (27) take the same value \(f'(a) + \alpha (c - a)\), which implies continuity at that point.

After assuming \(x = d\) and substituting (26), the 2nd expression in (27) becomes

$$\begin{aligned} f'(a) + \alpha (c - a) + \beta (d - c) = f'(a) + \alpha (c - a) + \beta \delta . \end{aligned}$$
(30)

Then, after substituting \(x = d\) to the 3rd expression in (27) we obtain

$$\begin{aligned} f'(a) - \alpha a + (\beta - \alpha ) \delta + \alpha d = f'(a) - \alpha a + (\beta - \alpha ) \delta + \alpha (c + \delta ) = f'(a) + \alpha (c - a) + \beta \delta . \end{aligned}$$
(31)

The rightmost parts of (30) and (31) coincide, which means the continuity at the point d.

Finally, equality (29) becomes evident after a direct substitution \(x=b\) to the last equation in (28). \(\square \)

The following lemma provides concise representations of \(\hat{\psi }(x, c)\) for two specific choices of its second argument. These representations are used in the rest of the paper.

Lemma 6

The following two expressions are valid for \(x \in [a,b]\)

$$\begin{aligned} \hat{\psi }(x, b-\delta )&= {\left\{ \begin{array}{ll} f'(a) + \alpha (x - a), &{} x \in [a, b - \delta ),\\ f'(b) + \beta (x - b), &{} x \in [b - \delta , b], \end{array}\right. }\end{aligned}$$
(32)
$$\begin{aligned} \hat{\psi }(x, a)&= {\left\{ \begin{array}{ll} f'(a) + \beta (x - a), &{} x \in [a, a + \delta ),\\ f'(b) + \alpha (x - b), &{} x \in [a + \delta , b]. \end{array}\right. } \end{aligned}$$
(33)

Proof

Let us prove the formula (32). After substituting \(c = b-\delta \) into (26) we get \(d = b\). Thus, substituting \(c = b-\delta \) into (28) yields

$$\begin{aligned} \hat{\psi }(x, b-\delta ) = {\left\{ \begin{array}{ll} f'(a) + \alpha (x - a), &{} x \in [a, b - \delta ),\\ f'(a) + \alpha (b - \delta - a) + \beta (x - b + \delta ), &{} x \in [b - \delta , b),\\ f'(b), &{} x = b. \end{array}\right. } \end{aligned}$$
(34)

By substituting expression (23) for \(\delta \) the second case in (34) can be simplified

$$\begin{aligned} \begin{aligned} f'(a) + \alpha (b - \delta - a) + \beta (x - b + \delta ) = f'(a) + \alpha (b - a) + \delta (\beta - \alpha ) + \beta (x - b)\\ = f'(a) + \alpha (b - a) + f'(b) - f'(a) - \alpha (b - a) + \beta (x - b) = f'(b) + \beta (x - b). \end{aligned} \end{aligned}$$

Thus, (34) can be rewritten in a more compact form (32).

Now, let us prove (33). After substituting \(c = a\) and \(d = c + \delta \) into (28) the half-open interval [a, c) becomes empty and therefore we get

$$\begin{aligned} \hat{\psi }(x, a) = {\left\{ \begin{array}{ll} f'(a) + \beta (x - a), &{} x \in [a, a + \delta ),\\ f'(b) + \alpha (x - b), &{} x \in [a + \delta , b]. \end{array}\right. } \end{aligned}$$

This completes the proof. \(\square \)

The following two Lemmas 7 and 8 establish some properties of the function \(\hat{\psi }(x,c)\) that will be used later to prove Proposition 1. Let us start with the first lemma, illustrated in Fig. 4.

Lemma 7

Inequalities

$$\begin{aligned} \hat{\psi }(x, b - \delta )\le f'(x) \le \hat{\psi }(x,a) \end{aligned}$$
(35)

are valid for \(x \in [a,b]\).

Proof

Let us prove the first inequality in (35). Notice that

$$\begin{aligned} \begin{aligned} b - \delta&= b - \frac{f'(b) - f'(a) - \alpha (b - a)}{\beta - \alpha } = \frac{b (\beta - \alpha ) - f'(b) + f'(a) + \alpha (b -a)}{\beta - \alpha }\\&= \frac{f'(a) - f'(b) + b \beta - b \alpha + \alpha b - \alpha a}{\beta - \alpha } = \frac{f'(a) - f'(b) + \beta b - \alpha a}{\beta - \alpha }. \end{aligned} \end{aligned}$$

Recalling Lemma 3, let us compare (13) with (32). Observe that s in (14) is equal to \(b - \delta \) under the assumption \(h(x) = f'(x)\). Thus \(\underline{h}(x)\) coincides with \(\hat{\psi }(x, b-\delta )\). As a result, according to Lemmas 2 and 3, \(\hat{\psi }(x, b-\delta )\) is an underestimator for \(f'(x)\), i.e.,

$$\begin{aligned} \hat{\psi }(x, b-\delta ) \le f'(x), \quad x \in [a,b]. \end{aligned}$$

The first of inequalities (35) has been proven.

Now, let us prove the second inequality in (35). Observe that

$$\begin{aligned} \begin{aligned} a + \delta&= a + \frac{f'(b) - f'(a) - \alpha (b - a)}{\beta - \alpha } = \frac{f'(b) - f'(a) - \alpha (b - a) + a (\beta - \alpha )}{\beta - \alpha }\\&= \frac{f'(b) - f'(a) - \alpha b + \alpha a + \beta a - \alpha a}{\beta - \alpha } = \frac{f'(b) - f'(a) + \beta a - \alpha b}{\beta - \alpha }. \end{aligned} \end{aligned}$$

Thus, due to (15) and (16) the point \(a+\delta \) is equal to t and \(\overline{h}(x)\) coincides with \(\hat{\psi }(x, a)\) under assumption \(h(x) = f'(x)\). As a result, due to Lemmas 2 and 3, \(\hat{\psi }(x,a)\) is an overestimator for \(f'(x)\):

$$\begin{aligned} f'(x) \le \hat{\psi }(x,a), \quad x \in [a,b]. \end{aligned}$$

This completes the proof. \(\square \)

Fig. 4

Functions \(\hat{\psi }(x, b - \delta )\) and \(\hat{\psi }(x,a)\) are lower and upper estimators for \(f'(x)\), respectively

Lemma 8

The function \(\hat{\psi }(x, c)\) is monotonically non-increasing and Lipschitz continuous in its second argument c over the interval \([a, b - \delta ]\) with the Lipschitz constant equal to \(\beta - \alpha \), i.e.,

$$\begin{aligned} \begin{aligned} 0&\le \hat{\psi }(x, c_1) - \hat{\psi }(x, c_2) \le (\beta -\alpha )(c_2 - c_1), \\&\quad x \in [a, b], \quad c_1, c_2 \in [a, b - \delta ],\quad c_1 \le c_2, \end{aligned} \end{aligned}$$
(36)

where \(\alpha , \beta \) are from (17).

Proof

This Lemma is proved in “Appendix A”. \(\square \)

Let us introduce the notation \(\hat{\chi }(x, c) = \chi (x, c, d)\), where d is computed according to (26). Then, it follows from (19) that

$$\begin{aligned} \hat{\chi }(x, c) = f(a) + \int _a^x \hat{\psi }(t, c) dt. \end{aligned}$$
(37)

Let us now define the function

$$\begin{aligned} \xi (c) = \hat{\chi }(b, c) \end{aligned}$$
(38)

and prove the following corollary.

Corollary 1

Function \(\xi (c)\) is a Lipschitz continuous non-increasing function of c, \(c \in [a, b - \delta ]\), with the Lipschitz constant equal to \((b - a)(\beta - \alpha )\).

Proof

Let \(c_1\), \(c_2\) be two reals such that \(a \le c_1 < c_2 \le b - \delta \). Due to (37), (38), and (36) we have

$$\begin{aligned} \xi (c_1) = f(a) + \int _a^b \hat{\psi }(t, c_1) dt \ge f(a) + \int _a^b \hat{\psi }(t, c_2) dt = \xi (c_2). \end{aligned}$$

Thus, the monotonicity has been proven. The Lipschitz continuity follows from (36) and the following inequalities:

$$\begin{aligned} \begin{aligned} \xi (c_1) - \xi (c_2)&= \int _a^b \left( \hat{\psi }(t, c_1) - \hat{\psi }(t, c_2) \right) dt \le \int _a^b (\beta - \alpha )(c_2 - c_1) dt \\&= (b - a)(\beta - \alpha )(c_2 - c_1). \end{aligned} \end{aligned}$$

\(\square \)

In what follows we derive a useful explicit formula for \(\hat{\chi }(x,c)\), given in the following lemma.

Lemma 9

For the function \(\hat{\chi }(x,c)\) computed according to (37) the following expression holds

$$\begin{aligned} \hat{\chi }(x,c) = {\left\{ \begin{array}{ll} f(a) + f'(a)(x - a) + \frac{\alpha }{2}(x - a)^2, &{} x \in [a,c),\\ \\ f(a) + f'(a)(c - a) + \frac{\alpha }{2}(c - a)^2 &{}\,\\ +(f'(a) + \alpha (c - a))(x - c) + \frac{\beta }{2}(x - c)^2, &{} x \in [c, d),\\ \\ f(a) + f'(a)(c - a) + \frac{\alpha }{2}(c - a)^2 + (f'(a) + \alpha (c - a)) \delta &{}\,\\ + \frac{\beta }{2}\delta ^2 + f'(b)(x - d) + \frac{\alpha }{2}(x - b)^2 - \frac{\alpha }{2}(d - b)^2, &{} x \in [d, b]. \end{array}\right. } \end{aligned}$$
(39)

Proof

The proof is given in “Appendix B”. \(\square \)

The following proposition shows that we can choose a point \(c^*\) in the interval \([a, b-\delta ]\) in such a way that \(\xi (c^*) = f(b)\). Later this property will be used to prove that the function \(\hat{\chi }(x, c^*)\) is an underestimator for f(x) coinciding with it at the ends of the interval [a, b]. The fact that \(c^*\) satisfies the inequalities \(a \le c^* \le b - \delta \) is essential to ensure that both \(c^*\) and \(d^* = c^* + \delta \) belong to [a, b].

Proposition 1

There exists a unique point \(c^* \in [a, b - \delta ]\) such that \(\xi (c^*) = f(b)\), where

$$\begin{aligned} \begin{aligned} c^*&= \frac{\alpha (b - a)}{2(\beta - \alpha )} + \frac{ f'(a) - f'(b)}{2(\beta - \alpha )}\\&\quad + \frac{f(a) - f(b) + b f'(b) - a f'(a) + \frac{\alpha }{2}(a^ 2 - b^2)}{f'(b) - f'(a) - \alpha (b - a)}. \end{aligned} \end{aligned}$$
(40)

Proof

The proof is given in “Appendix C”. \(\square \)

Now we can define

$$\begin{aligned} d^* = c^* + \delta . \end{aligned}$$
(41)

The following Corollary gives an explicit formula for \(d^*\).

Corollary 2

The value of \(d^*\) can be calculated as follows

$$\begin{aligned} d^* = \frac{\alpha (a - b)}{2(\beta - \alpha )} + \frac{ f'(b) - f'(a)}{2(\beta - \alpha )} + \frac{f(a) - f(b) + b f'(b) - a f'(a) + \frac{\alpha }{2}(a^ 2 - b^2)}{f'(b) - f'(a) - \alpha (b - a)}. \end{aligned}$$
(42)

Proof

By substituting the expression (23) for \(\delta \) into the right-hand side of (41), and recalling (40), we obtain

$$\begin{aligned} \begin{aligned} d^*&= \frac{\alpha (b - a)}{2(\beta - \alpha )} + \frac{ f'(a) - f'(b)}{2(\beta - \alpha )} + \frac{f(a) - f(b) + b f'(b) - a f'(a) + \frac{\alpha }{2}(a^ 2 - b^2)}{f'(b) - f'(a) - \alpha (b - a)}\\&\quad + \frac{f'(b) - f'(a) - \alpha (b - a)}{\beta - \alpha }\\&= \frac{\alpha (a - b)}{2(\beta - \alpha )} + \frac{ f'(b) - f'(a)}{2(\beta - \alpha )} + \frac{f(a) - f(b) + b f'(b) - a f'(a) + \frac{\alpha }{2}(a^ 2 - b^2)}{f'(b) - f'(a) - \alpha (b - a)}. \end{aligned} \end{aligned}$$

\(\square \)
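
As a quick numerical illustration (a sketch under the stated assumptions, not the authors' code), the formulas (23), (40) and (41) can be evaluated directly; the helper name gluing_points and the use of the introductory example \(f(x) = 3 \sin (x) + x^2\) are illustrative choices.

```python
import math

# A numerical sketch (not the authors' code) of formulas (23), (40) and (41): given f,
# f' and a Lipschitz interval [alpha, beta] for f' on [a, b], compute delta, c*, d*.

def gluing_points(f, df, a, b, alpha, beta):
    """Return (delta, c_star, d_star) according to (23), (40) and (41)."""
    fa, fb, dfa, dfb = f(a), f(b), df(a), df(b)
    delta = (dfb - dfa - alpha * (b - a)) / (beta - alpha)                 # (23)
    c_star = (alpha * (b - a) + dfa - dfb) / (2 * (beta - alpha)) \
        + (fa - fb + b * dfb - a * dfa + 0.5 * alpha * (a ** 2 - b ** 2)) \
        / (dfb - dfa - alpha * (b - a))                                    # (40)
    return delta, c_star, c_star + delta                                   # (41)

if __name__ == "__main__":
    # Introductory example: f(x) = 3 sin(x) + x^2 on [-4, 4] with [alpha, beta] = [-1, 5]
    f = lambda x: 3 * math.sin(x) + x * x
    df = lambda x: 3 * math.cos(x) + 2 * x
    delta, c_star, d_star = gluing_points(f, df, -4.0, 4.0, -1.0, 5.0)
    print(delta, c_star, d_star)  # c_star should lie in [a, b - delta], d_star in [a, b]
```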

Notice that if the derivative of f(x) satisfies the Lipschitzian property with the constant L then we can safely assume \(\alpha = -L\) and \(\beta = L\). The following Corollary establishes the formulae for \(c^*\) and \(d^*\) in this case.

Corollary 3

If \(\alpha = -L\) and \(\beta = L\) then

$$\begin{aligned} \begin{aligned} c^*&= - \frac{b - a}{4} - \frac{ f'(b) - f'(a)}{4 L}\\&\quad + \frac{f(a) - f(b) + b f'(b) - a f'(a) + \frac{L}{2}(b^2 - a^ 2)}{L (b - a) + f'(b) - f'(a) }, \end{aligned} \end{aligned}$$
(43)

and

$$\begin{aligned} \begin{aligned} d^*&= \frac{b - a}{4} + \frac{ f'(b) - f'(a)}{4 L}\\&\quad + \frac{f(a) - f(b) + b f'(b) - a f'(a) + \frac{L}{2}(b^2 - a^ 2)}{L (b - a) + f'(b) - f'(a) }. \end{aligned} \end{aligned}$$
(44)

Proof

Formulae (43), (44) can be obtained by a direct substitution of values \(-L\), L instead of \(\alpha \), \(\beta \) in (40) and (42) respectively. \(\square \)

As expected, formulae (43), (44) coincide with the formulae (13), (12) from [35], respectively, that provide gluing points of three quadratic pieces of the smooth piece-wise quadratic support function for the function f(x) constructed for the case where \(f'(x)\) satisfies the Lipschitzian property with the constant L.

Let us introduce now the following two functions

$$\begin{aligned} \mu (x) = \hat{\chi }(x, c^*), \quad \nu (x) = \hat{\psi }(x, c^*). \end{aligned}$$
(45)

From (37), (38), and the equality \(\xi (c^*) = f(b)\), proved in Proposition 1, it follows that

$$\begin{aligned} \begin{aligned} \mu (x)&= f(a) + \int _a^x \nu (t) dt, \quad x \in [a, b],\\ \mu (a)&= f(a),\quad \mu (b) = f(b), \end{aligned} \end{aligned}$$
(46)

Thus, \(\nu (x)\) is the first derivative of \(\mu (x)\). Recall that, by construction, \(\nu (x)\) is piece-wise linear and \(\mu (x)\) is piece-wise quadratic.

Hereinafter we show that the function \(\mu (x)\) is an underestimator for the function f(x) and illustrate this fact by the following example, where Fig. 5 shows the function \(f(x) = 5 \sin (x - 2) + x\), its derivative \(f'(x) = 5 \cos (x - 2) + 1\), and the functions \(\nu (x), \mu (x)\) defined on the interval [0, 5].

Fig. 5

A function f(x), its derivative \(f'(x)\), the underestimator \(\mu (x)\) of f(x), and the derivative \(\nu (x)\) of \(\mu (x)\)

Theorem 1

For any differentiable function f(x) obeying (6) the function \(\mu (x)\) defined according to (46) is an underestimator for f(x) over the interval [a, b], i.e.,

$$\begin{aligned} f(x) \ge \mu (x), \quad x \in [a,b]. \end{aligned}$$
(47)

Proof

Since \(f'(x)\) is interval Lipschitz continuous on [ab], the following two inequalities follow from (6)

$$\begin{aligned} f'(a) + \alpha (x - a) \le f'(x) \le f'(b) + \alpha (x - b), \end{aligned}$$
(48)

for all \(x \in [a,b]\). Recalling (28) and (45), from (48) we get

$$\begin{aligned} \begin{aligned} f'(x)&\ge \nu (x), \quad x \in [a, c^*],\\ f'(x)&\le \nu (x), \quad x \in [d^*, b]. \end{aligned} \end{aligned}$$

Thus, for the points \(c^*\), \(d^*\) we have \(f'(c^*) - \nu (c^*) \ge 0\) and \(f'(d^*) - \nu (d^*) \le 0\). Due to continuity of the function \(f'(x) - \nu (x)\) there exists a point \(s \in [c^*, d^*]\) such that \(f'(s) - \nu (s) = 0\), i.e., \(f'(s) = \nu (s)\), see Fig. 5 for illustration.

From (28) and (45) it directly follows that

$$\begin{aligned} \nu (x) = {\left\{ \begin{array}{ll} f'(a) + \alpha (x - a), &{} x \in [a,c^*),\\ f'(a) + \alpha (c^* - a) + \beta (x - c^*), &{} x \in [c^*, d^*],\\ f'(b) + \alpha (x - b), &{} x \in (d^*, b]. \end{array}\right. } \end{aligned}$$
(49)

Due to the fact that \(s \in [c^*, d^*]\), we get

$$\begin{aligned} \nu (s) = f'(a) + \alpha (c^* - a) + \beta (s - c^*) \end{aligned}$$

from where we obtain

$$\begin{aligned} \nu (x) - \nu (s) = \beta (x - s), \quad x \in [c^*, d^*]. \end{aligned}$$

Since \(f'(s) = \nu (s)\), we have

$$\begin{aligned} \nu (x) = f'(s) + \beta (x - s), \quad x \in [c^*, d^*]. \end{aligned}$$

Thus, the expression (49) can be rewritten as

$$\begin{aligned} \nu (x) = {\left\{ \begin{array}{ll} f'(a) + \alpha (x - a), &{} x \in [a,c^*),\\ f'(s) + \beta (x - s), &{} x \in [c^*, d^*],\\ f'(b) + \alpha (x - b), &{} x \in (d^*, b]. \end{array}\right. } \end{aligned}$$

Observe that \(f'(a) + \alpha (x-a)\) and \(f'(s) + \beta (x - s)\) are two linear functions intersecting at the point \(x = c^*\). Thus, since \(\beta > \alpha \), we have

$$\begin{aligned} \begin{aligned} f'(a) + \alpha (x-a)&\ge f'(s) + \beta (x - s), \quad x \le c^*,\\ f'(a) + \alpha (x-a)&\le f'(s) + \beta (x - s), \quad x \ge c^*. \end{aligned} \end{aligned}$$

In a similar way we can prove that

$$\begin{aligned} \begin{aligned} f'(b) + \alpha (x - b)&\ge f'(s) + \beta (x - s), \quad x \le d^*,\\ f'(b) + \alpha (x - b)&\le f'(s) + \beta (x - s), \quad x \ge d^*. \end{aligned} \end{aligned}$$

Thus, the function \(\nu (x)\) can be rewritten as follows

$$\begin{aligned} \nu (x) = {\left\{ \begin{array}{ll} \max \left( f'(a) + \alpha (x - a), f'(s) + \beta (x - s)\right) , \quad x \in [a, s],\\ \min \left( f'(s) + \beta (x - s), f'(b) + \alpha (x - b)\right) , \quad x \in [s, b]. \end{array}\right. } \end{aligned}$$
(50)

By applying Lemma 2 to (50) we get

$$\begin{aligned} \begin{aligned} f'(x) \ge \nu (x), \quad x \in [a,s],\\ f'(x) \le \nu (x), \quad x \in [s,b]. \end{aligned} \end{aligned}$$
(51)

Then, the Newton-Leibniz formula and (51) allow us to write

$$\begin{aligned} f(x) = f(a) + \int _a^x f'(t) dt \ge f(a) + \int _a^x \nu (t) dt = \mu (x) \end{aligned}$$
(52)

for \(x \in [a,s]\) and

$$\begin{aligned} f(b) = f(x) + \int _x^b f'(t) dt \le f(x) + \int _x^b \nu (t) dt \end{aligned}$$

for \(x \in [s,b]\). On the other hand, by taking into account (46) we get

$$\begin{aligned} \begin{aligned} f(x)&\ge f(b) - \int _x^b \nu (t) dt = \mu (b) - \int _x^b \nu (t) dt\\&= f(a) + \int _a^b \nu (t) dt - \int _x^b \nu (t) dt = f(a) + \int _a^x \nu (t) dt = \mu (x) \end{aligned} \end{aligned}$$
(53)

for \(x \in [s,b]\).

Finally, the inequality (47) is a direct consequence of (52) and (53). \(\square \)

4 Computational formulae for estimators and the objective ranges

The cumbersome formula (39) can be used to evaluate \(\mu (x) = \hat{\chi }(x, c^*)\) at a given point \(x \in [a,b]\). The following proposition shows that \(\mu (x)\) can be written in a more compact way.

Proposition 2

The function \(\mu (x)\) defined according to (46) can be expressed as follows

$$\begin{aligned} \mu (x) = {\left\{ \begin{array}{ll} f(a) + f'(a)(x - a) + \frac{\alpha }{2}(x - a)^2, &{} x \in [a,c^*),\\ \\ f(a) + f'(a)(c^* - a) + \frac{\alpha }{2}(c^* - a)^2 &{}\,\\ +(f'(a) + \alpha (c^* - a))(x - c^*) + \frac{\beta }{2}(x - c^*)^2, &{} x \in [c^*, d^*),\\ \\ f(b) + f'(b)(x - b) + \frac{\alpha }{2}(x - b)^2, &{} x \in [d^*, b], \end{array}\right. } \end{aligned}$$
(54)

and, in its turn, the piecewise linear derivative \(\nu (x)\) of \(\mu (x)\) can be computed as follows

$$\begin{aligned} \nu (x) = {\left\{ \begin{array}{ll} f'(a) + \alpha (x - a), &{} x \in [a,c^*),\\ f'(a) + \alpha (c^* - a) + \beta (x - c^*), &{} x \in [c^*, d^*),\\ f'(b) + \alpha (x - b), &{} x \in [d^*, b]. \end{array}\right. } \end{aligned}$$
(55)

where \(c^*\) is calculated according to (40) and \(d^* = c^* + \delta \).

Proof

Let us prove expression (55) first. According to (45), \(\nu (x) = \hat{\psi }(x, c^*)\). Then expression (55) is obtained from (28) by replacing c with \(c^*\).

Let us now prove (54). Since cases \(x \in [a,c^*)\) and \(x \in [c^*, d^*)\) in (54) coincide with respective equations in (39), it is necessary to consider only the remaining case \(x \in [d^*, b]\). According to (39) and (45) we get

$$\begin{aligned} \begin{aligned} \mu (x)&= f(a) + f'(a)(c^* - a) + \frac{\alpha }{2}(c^* - a)^2 + (f'(a) + \alpha (c^* - a)) \delta \\&\quad + \frac{\beta }{2}\delta ^2 + f'(b)(x - d^*) + \frac{\alpha }{2}(x - b)^2 - \frac{\alpha }{2}(d^* - b)^2, \quad x \in [d^*, b]. \end{aligned} \end{aligned}$$
(56)

Due to (46) we have \(\mu (b) = f(b)\). Thus, it follows

$$\begin{aligned} \begin{aligned} f(b)&= f(a) + f'(a)(c^* - a) + \frac{\alpha }{2}(c^* - a)^2 + (f'(a) + \alpha (c^* - a)) \delta \\&\quad + \frac{\beta }{2}\delta ^2 + f'(b)(b - d^*) - \frac{\alpha }{2}(d^* - b)^2. \end{aligned} \end{aligned}$$
(57)

By subtracting (57) from (56) we obtain

$$\begin{aligned} \mu (x) - f(b) = f'(b)(x - b) + \frac{\alpha }{2}(x - b)^2. \end{aligned}$$

The proposition has been proven. \(\square \)
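
A hedged sketch of how the compact formulas (54)–(55) can be evaluated in practice is given below; it assumes that \(c^*\) and \(d^*\) have already been computed, e.g., by the gluing_points sketch shown after Corollary 2, and checks Theorem 1 and the interpolation property (46) on a grid.

```python
import math

# A sketch of the compact formulas (54)-(55); c_star and d_star are assumed to be
# available, e.g. from the gluing_points sketch given after Corollary 2.

def make_mu_nu(f, df, a, b, alpha, beta, c_star, d_star):
    fa, fb, dfa, dfb = f(a), f(b), df(a), df(b)

    def nu(x):  # piecewise linear derivative, formula (55)
        if x < c_star:
            return dfa + alpha * (x - a)
        if x < d_star:
            return dfa + alpha * (c_star - a) + beta * (x - c_star)
        return dfb + alpha * (x - b)

    def mu(x):  # smooth piecewise quadratic underestimator, formula (54)
        if x < c_star:
            return fa + dfa * (x - a) + 0.5 * alpha * (x - a) ** 2
        if x < d_star:
            base = fa + dfa * (c_star - a) + 0.5 * alpha * (c_star - a) ** 2
            return base + (dfa + alpha * (c_star - a)) * (x - c_star) \
                + 0.5 * beta * (x - c_star) ** 2
        return fb + dfb * (x - b) + 0.5 * alpha * (x - b) ** 2

    return mu, nu

if __name__ == "__main__":
    a, b, alpha, beta = -4.0, 4.0, -1.0, 5.0      # introductory example
    f = lambda x: 3 * math.sin(x) + x * x
    df = lambda x: 3 * math.cos(x) + 2 * x
    delta = (df(b) - df(a) - alpha * (b - a)) / (beta - alpha)             # (23)
    c_star = (alpha * (b - a) + df(a) - df(b)) / (2 * (beta - alpha)) \
        + (f(a) - f(b) + b * df(b) - a * df(a) + 0.5 * alpha * (a ** 2 - b ** 2)) \
        / (df(b) - df(a) - alpha * (b - a))                                # (40)
    mu, nu = make_mu_nu(f, df, a, b, alpha, beta, c_star, c_star + delta)
    xs = [a + (b - a) * i / 200 for i in range(201)]
    assert all(mu(x) <= f(x) + 1e-9 for x in xs)                           # Theorem 1
    assert abs(mu(a) - f(a)) < 1e-9 and abs(mu(b) - f(b)) < 1e-9           # property (46)
```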

The formula (54) provides a lower estimator for the function f(x). In many applications, e.g., global optimization or solving non-linear equations, one needs an interval [m, M] enclosing the range of the function over [a, b], i.e., bounds m and M such that

$$\begin{aligned} m \le f(x) \le M, \quad x \in [a,b]. \end{aligned}$$
(58)

A lower bound m can be taken as the minimum of the underestimator \(\mu (x)\) over [a, b].

Let us denote by Z the set of all zeros of the function \(\nu (x)\). Since \(\mu (x)\) is differentiable in [a, b] and \(\mu '(x) = \nu (x), x \in [a,b]\), the minimum of \(\mu (x)\) is attained either at a stationary point or at an endpoint, so

$$\begin{aligned} m = \min \{\mu (x) \mid x \in Z \cup \{a, b\}\}. \end{aligned}$$
(59)

Thus, the minimum m of \(\mu (x)\) can be found by the following sequence of steps:

1. Find the set Z of all roots of \(\nu (x) = 0\), \(x \in [a,b]\).

2. Compute m according to (59).

The first step requires some explanation. Since \(\nu (x)\) is a piecewise linear function, finding its roots reduces to finding intersections of its segments with the horizontal line \(y = 0\) (see Fig. 5 for illustration). In the general case \(\alpha \ne 0, \, \beta \ne 0\), the set Z consists of no more than three points since the number of line segments comprising \(\nu (x)\) is three.

The situation becomes a bit more complex when \(\alpha = 0\) or \(\beta = 0\). Then some of the line segments comprising \(\nu (x)\) are parallel to \(y = 0\). The intersection is either an empty set or a horizontal line segment. Within such a segment \(\nu (x) = 0\) and, therefore, \(\mu (x)\) is constant. Thus it is sufficient to add one of the ends of this segment to the set Z.

In order to obtain the upper bound M from (58), one can set

$$\begin{aligned} M = -\hat{m}, \end{aligned}$$

where \(\hat{m}\) is a lower bound for the function \(\hat{f}(x) = -f(x)\) on the interval [a, b]. Thus, M can be easily computed by applying the procedure described above to the function \(\hat{f}(x)\) and reversing the sign of the found value.
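
A small sketch of the two-step procedure above is given next (the helper names segment_zero and lower_bound are hypothetical); it collects the zeros Z of the piecewise linear \(\nu (x)\) segment by segment and evaluates \(\mu (x)\) at Z and at the interval ends, as in (59).

```python
# A sketch (hypothetical helper names) of the two-step procedure above: collect the
# zeros Z of the piecewise linear nu(x) segment by segment and evaluate mu(x) at Z
# and at the interval ends, as in (59).

def segment_zero(x_left, x_right, y_left, slope):
    """Zero of the linear piece y_left + slope*(x - x_left) on [x_left, x_right], or None."""
    if slope == 0.0:
        # horizontal piece: mu(x) is constant on it, so one of its ends is enough
        return x_left if y_left == 0.0 else None
    root = x_left - y_left / slope
    return root if x_left <= root <= x_right else None

def lower_bound(mu, nu_segments, a, b):
    """m = min of mu over the zeros of nu and the interval ends, formula (59)."""
    zeros = [z for seg in nu_segments if (z := segment_zero(*seg)) is not None]
    return min(mu(x) for x in zeros + [a, b])

# The three segments of nu(x) follow directly from (55):
# nu_segments = [(a, c_star, df(a), alpha),
#                (c_star, d_star, df(a) + alpha * (c_star - a), beta),
#                (d_star, b, df(b) + alpha * (d_star - b), alpha)]
# The upper bound M is obtained by applying lower_bound to -f and negating the result.
```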

5 The accuracy of the proposed estimators

In this section, we study the accuracy of the proposed estimators. First of all, it should be mentioned that the accuracy depends largely on the tightness of the Lipschitzian interval \([\alpha , \beta ]\). We state without proof the following obvious fact, meaning that the tighter \([\alpha , \beta ]\) is, the better the lower estimator.

Proposition 3

Let \(\mu (x)\) and \(\tilde{\mu }(x)\) be two lower estimators constructed according to (54) for the Lipschitzian intervals \([\alpha , \beta ]\) and \([\tilde{\alpha }, \tilde{\beta }]\), respectively. If \([\alpha ,\beta ] \subseteq [\tilde{\alpha }, \tilde{\beta }]\) then

$$\begin{aligned} \mu (x) \ge \tilde{\mu }(x), \quad x \in [a,b]. \end{aligned}$$

Let us now compare the proposed estimator \(\mu (x)\) with the estimator defined in [35], where a second-order smooth estimator \(\mu _L(x)\) for a univariate function f(x) whose first derivative satisfies the Lipschitz condition

$$\begin{aligned} |f'(x_1) - f'(x_2)| \le L |x_1 - x_2| \end{aligned}$$
(60)

was proposed.

From (6) and (60) it follows that \([-L,L]\) is a Lipschitzian interval for \(f'(x)\) over [a, b]. Vice-versa, if \([\alpha , \beta ]\) is a Lipschitzian interval for \(f'(x)\) over [a, b], then \(L = \max \{|\alpha |, |\beta | \}\) satisfies (60). Thus, without loss of generality we assume the following inclusion

$$\begin{aligned} [\alpha ,\beta ] \subseteq [-L, L]. \end{aligned}$$

As was already mentioned above, the lower estimator proposed in [35] coincides with the lower estimator \(\mu (x)\) defined in (45) when \(\alpha = -L, \beta = L\). In the general case, \([\alpha , \beta ]\) can be significantly narrower than \([-L, L]\), which yields a tighter lower bound. Below we show that in some cases this difference can be arbitrarily large.

Let \(\mu _L(x)\) be an estimator for f(x) constructed according to (45) with the Lipschitzian interval for the first derivative set to \([-L, L]\), where \(L = \max \{|\alpha |, |\beta | \}\). Denote the minimum of f(x) and the minima of the estimators \(\mu (x)\) and \(\mu _L(x)\) by z, m, and \(m_L\), respectively:

$$\begin{aligned} \begin{aligned} z&= \min _{x \in [a,b]} f(x),\\ m&= \min _{x \in [a,b]} \mu (x),\\ m_L&= \min _{x \in [a,b]} \mu _L(x). \end{aligned} \end{aligned}$$

It follows from Proposition 3 that \(m \ge m_L\). Below we construct a parametric series of examples where \(m = z\) and the ratio \(|m_L| / |m|\) can be arbitrarily large. To do this, consider a function defined over the interval \([-1, 1]\) as follows

$$\begin{aligned} f(x) = \int _{-1}^x g(t) dt, \quad x \in [-1,1], \end{aligned}$$

where

$$\begin{aligned} g(x) = {\left\{ \begin{array}{ll} \alpha (x + 1), &{} x \in [-1, -r),\\ \beta (x + r) - 1, &{} x \in [-r, r),\\ \alpha (x - 1), &{} x \in [r, 1]. \end{array}\right. } \end{aligned}$$

Here r, \(\alpha \), and \(\beta \) are real numbers satisfying the following properties

$$\begin{aligned} r \in (0, 1/2), \quad \alpha = (r - 1)^{-1}, \quad \beta = r^{-1}. \end{aligned}$$
(61)

Notice that from (61) it follows that \(\alpha< 0 < \beta \). Observe that

$$\begin{aligned} f(-1) = f(1) = 0, \quad f'(-1) = f'(1) = 0. \end{aligned}$$
(62)

By substituting \(a = -1, b = 1\) and values (62) to (23) we get

$$\begin{aligned} \delta = \frac{2 \alpha }{\alpha - \beta }. \end{aligned}$$
(63)

Thus, the expression (40) can be written as follows

$$\begin{aligned} c^* = \frac{1}{\delta (\beta - \alpha )}\left( \frac{\delta ^2}{2}(\beta - \alpha ) + 2 \alpha \delta \right) = \frac{\delta }{2} + \frac{2 \alpha }{\beta - \alpha } = \frac{\alpha }{\beta - \alpha }. \end{aligned}$$
(64)

Due to the obvious symmetry, the minimum of \(\mu (x)\) is achieved at the point \(x = 0\). Then, from (63), (64), and (54) we get

$$\begin{aligned} \min _{x \in [a,b]} \mu (x) = \mu (0) = \frac{\alpha }{2} (c^* + 1)^2 - \alpha c^*(c^* + 1) + \frac{\beta }{2} {c^*}^2 = \frac{\beta - \alpha }{2} {c^*}^2 + \frac{\alpha }{2}. \end{aligned}$$

After substituting the value of \(c^*\) to this formula we obtain

$$\begin{aligned} \min _{x \in [a,b]} \mu (x) = \frac{\alpha \beta }{2 (\beta - \alpha )}. \end{aligned}$$

By construction the underestimator \(\mu (x)\) coincides with f(x) and thus

$$\begin{aligned} z = m = \frac{\alpha \beta }{2 (\beta - \alpha )}. \end{aligned}$$
(65)

By substituting values of \(\alpha \) and \(\beta \) from (61) we obtain

$$\begin{aligned} z = m = -\frac{1}{2}. \end{aligned}$$
(66)
Fig. 6

Function f(x) coinciding with the underestimator \(\mu (x)\) (red) and underestimator \(\mu _L(x)\) (blue) for \(r = 0.45\) (left), 0.25 (center), 0.1 (right). (Color figure online)

Recall that the underestimator \(\mu _L(x)\) is obtained in the same way as \(\mu (x)\) by assuming \(\alpha = -L, \beta = L\). For this example we have \(L = \max \{|\alpha |, |\beta | \} = \beta \). Thus

$$\begin{aligned} m_L = -\frac{\beta }{2} = -\frac{1}{4 r}. \end{aligned}$$
(67)

Due to (65), the ratio \((z - m_L) / (z - m)\) is undefined, since the bound m is exact. In order to perform a meaningful comparison, let us consider the ratio \(|m_L|/|m|\). From (66) and (67) it follows that

$$\begin{aligned} \frac{|m_L|}{|m|} = \frac{1}{2 r}. \end{aligned}$$

This value tends to \(+ \infty \) when \(r \rightarrow 0\), i.e., the bound \(m_L\) can be arbitrarily worse than m, which is exact for this example. To illustrate this tendency, the underestimators \(\mu (x)\), \(\mu _L(x)\) for different values of r are shown in Fig. 6.

6 Experimental evaluation on two series of global optimization test problems

6.1 A global optimization algorithm using the new estimator

To evaluate the efficiency of the proposed estimator we have implemented a classical branch-and-bound procedure for solving the problem (1). The algorithm finds an approximate \(\varepsilon \)-solution \(x^{(\varepsilon )}\) (see (3)) in a finite number of steps (see the rest of the section for the explanation).

This branch-and-bound method (Algorithm 1) uses an auxiliary procedure get_bound (Algorithm 2) that for a given interval \([\hat{a},\hat{b}]\) constructs an underestimator g(x) (line 2), finds a point of its minimum \(\hat{c}\) (line 3), computes the objective value at this point and updates the record point \(x^r\) if necessary (lines 4–6). Notice that different underestimators g(x) can be employed. The procedure returns a pair consisting of a minimizer \(\hat{c}\) and the corresponding lower bound \(g(\hat{c})\).

Algorithm 1 The branch-and-bound procedure

Experiments were performed for four different underestimators g(x). The first one is the classical Pijavskij piece-wise linear underestimator (see [29]) defined as follows

$$\begin{aligned} \mu _P(x) = \max \left( f(\hat{a}) - l (x - \hat{a}), f(\hat{b}) + l (x - \hat{b})\right) , \quad x \in [\hat{a},\hat{b}], \end{aligned}$$
(68)

where l is the Lipschitz constant for the function f(x) on the interval \([\hat{a},\hat{b}]\).

Algorithm 2 The get_bound procedure

The second underestimator was proposed in [6] and has the following form

$$\begin{aligned} \mu _C(x) = \max \left( f(\hat{a}) + \gamma (x - \hat{a}), f(\hat{b}) + \lambda (x - \hat{b})\right) , \quad x \in [\hat{a},\hat{b}], \end{aligned}$$
(69)

where \([\gamma , \lambda ]\) is a Lipschitz interval for the function f(x) over \([\hat{a},\hat{b}]\). Notice that \([\gamma , \lambda ] \subseteq [-l, l]\). Thus, the underestimator (69) is never worse than (68).

The third underestimator is the smooth supporting function \(\mu _L(x)\), introduced in [35] and considered in detail earlier in Sect. 5. The fourth one is the underestimator \(\mu (x)\) proposed in the present paper, see (54).

The Branch-and-Bound algorithm (Algorithm 1) takes the feasible interval [a, b] and the tolerance \(\varepsilon \) as parameters. It maintains the list of tuples \(\mathcal {L}\), initialized with a tuple \((a, b, c, lb)\) corresponding to the initial problem (line 2). Each tuple stores the interval ends \(\hat{a}\), \(\hat{b}\), the split point \(\hat{c}\) and the lower bound \(\hat{lb}\) for the objective function on this interval. The point \(\hat{c}\) is used to separate the interval into two smaller intervals \([\hat{a},\hat{c}]\), \([\hat{c},\hat{b}]\). The value \(\hat{lb}\) is used in the lower bound tests in lines 6, 8 and 12 of Algorithm 1.

The point \(\hat{c}\) is usually taken as the global minimizer of the underestimator g(x) (line 3 in Algorithm 2). Notice that \(\tilde{c}\) computed at line 7 of Algorithm 1 may coincide with one of the interval ends. For the underestimators under consideration, in this case \(g(\tilde{c}) = f(\tilde{c})\). For all four underestimators under consideration the values of the function f(x) are always computed at the interval ends and the record \(x^r\) is updated if necessary. Therefore, we get

$$\begin{aligned} f(x^r) \le f(\tilde{c}) = g(\tilde{c}) = \hat{lb}. \end{aligned}$$

Thus, the tuple \((\hat{a},\hat{c},\tilde{c},\tilde{lb})\) (or \((\hat{c},\hat{b},\tilde{c},\tilde{lb})\)) will not be placed into the list \(\mathcal {L}\) since it fails the lower bound test at line 9 or 13 of Algorithm 1.

The tuples are stored in the increasing order of their lower bounds. At each iteration of the main while loop (lines 3-15 of Algorithm 1) the tuple S is taken from the head of this list, i.e., the tuple with the least lower bound is selected for processing.

Equipped with any of the considered underestimators, Algorithm 1 always terminates in a finite number of steps (iterations of the while loop in lines 3–16). The proof of this fact follows easily from the general finite convergence conditions of the Branch-and-Bound scheme (see [33]) and the properties of the underestimators under consideration.
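
Since Algorithms 1 and 2 are given only as figures, the following sketch reconstructs the scheme described above under assumptions (a heap is used in place of the ordered list, and the Pijavskij underestimator (68) is plugged in as g(x)); it is not the authors' exact listing, and any of the four underestimators could be substituted.

```python
import heapq
import math

# A hedged reconstruction (Algorithms 1 and 2 are given only as figures): the scheme
# described above with a heap in place of the ordered list and with the Pijavskij
# underestimator (68) plugged in as g(x); any of the four underestimators could be used.

def get_bound(f, a_hat, b_hat, L, record):
    """Minimizer and minimum of the Pijavskij minorant (68) on [a_hat, b_hat]."""
    fa, fb = f(a_hat), f(b_hat)
    c_hat = 0.5 * (a_hat + b_hat) + (fa - fb) / (2.0 * L)   # argmin of (68)
    lb = 0.5 * (fa + fb) - 0.5 * L * (b_hat - a_hat)        # min of (68)
    for x, fx in ((a_hat, fa), (b_hat, fb), (c_hat, f(c_hat))):
        if fx < record[1]:                                  # update the record point x^r
            record[0], record[1] = x, fx
    return c_hat, lb

def branch_and_bound(f, a, b, L, eps):
    record = [a, f(a)]                                      # [x^r, f(x^r)]
    c, lb = get_bound(f, a, b, L, record)
    heap = [(lb, a, b, c)]                                  # tuples ordered by lower bound
    while heap:
        lb, a_hat, b_hat, c_hat = heapq.heappop(heap)
        if lb >= record[1] - eps:                           # lower bound test
            break                                           # every remaining tuple fails too
        for lo, hi in ((a_hat, c_hat), (c_hat, b_hat)):
            c_new, lb_new = get_bound(f, lo, hi, L, record)
            if lb_new < record[1] - eps:
                heapq.heappush(heap, (lb_new, lo, hi, c_new))
    return record                                           # eps-optimal point and value

if __name__ == "__main__":
    f = lambda x: 5 * math.sin(x - 2) + x                   # the example of Fig. 5
    xr, fr = branch_and_bound(f, 0.0, 5.0, L=6.0, eps=1e-4)
    print(xr, fr)                                           # global minimum near x ~ 0.23
```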

6.2 Experimental setup and numerical results

For the experimental evaluation of the proposed underestimators, two sets of benchmarks described in [48] have been used. The first set, A, contains 100 Shekel test problems defined as follows

$$\begin{aligned} f(x) = - \sum \limits _{i=1}^{10} \frac{1}{k^2_i (10 x - a_i)^2 + c_i}, \quad x \in [0,1], \end{aligned}$$
(70)

where parameters

$$\begin{aligned} k_i \in [1, 3],\, c_i \in [0.1, 0.3],\, a_i \in [0,10], \quad i=1, \dots , 10, \end{aligned}$$

were randomly generated within the specified intervals. The set B contains 100 reverse Shekel test problems obtained from (70) by reversing the sign of the objective function f(x). The example functions from these test sets are presented in Figs. 7 and 8.
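
The following sketch (assuming uniform sampling of the parameters, which the text does not specify precisely) shows how one problem of the set A could be generated together with its first derivative, needed by the derivative-based underestimators; a problem of the set B is obtained by negating f(x).

```python
import random

# A sketch (uniform sampling of the parameters is an assumption) of generating one
# test problem of the set A in the form (70), together with its first derivative,
# which is required by the derivative-based underestimators.

def random_shekel(seed=0):
    rng = random.Random(seed)
    k = [rng.uniform(1.0, 3.0) for _ in range(10)]
    c = [rng.uniform(0.1, 0.3) for _ in range(10)]
    a = [rng.uniform(0.0, 10.0) for _ in range(10)]

    def f(x):   # Shekel test function (70)
        return -sum(1.0 / (k[i] ** 2 * (10 * x - a[i]) ** 2 + c[i]) for i in range(10))

    def df(x):  # first derivative of (70)
        return sum(20.0 * k[i] ** 2 * (10 * x - a[i])
                   / (k[i] ** 2 * (10 * x - a[i]) ** 2 + c[i]) ** 2 for i in range(10))

    return f, df

# A reverse Shekel problem of the set B is obtained by negating f (and df).
```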

Fig. 7

The Shekel test function and trial points for the \(\mu _P(x)\) (1), \(\mu _C(x)\) (2), \(\mu _L(x)\) (3) and \(\mu (x)\) (4) underestimators

Fig. 8

The reverse Shekel test function and trial points for the \(\mu _P(x)\) (1), \(\mu _C(x)\) (2), \(\mu _L(x)\) (3) and \(\mu (x)\) (4) underestimators

It should be mentioned that for reverse Shekel functions the global minimizer can occur at the endpoints of the search interval [0, 1]. These cases have been excluded from the experiments. Thus, all functions included in the set B have global minimizers in the interior of the search interval.

The Lipschitzian interval \([\gamma , \lambda ]\) used in the \(\mu _C(x)\) underestimator was computed at the beginning of the solution of each global optimization problem as the natural interval extension (see [32]) of the first derivative \(f'(x)\) of the objective function on the entire interval [a, b]. The value of the Lipschitz constant l of the objective function used in the \(\mu _P(x)\) underestimator was assumed equal to \(\max \{|\gamma |, |\lambda | \}\).

The Lipschitzian interval \([\alpha , \beta ]\) for the derivative \(f'(x)\) was also computed once at the beginning of the processing of each benchmark problem, as the natural interval extension of the second derivative \(f''(x)\) of the objective function on the interval [a, b]. The Lipschitz constant L for the first derivative was assumed equal to \(\max \{|\alpha |, |\beta | \}\).
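
To illustrate the idea of a natural interval extension, a deliberately crude sketch is given below; a real interval library (see [32]) provides tight enclosures for elementary functions and outward rounding, both of which are omitted here, so this is only a toy model of how \([\alpha , \beta ]\) could be obtained for the introductory example.

```python
# A deliberately crude sketch of a natural interval extension (see [32]); a real
# interval library provides tight enclosures for elementary functions and outward
# rounding, both of which are omitted here.

class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, o):
        o = o if isinstance(o, Interval) else Interval(o, o)
        return Interval(self.lo + o.lo, self.hi + o.hi)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Interval) else Interval(o, o)
        p = (self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi)
        return Interval(min(p), max(p))
    __rmul__ = __mul__

def sin_enclosure(_):
    return Interval(-1.0, 1.0)   # crude but always valid enclosure of sin over any box

# Lipschitz interval [alpha, beta] for f'(x) of the introductory example
# f(x) = 3 sin(x) + x^2: evaluate f''(x) = -3 sin(x) + 2 over X = [-4, 4].
X = Interval(-4.0, 4.0)
dd = -3 * sin_enclosure(X) + 2
print(dd.lo, dd.hi)              # [-1, 5], as stated in the introduction
```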

The experimental results are presented in Tables 1 and 2. Table 1 contains the average number of steps (iterations of the main loop of Algorithm 1) for the four underestimators \(\mu _P(x)\), \(\mu _C(x)\), \(\mu _L(x)\), and \(\mu (x)\). Figures 7 and 8 depict the trial points for two arbitrarily selected functions from the test sets A and B, respectively. To obtain a meaningful visualization, only every 1000th trial point is depicted; otherwise the points would be indistinguishable due to their huge number.

As we can see from Table 1, the average number of steps performed by Algorithm 1 equipped with the quadratic underestimators (\(\mu _L(x)\) and \(\mu (x)\)) is an order of magnitude smaller than with the linear ones (\(\mu _P(x)\) and \(\mu _C(x)\)). However, it should be taken into account that iterations of the methods using derivatives are more expensive than those of \(\mu _P(x)\) and \(\mu _C(x)\), since computing \(\mu _L(x)\) and \(\mu (x)\) requires not only the values of f(x) but also the values of \(f'(x)\). It can be seen that, as expected, using the \(\mu (x)\) underestimator on average yields fewer steps than the \(\mu _L(x)\) underestimator. However, the ratio of the numbers of steps performed in these two cases differs for different benchmarks. The minimal, average, and maximal values of this ratio are given in Table 2.

Let us note that the large number of trials is caused by the global estimation of the Lipschitz interval. If local estimates were used instead, the number of trial points would be dramatically smaller. The second observation is that for all the algorithms tested the reverse Shekel functions appear much harder than the original ones. This can be explained by the fact that a reverse Shekel function assumes values close to the global minimum over large portions of the feasible region (see Fig. 8), where the lower bound tests perform poorly.

In conclusion, experimental results confirm that: (i) smooth piece-wise quadratic estimators work better than piece-wise linear ones; (ii) the smooth piece-wise quadratic underestimator \(\mu (x)\) proposed in this paper gives a significant improvement over \(\mu _L(x)\) introduced in [35].

Table 1 The average number of iterations for different underestimators
Table 2 The minimal, average, and the maximal ratio of the number of steps performed when using the underestimator \(\mu _L(x)\) to the number of steps performed when \(\mu (x)\) is used