1 Introduction

Motivated by the needs of modeling and market forecasting, inverse problems in financial mathematics have received considerable attention (see [1, 2, 7, 14–16, 19–21, 23, 26, 30]).

In general, the underlying asset \(S_{t}\) at time t is modeled by the stochastic differential equation

$$ dS_{t}=\mu (t,S_{t})S_{t}\,dt+\sigma (t,S_{t})S_{t}\,dW_{t}, $$

where the process \(W_{t}\) is a standard Brownian motion. The parameters μ and σ are called the real drift and the local volatility of the underlying asset, respectively. The drift term μ indicates the expected return of the stock price, whereas the volatility σ measures the variability of that return. The stock price movement is thus determined by the drift rate, the volatility, and the Brownian motion, and it is full of randomness and uncertainty. In practice, the drift rate is quite difficult to measure directly, but it has an important impact on the stock price trend. Therefore it is of great financial significance to estimate the drift rate, which represents the market's prediction of the expected rate of return of the stock, by indirect approaches.
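As an illustration only, the stochastic differential equation above can be simulated with the Euler–Maruyama scheme. The sketch below is not part of the original analysis; the function name `simulate_path`, the constant coefficients in the usage line, and the number of steps are assumptions chosen for the example.

```python
import math
import random


def simulate_path(s0, mu, sigma, t_end, n_steps, seed=0):
    """Euler-Maruyama path of dS = mu(t,S) S dt + sigma(t,S) S dW."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    s, t = s0, 0.0
    path = [s]
    for _ in range(n_steps):
        # Brownian increment over one step: normal with variance dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        s = s + mu(t, s) * s * dt + sigma(t, s) * s * dw
        t += dt
        path.append(s)
    return path


# Hypothetical constant drift 0.05 and volatility 0.2 over one year
path = simulate_path(100.0, lambda t, s: 0.05, lambda t, s: 0.2, 1.0, 250)
```

With the volatility set to zero the path reduces to the deterministic compounding \(S_{n}=S_{0}(1+\mu \,\Delta t)^{n}\), which gives a simple sanity check of the drift term.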

Black and Scholes [1] first discovered how to construct a dynamic portfolio \(\Pi _{t}\) of a derivative security and the underlying asset. By Itô’s lemma the stochastic behavior of the derivative security \(u(t,S)\) is driven by the stochastic differential equation

$$ du= \biggl(\frac{\partial u}{\partial t}+\mu (t,S)S \frac{\partial u}{\partial S} + \frac{1}{2}\sigma ^{2}(t,S)S^{2} \frac{\partial ^{2}u}{\partial S^{2}} \biggr)\,dt +\sigma (t,S)S \frac{\partial u}{\partial S}\,dW. $$

In the absence of arbitrage opportunities, the instantaneous return of this portfolio must be equal to the interest rate \(r>0\), that is, the return of a riskless asset such as a bank deposit. Therefore this equality takes the form of the following partial differential equation (the Black–Scholes equation):

$$ u_{t}+\frac{1}{2}\sigma (t,S)^{2}S^{2}u_{SS}+(r- \delta )Su_{S}-ru=0, $$

where r and the dividend rate δ are known constants.

However, the theoretical prices of options with different strike prices calculated by the Black–Scholes model differ from real market prices. Under the no-arbitrage property of the financial market, the real drift μ does not enter the above equation. In [23], taking this into account, the following new binary option model is derived:

$$ u_{t}+\frac{1}{2}\sigma (t,S)^{2}S^{2}u_{SS}+ \mu (S)Su_{S}-ru=0. $$

This model is a form of arbitrage model. Further, in view of the payoff of a binary option, the final condition at maturity is specified by

$$ u(T,S)=H(S-K)=\textstyle\begin{cases} 1, & S\geq K, \\ 0, & S< K.\end{cases} $$

We would like to determine the drift function μ using the current market prices

$$ u\bigl(t^{*},S^{*};K,T \bigr)=u^{*}(K,T), \quad K>0, $$
(1.1)

of options with different strikes K and fixed maturity T.

Using the Dupire technique, we deduce that the price \(u(T,K)\) of the binary option with maturity T and strike price K satisfies the adjoint equation:

$$ \textstyle\begin{cases} u_{T}-\frac{1}{2}\sigma ^{2}K^{2}u_{KK}+\mu (K)Ku_{K}+ru=0, & (K,T) \in (0,\infty )\times (0,t), \\ u(t,S;T,K)|_{T=t}=H(S-K), & K\in (0,\infty ).\end{cases} $$

Making the change of variables \(\bar{W}(\tau ,y)=u(T,K)\), \(y=\ln (K/S^{*})\), \(\tau =T-t\), we identify the drift \(a(y):=\mu (K)\) satisfying the following:

Problem P1

The Cauchy problem of second-order parabolic equation

$$ \textstyle\begin{cases} \bar{W}_{\tau }- \frac{1}{2}\sigma _{0}^{2}\bar{W}_{yy}+(\frac{1}{2} \sigma _{0}^{2}+ a(y))\bar{W}_{y}+r\bar{W}=0, & (y,\tau )\in R\times (0,\tau ^{*}), \\ \bar{W}(y,0)=H(-y), & y\in R, \end{cases} $$
(1.2)

where the local volatility \(\sigma _{0}\) is a constant and \(a(y)\) is the unknown coefficient in (1.2). The extra condition (1.1) is transformed into

$$ \bar{W}\bigl(y,\tau ^{*}\bigr)=\bar{W}^{*}(y), \quad y\in R. $$
(1.3)

Determine the functions \(\bar{W}\) and \(a(y)\) satisfying (1.2)–(1.3).

In this problem, \(y\in R\), so the problem is posed on an unbounded domain, which is not conducive to numerical calculations. With this in mind, we approximate it by a problem on a large bounded domain \(y\in [-L,L]\), where L is a large positive constant. The problem can be rewritten in the following form.

Problem P2

The initial-boundary value problem of the second-order parabolic equation

$$ \textstyle\begin{cases} W_{\tau }- \frac{1}{2}\sigma _{0}^{2}W_{yy}+(\frac{1}{2}\sigma _{0}^{2}+a(y))W_{y}+rW=0, & (y,\tau )\in Q=(-L,L)\times (0,\tau ^{*}), \\ W(y,0)=H(-y), & y\in [-L,L], \\ W(-L,\tau )=1,& \tau \in (0,\tau ^{*}), \\ W(L,\tau )=0,& \tau \in (0,\tau ^{*}), \end{cases} $$
(1.4)

and the extra condition

$$ W\bigl(y,\tau ^{*}\bigr)=W^{*}(y),\quad y\in [-L,L]. $$
(1.5)

Determine the functions W and \(a(y)\) satisfying (1.4)–(1.5).

The boundary conditions in P2 are nonhomogeneous, which is not conducive to integration by parts. We therefore convert them to homogeneous ones. Let

$$ W(\tau ,y)=U(\tau ,y)+\frac{L-y}{2L}, \qquad f(y)=\biggl( \frac{1}{2}\sigma ^{2}_{0}-rL+ry\biggr) \frac{1}{2L}. $$

Then Problem P2 is transformed into the following form.

Problem P

The initial-boundary value problem of second-order parabolic equation

$$ \textstyle\begin{cases} U_{\tau }-\frac{1}{2}\sigma _{0}^{2}(U_{yy}-U_{y}) +a(y)(U_{y}- \frac{1}{2L})+rU \\ \quad =f(y), & (y,\tau )\in Q=(-L,L)\times (0,\tau ^{*}), \\ U(y,0)=H(-y)-\frac{L-y}{2L}, & y\in [-L,L], \\ U(-L,\tau )=0,& \tau \in (0,\tau ^{*}), \\ U(L,\tau )=0,& \tau \in (0,\tau ^{*}). \end{cases} $$
(1.6)

The additional condition is given by

$$ U\bigl(y,\tau ^{*}\bigr)=U^{*}(y):=W^{*}(y)- \frac{L-y}{2L}, \quad y\in [-L,L]. $$
(1.7)
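For the reader's convenience, the passage from (1.4) to (1.6) can be verified by direct substitution; the following is only a routine check using the definitions of U and f given above. Since \(W_{\tau }=U_{\tau }\), \(W_{y}=U_{y}-\frac{1}{2L}\), and \(W_{yy}=U_{yy}\), the equation in (1.4) becomes

$$ U_{\tau }-\frac{1}{2}\sigma _{0}^{2}U_{yy}+ \biggl(\frac{1}{2}\sigma _{0}^{2}+a(y) \biggr) \biggl(U_{y}-\frac{1}{2L} \biggr)+r \biggl(U+\frac{L-y}{2L} \biggr)=0, $$

and collecting the terms that do not contain U on the right-hand side gives

$$ U_{\tau }-\frac{1}{2}\sigma _{0}^{2}(U_{yy}-U_{y})+a(y) \biggl(U_{y}-\frac{1}{2L} \biggr)+rU =\frac{\sigma _{0}^{2}}{4L}-\frac{r(L-y)}{2L} = \biggl(\frac{1}{2}\sigma _{0}^{2}-rL+ry \biggr)\frac{1}{2L}=f(y). $$

The homogeneous boundary values follow from \(\frac{L-y}{2L}=1\) at \(y=-L\) and \(\frac{L-y}{2L}=0\) at \(y=L\), so that \(U(-L,\tau )=1-1=0\) and \(U(L,\tau )=0-0=0\).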

Inverse coefficient problems for parabolic equations are well studied in the literature, and abundant theoretical and numerical results are obtained. The inverse problem of identifying coefficient \(q(y)\) in the parabolic equation

$$ u_{t}-\triangle u+q(y)u=0, \quad (y,t)\in Q, $$

from final overdetermination data \(u(y,T)\) has been investigated by several authors, for example, in [5, 6, 17, 18, 25, 27, 29, 31]. The purely time-dependent case, that is, determining an unknown radiative coefficient in heat conduction equations that is independent of the spatial variable, has been extensively studied (see, e.g., [3, 4, 8–11, 24, 28]). For the space- and time-dependent case, we refer the reader to [12, 13].

In financial mathematics, a common class of inverse problems uses market observation data to reconstruct the implied volatility. Using optimal control theory, the recovery of the volatility in the Black–Scholes equation

$$ \frac{\partial V}{\partial t}+\frac{1}{2}\sigma ^{2}(S)S^{2} \frac{\partial ^{2}V}{\partial S^{2}}+(r-q)S \frac{\partial V}{\partial S}-rV=0 $$

is studied from the current options market in [21]. In [2] the authors reduce the identification of volatility to an inverse parabolic problem with terminal observation and establish uniqueness and stability results by using the Carleman estimates. In [7] the local volatility surface is recovered by nonlinear Landweber iteration using simulated data. A new continuous-time model is proposed in [19] to recover the volatility, and the corresponding numerical results are obtained by solving a couple of fully nonlinear parabolic equations. In [16] the volatility is parameterized by five special numbers, and (nonlinear) minimization of the mismatch functional is implemented.

Compared with the inverse volatility problem, there is little literature on the inverse problem for the drift rate. In [23] the authors consider an inverse problem of recovering the real drift of binary call options from market prices. By using the linearization method the inverse problem is transformed into an integral equation, and numerical results are also obtained.

In this paper, we use the optimal control method to discuss Problem P. Compared with other papers concerning volatility identification problems, our work has the following unusual features. First, the unknown function to be identified is the first-order derivative coefficient rather than the principal one. Second, our mathematical model does not tend to zero at infinity, which may bring great trouble to theoretical analysis and numerical calculation.

The rest of the paper is organized as follows. In Sects. 2 and 3, we transform Problem P into an optimal control Problem P3 and prove the existence of the minimum for the control functional. The necessary condition, which must be satisfied by the minimum, is deduced in Sect. 4. In Sect. 5, we prove that the minimum is locally unique under some assumptions. In Sect. 6, we design an iterative algorithm to obtain the numerical solution and give some typical examples. Section 7 ends this paper with a summary.

2 Optimal control problem

Consider the following optimal control problem.

Problem P3

Find \(\bar{a}(y)\in \mathcal{A}\) such that

$$ J(\bar{a})=\min_{a\in \mathcal{A}}J(a), $$
(2.1)

where

$$\begin{aligned}& \begin{aligned} &J(a)= \int _{-L}^{L} \bigl\vert U\bigl(y,\tau ^{*};a\bigr)-U^{*}(y) \bigr\vert ^{2}\,dy+N \int _{-L}^{L} \vert \nabla a \vert ^{2}\,dy, \\ &\mathcal{A}= \bigl\{ a(y)|0< \alpha _{0} \leq a(y)\leq \alpha _{1}, \nabla a\in L^{2}(-L,L), \vert \nabla a \vert \leq \alpha _{2} \bigr\} , \end{aligned} \end{aligned}$$
(2.2)

\(U(y,\tau ;a)\) is the solution to problem (1.6) for given \(a\in \mathcal{A}\), and N is the regularization parameter.
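A discrete counterpart of (2.2) might be evaluated as in the following sketch, assuming that the values of \(U(\cdot ,\tau ^{*};a)\), \(U^{*}\), and a are available on a uniform grid with step h. The trapezoidal rule for the misfit, the forward differences for \(\nabla a\), and the name `objective` are assumptions of this example rather than choices stated in the text; `reg` plays the role of N.

```python
def objective(u_final, u_star, a, h, reg):
    """Discrete J(a): trapezoidal data misfit plus reg * int |a'|^2 dy."""
    misfit = [(u - v) ** 2 for u, v in zip(u_final, u_star)]
    # trapezoidal rule: end points carry half weight
    data_term = h * (sum(misfit) - 0.5 * (misfit[0] + misfit[-1]))
    # forward difference (a[j+1]-a[j])/h on each cell, squared and times h
    grad_term = sum((a[j + 1] - a[j]) ** 2 for j in range(len(a) - 1)) / h
    return data_term + reg * grad_term
```

A constant a contributes nothing to the regularization term, so for such a the value of the functional is the data misfit alone.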

For given \(a\in \mathcal{A}\), from Sobolev’s embedding theorem we have \(a\in C^{1/2}(-L,L)\) and \(\|a\|_{C^{1/2}(-L,L)}\leq C\) (here C is a constant). The known Schauder theory for parabolic equations (see [22]) guarantees that there is a unique solution \(U(y,\tau )\in C^{2+\alpha ,1+\alpha /2}(\bar{Q})\) to the initial-boundary value problem (1.6).

Lemma 2.1

If \(U(y,\tau )\) is a solution to the initial-boundary value problem (1.6), then

$$\begin{aligned}& \max_{0\leq \tau \leq \tau ^{*}} \int _{-L}^{L} U^{2}\,dy+ \int _{Q_{\tau }} U_{y}^{2}\,dy\,d\tau \\& \quad \leq C \int _{-L}^{L} \biggl(H(-y)-\frac{L-y}{2L} \biggr)^{2}\,dy+ C \int _{Q_{\tau }} \biggl(a(y)\frac{1}{2L}+f(y) \biggr)^{2}\,dy\,d\tau , \end{aligned}$$
(2.3)

where \(Q_{\tau }=(-L,L)\times (0,\tau ]\), and C is a constant.

Proof

From equation (1.6) we have

$$ \int _{Q_{\tau }} \biggl(U_{\tau }-\frac{1}{2} \sigma _{0}^{2}(U_{yy}-U_{y})+a(y) \biggl(U_{y}- \frac{1}{2L}\biggr)+rU \biggr) U\,dy\,d\tau = \int _{Q_{\tau }}f(y)U\,dy\,d\tau . $$

Integrating by parts, we obtain

$$ \int _{-L}^{L} \frac{U^{2}}{2}\bigg|_{0}^{\tau } \,dy+ \int _{Q_{\tau }} \biggl(\frac{1}{2}\sigma _{0}^{2}U_{y}^{2}+rU^{2} \biggr)\,dy\,d\tau = \int _{Q_{\tau }} \biggl(-aU_{y}U+ \biggl( \frac{a}{2L}+f(y) \biggr)U \biggr)\,dy\,d\tau . $$

From the Cauchy–Schwarz inequality we have

$$\begin{aligned}& \int _{-L}^{L} \frac{U^{2}}{2}\bigg|_{\tau } \,dy+ \int _{Q_{\tau }} \biggl(\frac{1}{2} \sigma _{0}^{2}U_{y}^{2}+rU^{2} \biggr)\,dy\,d\tau \\& \quad \leq \int _{-L}^{L}\frac{1}{2} \biggl(H(-y)- \frac{L-y}{2L} \biggr)^{2}\,dy+ \int _{Q_{\tau }} \biggl(\frac{1}{4}\sigma _{0}^{2}U_{y}^{2}+ \biggl( \frac{a}{2L}+f(y) \biggr)^{2} +CU^{2} \biggr)\,dy \,d\tau , \end{aligned}$$

that is,

$$\begin{aligned}& \int _{-L}^{L}\frac{U^{2}}{2}\bigg|_{\tau } \,dy + \int _{Q_{\tau }} \frac{1}{4}\sigma _{0}^{2}U_{y}^{2} \,dy\,d\tau \\& \quad \leq \int _{-L}^{L}\frac{1}{2} \biggl(H(-y)- \frac{L-y}{2L} \biggr)^{2}\,dy+ \int _{Q_{\tau }} \biggl( \biggl(\frac{a}{2L}+f(y) \biggr)^{2} +CU^{2} \biggr)\,dy\,d\tau . \end{aligned}$$

From Gronwall’s inequality we have

$$\begin{aligned}& \int _{-L}^{L} \frac{U^{2}}{2}\bigg|_{\tau } \,dy+ \int _{Q_{\tau }} \frac{1}{4} \sigma _{0}^{2}U_{y}^{2} \,dy\,d\tau \\& \quad \leq C \int _{-L}^{L} \biggl(H(-y)-\frac{L-y}{2L} \biggr)^{2}\,dy+ C \int _{Q_{\tau }} \biggl(\frac{a}{2L}+f(y) \biggr)^{2}\,dy\,d\tau . \end{aligned}$$

This completes the proof of Lemma 2.1. □

3 Existence

Theorem 3.1

There exists a minimizer \(\bar{a}\in \mathcal{A}\) of \(J(a)\), that is,

$$ J(\bar{a})=\min_{a\in \mathcal{A}}J(a). $$

Proof

Let \((U_{n},a_{n})\) be a minimizing sequence. Since \(J(a_{n})\leq C\), thanks to the particular structure of J, we deduce

$$ \Vert \nabla a_{n} \Vert _{L^{2}(-L,L)}\leq C \quad (C \text{ is independent of } n). $$

By the Sobolev embedding theorem we obtain

$$ \Vert a_{n} \Vert _{C^{1/2}(-L,L)}\leq C. $$

Thus

$$ \bigl\Vert U_{n}(y,\tau ) \bigr\Vert _{C^{2+1/2,1+1/4}(Q)}\leq C. $$

Therefore we can select subsequences of \(a_{n}\) and \(U_{n}\), again denoted by \(a_{n}\) and \(U_{n}\), such that

$$\begin{aligned}& a_{n}(y)\rightarrow \bar{a}(y)\in C^{1/2}(-L,L), \quad \text{uniformly in } C^{\alpha }(-L,L)\quad \biggl(0\leq \alpha < \frac{1}{2} \biggr), \\& U_{n}(y,\tau )\rightarrow \bar{U}(y,\tau ), \quad \text{uniformly in } C^{2+ \alpha ,1+\alpha /2}(Q). \end{aligned}$$

We easily check that \((\bar{a}(y),\bar{U}(y,\tau ))\) satisfies (1.6). By the Lebesgue dominated convergence theorem and the weak lower semicontinuity of the \(L^{2}\) norm we obtain

$$ J(\bar{a})\leq \liminf_{n\rightarrow \infty } J(a_{n})=\min _{a\in \mathcal{A}}J(a). $$

Hence

$$ J(\bar{a})=\min_{a\in \mathcal{A}}J(a). $$

This completes the proof of Theorem 3.1. □

4 Necessary condition

Theorem 4.1

Let a be the solution of the optimal control problem (2.1). Then there exists a triple of functions \((U,V;a)\) satisfying the following system:

$$\begin{aligned}& \textstyle\begin{cases} U_{\tau }-\frac{1}{2}\sigma _{0}^{2}(U_{yy}-U_{y})+a(y)(U_{y}- \frac{1}{2L})+rU=f(y), &(y,\tau )\in Q, \\ U(y,0)=H(-y)-\frac{L-y}{2L}, & y\in [-L,L], \\ U(-L,\tau )=0,& \tau \in (0,\tau ^{*}), \\ U(L,\tau )=0,& \tau \in (0,\tau ^{*}), \end{cases}\displaystyle \end{aligned}$$
(4.1)
$$\begin{aligned}& \textstyle\begin{cases} -V_{\tau }-\frac{1}{2}\sigma _{0}^{2}V_{yy}-\frac{1}{2}\sigma _{0}^{2}V_{y}- (a(y)V)_{y}+rV=0, &(y,\tau )\in Q, \\ V(y,\tau ^{*})=U(y,\tau ^{*})-U^{*}(y), & y\in [-L,L], \\ V(-L,\tau )=0,& \tau \in (0,\tau ^{*}), \\ V(L,\tau )=0,& \tau \in (0,\tau ^{*}), \end{cases}\displaystyle \end{aligned}$$
(4.2)

and

$$ N \int _{-L}^{L}\nabla a\cdot \nabla (h-a)\,dy- \int _{Q}V(h-a) \biggl(U_{y}- \frac{1}{2L} \biggr)\,dy\,d\tau \geq 0 $$
(4.3)

for all \(h\in \mathcal{A}\).

Proof

For any \(h\in \mathcal{A}\) and \(0\leq \delta \leq 1\), we have

$$ a_{\delta }\equiv (1-\delta )a+\delta h\in \mathcal{A}. $$

Let \(U_{\delta }\) be the solution to the initial-boundary value problem (4.1) with given \(a=a_{\delta }\). Since a is an optimal solution, by (2.2) we have

$$ \frac{d}{d\delta }J(a_{\delta })\bigg|_{\delta =0}=2 \int _{-L}^{L}\bigl(U\bigl(y, \tau ^{*}\bigr)- U^{*}(y)\bigr)\frac{\partial U_{\delta }}{\partial \delta } \bigg|_{\delta =0}\,dy+ 2N \int _{-L}^{L}\nabla a\cdot \nabla (h-a)\,dy \geq 0. $$
(4.4)

Denoting \(U^{\prime }_{\delta }\equiv \frac{\partial U_{\delta }}{\partial \delta }\) and using (4.1), by direct calculation we get the following equation:

$$ \textstyle\begin{cases} U_{\delta \tau }^{\prime }=\frac{1}{2}\sigma _{0}^{2}U_{\delta yy}^{\prime }- \frac{1}{2}\sigma _{0}^{2}U_{\delta y}^{\prime }-a_{\delta }(y)U_{\delta y}^{\prime }-(h-a)U_{ \delta y}+(h-a)\frac{1}{2L}-rU_{\delta }^{\prime }, \\ U^{\prime }_{\delta }|_{\tau =0}=0, \\ U^{\prime }_{\delta }|_{y=-L}=0, \\ U^{\prime }_{\delta }|_{y=L}=0. \end{cases} $$
(4.5)

Let \(\xi =U^{\prime }_{\delta } |_{\delta =0}\). Then ξ satisfies

$$ \textstyle\begin{cases} \xi _{\tau }=\frac{1}{2}\sigma _{0}^{2}\xi _{yy}-\frac{1}{2}\sigma _{0}^{2} \xi _{ y}-a(y)\xi _{y}-(h-a)\frac{\partial U}{\partial y}+(h-a) \frac{1}{2L}-r\xi , \\ \xi |_{\tau =0}=0, \\ \xi |_{y=-L}=0, \\ \xi |_{y=L}=0. \end{cases} $$
(4.6)

From (4.4) we have

$$ \int _{-L}^{L}\bigl(U\bigl(y,\tau ^{*}\bigr)-U^{*}(y)\bigr)\xi \bigl(y,\tau ^{*}\bigr)\,dy+ N \int _{-L}^{L} \nabla a\cdot \nabla (h-a)\,dy\geq 0. $$
(4.7)

Suppose V is the generalized solution of the following adjoint problem:

$$ \textstyle\begin{cases} \mathcal{L}^{*}V=-V_{\tau }-\frac{1}{2}\sigma _{0}^{2}V_{yy}- \frac{1}{2}\sigma _{0}^{2}V_{y}- (a(y)V)_{y}+rV=0, \\ V(y,\tau ^{*})=U(y,\tau ^{*})-U^{*}(y), \\ V(-L,\tau )=0, \\ V(L,\tau )=0. \end{cases} $$
(4.8)

From (4.6) and (4.8) we have

$$ \begin{aligned} 0&= \int _{Q}\mathcal{L}^{*}V\cdot \xi \,dy\,d\tau \\ &= \int _{Q}\biggl(-V_{\tau }\xi -\frac{1}{2} \sigma _{0}^{2}V_{yy}\xi - \frac{1}{2}\sigma _{0}^{2}V_{y}\xi - \bigl(a(y)V\bigr)_{y}\xi +rV\xi \biggr)\,dy\,d\tau \\ &=- \int _{-L}^{L}V\xi |^{\tau ^{*}}_{0} \,dy+ \int _{Q}\biggl(V\xi _{\tau }- \frac{1}{2} \sigma _{0}^{2}V\xi _{yy}+ \frac{1}{2}\sigma _{0}^{2}V\xi _{y} +a(y)V\xi _{y}+rV\xi \biggr)\,dy\,d\tau \\ &=- \int _{-L}^{L}\bigl(U\bigl(y,\tau ^{*}\bigr)-U^{*}(y)\bigr)\xi \bigl(y,\tau ^{*}\bigr)\,dy+ \int _{Q}V(a-h) \biggl(U_{y}- \frac{1}{2L} \biggr)\,dy\,d\tau . \end{aligned} $$
(4.9)

Combining (4.7) and (4.9), we easily obtain that

$$ N \int _{-L}^{L}\nabla a\cdot \nabla (h-a)\,dy- \int _{Q}V(h-a) \biggl(U_{y}- \frac{1}{2L} \biggr)\,dy\,d\tau \geq 0. $$

This completes the proof of Theorem 4.1. □

5 Uniqueness

Lemma 5.1

For any bounded continuous function \(f(y)\in C[-L,L]\), we have

$$ \max_{y\in [-L,L]} \bigl\vert f(y) \bigr\vert \leq \bigl\vert f(y_{0}) \bigr\vert +C \biggl( \int _{-L}^{L} \vert \nabla f \vert ^{2}\,dy \biggr)^{1/2}, $$

where \(y_{0}\) is a fixed point.

Proof

$$\begin{aligned} \bigl\vert f(y) \bigr\vert \leq& \bigl\vert f(y_{0}) \bigr\vert + \biggl\vert \int _{y_{0}}^{y}f^{\prime }\,dy \biggr\vert \\ \leq& \bigl\vert f(y_{0}) \bigr\vert + \biggl( \int _{-L}^{L}1\,dy \biggr)^{1/2} \cdot \biggl( \int _{-L}^{L} \vert \nabla f \vert ^{2}\,dy \biggr)^{1/2}. \end{aligned}$$

This completes the proof of Lemma 5.1. □
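The constant in Lemma 5.1 can be read off from the proof. Since \(\int _{-L}^{L}1\,dy=2L\), the two displayed inequalities give

$$ \max_{y\in [-L,L]} \bigl\vert f(y) \bigr\vert \leq \bigl\vert f(y_{0}) \bigr\vert +\sqrt{2L} \biggl( \int _{-L}^{L} \vert \nabla f \vert ^{2}\,dy \biggr)^{1/2}, $$

that is, one may take \(C=\sqrt{2L}\). In particular, if \(f(y_{0})=0\), then \(\max \vert f \vert ^{2}\leq 2L\int _{-L}^{L} \vert \nabla f \vert ^{2}\,dy\).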

Let \(a_{1}(y)\) and \(a_{2}(y)\) be two minimizers of the control Problem P3, and let \(\{U_{i},V_{i}\}\) (\(i=1,2\)) be solutions of system (4.1)–(4.2) in which \(\bar{a}=a_{i}\), respectively. Set

$$ a_{1}-a_{2}=A, \qquad U_{1}-U_{2}=U,\qquad V_{1}-V_{2}=V. $$

Then U and V satisfy

$$ \textstyle\begin{cases} U_{\tau }-\frac{1}{2}\sigma _{0}^{2}U_{yy}+(\frac{1}{2}\sigma _{0}^{2}+a_{1}(y))U_{y}+rU =\frac{1}{2L}A-AU_{2y}, \\ U(y,0)=0, \\ U(-L,\tau )=0, \\ U(L,\tau )=0, \end{cases} $$
(5.1)

and

$$ \textstyle\begin{cases} -V_{\tau }-\frac{1}{2}\sigma _{0}^{2}V_{yy}-\frac{1}{2}\sigma _{0}^{2}V_{y}- (a_{1}(y)V)_{y}+rV=(AV_{2})_{y}, \\ V(y,\tau ^{*})=U(y,\tau ^{*}), \\ V(-L,\tau )=0, \\ V(L,\tau )=0. \end{cases} $$
(5.2)

Lemma 5.2

$$ \Vert V_{i} \Vert _{\infty }\leq \bigl\Vert U_{i}\bigl(y,\tau ^{*}\bigr)-U_{i}^{*}(y) \bigr\Vert _{\infty } \quad (i=1,2). $$
(5.3)

Proof

Set \(t=\tau ^{*}-\tau \). Then from (4.2) we have

$$ \textstyle\begin{cases} \mathcal{L}_{1}V_{i}=V_{it}-\frac{1}{2}\sigma _{0}^{2}V_{iyy}- \frac{1}{2}\sigma _{0}^{2}V_{iy}- (a_{i}(y)V_{i})_{y}+rV_{i}=0, \\ V_{i}(y,0)=U_{i}(y,\tau ^{*})-U^{*}_{i}(y), \\ V_{i}(-L,\tau )=0, \\ V_{i}(L,\tau )=0. \end{cases} $$

Letting \(W=\|U_{i}(y,\tau ^{*})-U_{i}^{*}(y)\|_{\infty }\pm V_{i}\), we obtain

$$\begin{aligned} &\mathcal{L}_{1}W=\mathcal{L}_{1} \bigl\Vert U_{i}-U_{i}^{*} \bigr\Vert _{\infty }\pm \mathcal{L}_{1}V_{i}=r \bigl\Vert U_{i}-U_{i}^{*} \bigr\Vert _{\infty }\geq 0, \\ &W|_{t=0}= \bigl\Vert U_{i}\bigl(y,\tau ^{*}\bigr)-U_{i}^{*}(y) \bigr\Vert _{\infty }\pm \bigl(U_{i}\bigl(y, \tau ^{*} \bigr)-U_{i}^{*}(y)\bigr)\geq 0, \\ &W|_{y=-L}= \bigl\Vert U_{i}\bigl(-L,\tau ^{*}\bigr)-U_{i}^{*}(-L) \bigr\Vert _{\infty }\pm 0 \geq 0, \\ &W|_{y=L}= \bigl\Vert U_{i}\bigl(L,\tau ^{*}\bigr)-U_{i}^{*}(L) \bigr\Vert _{\infty }\pm 0\geq 0. \end{aligned}$$

Using the maximum principle, we obtain \(\|V_{i}\|_{\infty }\leq \|U_{i}(y,\tau ^{*})-U_{i}^{*}(y)\|_{\infty }\) (\(i=1,2\)).

This completes the proof of Lemma 5.2. □

Lemma 5.3

For problem (5.1) we have the estimates

$$ \max_{0\leq \tau \leq \tau ^{*}} \int _{-L}^{L}U^{2}\,dy\leq C\bigl( \max \vert A \vert ^{2}\bigr) \biggl( \int _{Q} \vert U_{2y} \vert ^{2}\,dy\,d\tau +1 \biggr) $$
(5.4)

and

$$ \max_{0\leq \tau \leq \tau ^{*}} \int _{-L}^{L}U^{2}_{y} \,dy\leq C\bigl( \max \vert A \vert ^{2}\bigr) \biggl( \int _{Q} \vert U_{2y} \vert ^{2}\,dy\,d\tau +1 \biggr), $$
(5.5)

where C is a constant.

Proof

The proof of estimate (5.4) is standard. So we only need to prove estimate (5.5).

From equation (5.1) we have, for \(0\leq \tau \leq \tau ^{*}\),

$$\begin{aligned} & \int _{Q_{\tau }} \biggl(-a_{2}\frac{1}{2L}-AU_{2y} \biggr)U_{\tau }\,dy\,d \tau \\ &\quad = \int _{Q_{\tau }} \biggl(U_{\tau }-\frac{1}{2} \sigma _{0}^{2}U_{yy}+ \biggl( \frac{1}{2}\sigma _{0}^{2}+a_{1}(y) \biggr)U_{y}+rU-a_{1}(y) \frac{1}{2L} \biggr)U_{\tau }\,dy\,d\tau \\ &\quad = \int _{Q_{\tau }}U_{\tau }^{2}\,dy\,d\tau + \int _{Q_{\tau }} \frac{\sigma _{0}^{2}}{2}U_{y}U_{\tau y} \,dy\,d\tau + \int _{-L}^{L}r \frac{U^{2}}{2}|_{0}^{\tau } \,dy \\ &\qquad {}+ \int _{Q_{\tau }} \biggl( \biggl( \frac{1}{2}\sigma _{0}^{2}+a_{1}(y) \biggr)U_{y}U_{\tau }- \frac{a_{1}}{2L}U_{\tau } \biggr)\,dy\,d\tau . \end{aligned}$$

Using the boundedness of \(a_{1}(y)\), we get the following inequality:

$$\begin{aligned} & \int _{Q_{\tau }}U_{\tau }^{2}\,dy\,d\tau + \int _{-L}^{L}\frac{1}{2} \sigma _{0}^{2}\frac{U_{y}^{2}}{2}|_{0}^{\tau } \,dy+ \int _{-L}^{L}r \frac{U^{2}}{2}|_{\tau } \,dy \\ &\quad = \int _{Q_{\tau }} \biggl( \biggl(-\frac{1}{2}\sigma _{0}^{2}-a_{1}(y) \biggr)U_{y}U_{\tau }+ \frac{A}{2L}U_{\tau }-AU_{2y}U_{\tau } \biggr)\,dy\,d \tau \\ &\quad \leq \int _{Q_{\tau }} \biggl(CU_{y}^{2}+ \frac{1}{2}U_{\tau }^{2} +C\bigl( \max \vert A \vert ^{2}\bigr)+C\bigl(\max \vert A \vert ^{2} \bigr)U_{2y}^{2} \biggr)\,dy\,d\tau , \end{aligned}$$

that is,

$$\begin{aligned} &\frac{1}{2} \int _{Q_{\tau }}U_{\tau }^{2}\,dy\,d\tau + \int _{-L}^{L} \frac{1}{2} \sigma _{0}^{2}\frac{U_{y}^{2}}{2}|_{\tau }\,dy \\ &\quad \leq C \int _{Q_{\tau }}U_{y}^{2}\,dy\,d\tau +C \bigl(\max \vert A \vert ^{2}\bigr) \int _{Q_{ \tau }} \bigl(U_{2y}^{2}+1 \bigr) \,dy\,d\tau . \end{aligned}$$

From Gronwall’s inequality we have

$$\begin{aligned} & \int _{Q_{\tau }}\frac{1}{2}U_{\tau }^{2} \,dy\,d\tau + \int _{-L}^{L} \frac{1}{2}\sigma _{0}^{2} \frac{U_{y}^{2}}{2}|_{\tau }\,dy \\ &\quad \leq C\bigl(\max \vert A \vert ^{2}\bigr) \biggl( \int _{Q_{\tau }}U_{2y}^{2}\,dy\,d\tau +1 \biggr). \end{aligned}$$

This completes the proof of Lemma 5.3. □

Lemma 5.4

For problem (5.2) we have the estimate

$$ \max_{0\leq \tau \leq \tau ^{*}} \int _{-L}^{L}V^{2}\,dy\leq C\bigl( \max \vert A \vert ^{2}\bigr) \biggl( \int _{Q_{\tau }}\bigl(V_{2}^{2}+U_{2y}^{2} \bigr)\,dy\,d\tau +1 \biggr), $$
(5.6)

where C is a constant.

Proof

From equation (5.2) we have

$$\begin{aligned} & \int _{Q_{\tau }}(AV_{2})_{y}V\,dy\,d\tau \\ &\quad = \int _{Q_{\tau }} \biggl(-V_{\tau }-\frac{1}{2} \sigma _{0}^{2}V_{yy}- \frac{1}{2} \sigma _{0}^{2}V_{y}- \bigl(a_{1}(y)V \bigr)_{y}+rV \biggr)V\,dy\,d\tau \\ &\quad = \int _{-L}^{L}-\frac{V^{2}}{2}|_{\tau }^{\tau ^{*}} \,dy+ \int _{Q_{ \tau }}\frac{1}{2}\sigma _{0}^{2}V_{y}^{2} \,dy\,d\tau + \int _{Q_{\tau }}a_{1}(y)VV_{y}\,dy\,d \tau + \int _{Q_{\tau }}rV^{2}\,dy\,d\tau \\ &\quad =- \int _{Q_{\tau }}AV_{2}V_{y}\,dy\,d\tau . \end{aligned}$$

By Lemma 5.3 this yields

$$\begin{aligned} & \int _{-L}^{L}\frac{V^{2}}{2}|_{\tau } \,dy+ \int _{Q_{\tau }} \frac{1}{2}\sigma _{0}^{2}V_{y}^{2} \,dy\,d\tau + \int _{Q_{\tau }}rV^{2}\,dy\,d \tau \\ &\quad = \int _{-L}^{L}\frac{V^{2}}{2}|_{(y,\tau ^{*})} \,dy- \int _{Q_{ \tau }}a_{1}(y)VV_{y}\,dy\,d\tau - \int _{Q_{\tau }}AV_{2}V_{y}\,dy\,d\tau \\ &\quad \leq C\bigl(\max \vert A \vert ^{2}\bigr) \biggl( \int _{Q_{\tau }}U_{2y}^{2}\,dy\,d\tau +1 \biggr)+ C \int _{Q_{\tau }}V^{2}\,dy\,d\tau + \frac{a_{1}\sigma _{0}^{2}}{4\alpha _{1}} \int _{Q_{\tau }}V_{y}^{2}\,dy\,d \tau \\ &\qquad {}+C\bigl(\max \vert A \vert ^{2}\bigr) \int _{Q_{\tau }}V_{2}^{2}\,dy\,d\tau . \end{aligned}$$

From Gronwall’s inequality we have

$$\begin{aligned} & \int _{-L}^{L}\frac{V^{2}}{2}|_{\tau } \,dy+ \int _{Q_{\tau }} \biggl( \frac{\sigma _{0}^{2}}{2}- \frac{a_{1}\sigma _{0}^{2}}{4\alpha _{1}} \biggr)V_{y}^{2}\,dy\,d\tau \\ &\quad \leq C\bigl(\max \vert A \vert ^{2}\bigr) \biggl( \int _{Q_{\tau }}\bigl(U_{2y}^{2}+V_{2}^{2} \bigr)\,dy\,d \tau +1 \biggr). \end{aligned}$$

This completes the proof of Lemma 5.4. □

Theorem 5.5

Let \(a_{1}(y)\) and \(a_{2}(y)\) be two minimizers of the optimal control Problem P3. If there exists a point \(y_{0}\) such that

$$ a_{1}(y_{0})=a_{2}(y_{0}), $$

then for \(\tau ^{*}\ll 1\), we have

$$ a_{1}(y)\equiv a_{2}(y) \quad \textit{for any } y\in [-L,L]. $$

Proof

Taking \(h=a_{2}\) when \(\bar{a}=a_{1}\) and \(h=a_{1}\) when \(\bar{a}=a_{2}\) in (4.3), we have

$$\begin{aligned}& N \int _{-L}^{L}\nabla a_{1}\cdot \nabla (a_{2}-a_{1})\,dy+ \int _{Q}V_{1}(a_{1}-a_{2}) \biggl(U_{1y}-\frac{1}{2L} \biggr)\,dy\,d\tau \geq 0, \end{aligned}$$
(5.7)
$$\begin{aligned}& N \int _{-L}^{L}\nabla a_{2}\cdot \nabla (a_{1}-a_{2})\,dy+ \int _{Q}V_{2}(a_{2}-a_{1}) \biggl(U_{2y}-\frac{1}{2L} \biggr)\,dy\,d\tau \geq 0, \end{aligned}$$
(5.8)

where \(\{U_{i},V_{i}\}\) (\(i=1,2\)) are solutions of system (4.1)–(4.2) with \(\bar{a}=a_{i}\) (\(i=1,2\)), respectively. From (5.7) and (5.8) we have

$$\begin{aligned} N \int _{-L}^{L} \bigl\vert \nabla (a_{2}-a_{1}) \bigr\vert ^{2}\,dy \leq & \int _{Q}(a_{1}-a_{2}) \biggl(V_{1} \biggl(U_{1y}-\frac{1}{2L} \biggr)-V_{2} \biggl(U_{2y}- \frac{1}{2L} \biggr) \biggr)\,dy\,d\tau \\ =& \int _{Q}A \biggl(V_{1}U_{1y}-V_{2}U_{1y}+V_{2}U_{1y}-V_{2}U_{2y}- \frac{V_{1}}{2L}+\frac{V_{2}}{2L} \biggr)\,dy\,d\tau \\ =& \int _{Q}A \biggl(VU_{1y}+V_{2}U_{y}- \frac{V}{2L} \biggr)\,dy\,d\tau \\ \leq& C\bigl(\max \vert A \vert \bigr)\sqrt{ \int _{Q}V^{2}\,dy\,d\tau } \biggl(\sqrt{ \int _{Q}U^{2}_{1y}\,dy\,d \tau }+1 \biggr) \\ &{} +C\bigl(\max \vert A \vert \bigr)\sqrt{ \int _{Q}V^{2}_{2}\,dy\,d\tau } \sqrt{ \int _{Q}U^{2}_{y}\,dy\,d \tau }. \end{aligned}$$
(5.9)

By Lemma 5.3, Lemma 5.4, and (5.9) we have

$$\begin{aligned} N \int _{-L}^{L} \vert \nabla A \vert ^{2}\,dy \leq& C\bigl(\max \vert A \vert ^{2}\bigr) \sqrt{\tau ^{*} \biggl( \int _{Q}\bigl(V^{2}_{2}+U^{2}_{2y} \bigr)\,dy\,d\tau +1 \biggr)} \biggl(\sqrt{ \int _{Q}U^{2}_{1y}\,dy\,d\tau }+1 \biggr) \\ &{} +C\bigl(\max \vert A \vert ^{2}\bigr)\sqrt{ \int _{Q}V^{2}_{2}\,dy\,d\tau } \sqrt{\tau ^{*} \biggl( \int _{Q}U^{2}_{2y}\,dy\,d\tau +1 \biggr)}. \end{aligned}$$
(5.10)

From Lemma 2.1 we have

$$\begin{aligned}& \int _{Q}U^{2}_{1y}\,dy\,d\tau < \infty , \end{aligned}$$
(5.11)
$$\begin{aligned}& \int _{Q}U^{2}_{2y}\,dy\,d\tau < \infty . \end{aligned}$$
(5.12)

From Lemma 5.2 we have

$$ \int _{Q}V^{2}_{2}\,dy\,d\tau < \infty . $$
(5.13)

By the assumption of Theorem 5.5 there exists a point \(y_{0}\in [-L,L]\) such that

$$ A(y_{0})=a_{1}(y_{0})-a_{2}(y_{0})=0. $$
(5.14)

From Lemma 5.1 we have

$$ \max_{y\in [-L,L]} \vert A \vert \leq C \biggl( \int _{-L}^{L} \vert \nabla A \vert ^{2}\,dy \biggr)^{1/2}. $$
(5.15)

From (5.9)–(5.15) we have

$$ \int _{-L}^{L} \vert \nabla A \vert ^{2}\,dy\leq C\bigl(\max \vert A \vert ^{2}\bigr) \sqrt{\tau ^{*}} \leq C\sqrt{\tau ^{*}} \int _{-L}^{L} \vert \nabla A \vert ^{2}\,dy. $$

Then choosing \(\tau ^{*}\ll 1\) such that

$$ C^{2}\tau ^{*}\leq \frac{1}{2}, $$

we have

$$ \int _{-L}^{L} \vert \nabla A \vert ^{2}\,dy\leq 0. $$

Therefore

$$ \nabla A=0. $$

From the assumption \(A(y_{0})=0\) we have

$$ A(y)=a_{1}(y)-a_{2}(y)\equiv 0. $$

This completes the proof of Theorem 5.5. □

6 Numerical examples

In this section, we give some numerical examples to test the validity of the proposed method for reconstructing the drift. We use a gradient iteration algorithm to obtain the numerical solutions. The key ingredient of this algorithm is the Gâteaux derivative \(J'(a)\) of the functional J, which is given as follows.

Theorem 6.1

The Gâteaux derivative \(J'(a)\) of J at a point \(a\in \mathcal{A}\) along the direction \(p(y)\) is given by

$$ J'(a)p= \int _{Q}p\cdot w \biggl(\frac{1}{2L}-W_{y} \biggr)\,dy\,d\tau +N \int _{-L}^{L} \nabla a(y)\cdot \nabla p(y)\,dy, $$

where \(W(y,\tau ;a)\) is the solution of system (1.4) with given coefficient \(a(y)\in \mathcal{A}\), and \(w(y,\tau ;a)\) satisfies the following equation:

$$ \textstyle\begin{cases} -w_{\tau }-\frac{1}{2}\sigma _{0}^{2}w_{yy}-\frac{1}{2}\sigma _{0}^{2}w_{y}- (a(y)w)_{y}+rw=0, &(y,\tau )\in Q=[-L,L]\times (0,\tau ^{*}), \\ w(y,\tau ^{*})=W(y,\tau ^{*})-W^{*}(y), & y\in [-L,L], \\ w(-L,\tau )=0,& \tau \in (0,\tau ^{*}), \\ w(L,\tau )=0,& \tau \in (0,\tau ^{*}). \end{cases} $$
(6.1)

The proof is similar to that of Theorem 4.1.

Remark 6.1

The main difference between Problems P2 and P is the boundary condition. Problem P is homogeneous, whereas Problem P2 is nonhomogeneous. Problem P is convenient for theoretical analysis, such as integration by parts, but there is no difference between the two problems in calculation. In this section, we use Eq. (1.4) as a mathematical model of the forward problem.

Suppose the computational domain \(\bar{Q}=[-L,L]\times [0,\tau ^{*}]\) is divided into a \(2M\times P\) mesh with spatial step size \(h=\frac{L}{M}\) in the y direction and time step size \(\tau =\frac{\tau ^{*}}{P}\). Grid points \((y_{j},t_{n})\) are defined by

$$\begin{aligned}& y_{j}=(j-M)h, \quad j=0,1,2,\ldots ,2M; \\& t_{n}=n\tau , \quad n=0,1,2,\ldots ,P; \end{aligned}$$

where M and P are two integers.
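The grid just described can be generated as in the short sketch below. The helper name `build_grid` is hypothetical, and the sample values \(L=10\), \(M=25\), \(\tau ^{*}=1\), \(P=100\) are chosen to reproduce the step sizes \(h=2/5\) and \(\tau =0.01\) used in Example 1.

```python
def build_grid(L, M, tau_star, P):
    """Return step sizes and nodes y_j = (j - M) h, t_n = n * dt."""
    h = L / M            # spatial step
    dt = tau_star / P    # time step (denoted tau in the text)
    y = [(j - M) * h for j in range(2 * M + 1)]
    t = [n * dt for n in range(P + 1)]
    return h, dt, y, t


h, dt, y, t = build_grid(10.0, 25, 1.0, 100)
```

The node \(y_{M}=0\) corresponds to \(S=K\), where the initial datum \(H(-y)\) jumps.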

Based on the above analysis, the detailed procedure of iteration algorithm can be summarized as follows:

Step 1. Choose an initial value of iteration \(a=a_{0}(y)\).

Step 2. Solve the initial-boundary value problem (1.4) to get the solution \(W_{0}(y,\tau )\), where \(a(y)=a_{0}(y)\).

Step 3. Solve Eq. (6.1) to obtain the solution \(w_{0}(y, \tau )\), where \(w_{0}(y, \tau ^{*})=W_{0}(y,\tau ^{*})-W^{*}(y)\).

Step 4. Compute the Gâteaux derivative \(J'(a)\psi _{j}=c_{j}\) for \(j=0,1,2,\ldots ,2M\),

$$ c_{j}= \int _{Q_{\tau }}\psi _{j}(y)\cdot w_{0} \biggl(\frac{1}{2L}-W_{y} \biggr)\,dy\,d\tau +N \int _{-L}^{L} \nabla a_{0}(y)\cdot \nabla \psi _{j}(y)\,dy, $$

where the functions \(\psi _{j}\) are taken as the base function under current grid:

$$\begin{aligned}& \psi _{0}(y)=\textstyle\begin{cases} \frac{y_{1}-y}{h}, & y_{0}\leq y\leq y_{1}, \\ 0 & \text{otherwise};\end{cases}\displaystyle \\& \psi _{j}(y)=\textstyle\begin{cases} \frac{y-y_{j-1}}{h}, & y_{j-1}\leq y\leq y_{j}, \\ \frac{y_{j+1}-y}{h}, & y_{j}\leq y\leq y_{j+1}, \\ 0 & \text{otherwise};\end{cases}\displaystyle \\& \psi _{2M}(y)=\textstyle\begin{cases} \frac{y-y_{2M-1}}{h}, & y_{2M-1}\leq y\leq y_{2M}, \\ 0 & \text{otherwise}.\end{cases}\displaystyle \end{aligned}$$

Then the iteration direction from the jth step to the \((j+1)\)th step is given by

$$ C_{0}(y)=\sum_{j=0}^{2M}c_{j} \psi _{j}(y). $$

Step 5. Compute the norm of \(C_{0}(y)\) at the jth step:

$$ e= \Biggl(h\sum_{j=0}^{2M}c_{j}^{2} \Biggr)^{1/2}, $$

where h is the spatial step size.

Step 6. Choose an arbitrarily small positive constant ε as the stopping parameter. Whether to continue or stop the iteration is determined by the following steps:

Step 6.1. Let \(k=1\).

Step 6.2. Compute \(\mathrm{err}:=J[a_{0}(y)+kC_{0}(y)]-J(a_{0}(y))+ \frac{1}{2}ke^{2}\).

Step 6.3. Compare err with 0. If \(\mathrm{err}\leq 0\), then go to Step 6.4; otherwise, let \(k\doteq \mu k\) and go to Step 6.2, where μ is an adjusting parameter.

Step 6.4. Take \(a_{1}(y)=a_{0}(y)+kC_{0}(y)\). If \(\|kC_{0}(y)\|\leq \varepsilon \), then stop the iteration. Otherwise, set \(j=j+1\), take \(a_{1}(y)\) as the new initial value, and go to Step 2; continue in this way until the termination condition is met.
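The basis functions of Step 4 and the step-size rule of Steps 5 and 6 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the test \(\mathrm{err}\leq 0\) can only be met along a descent direction, so the sketch moves along the negative of the coefficient vector c; that sign convention, the default shrink factor 0.5 standing in for the adjusting parameter, the cap on the number of reductions, and the names `hat` and `backtracking_step` are all assumptions.

```python
import math


def hat(j, y, y_nodes):
    """Piecewise-linear basis psi_j on a uniform grid, for y inside the grid."""
    h = y_nodes[1] - y_nodes[0]
    return max(0.0, 1.0 - abs(y - y_nodes[j]) / h)


def backtracking_step(J, a0, c, h, shrink=0.5, max_reductions=50):
    """One update a1 = a0 + k*d with d = -c, following Steps 5 and 6."""
    d = [-cj for cj in c]                         # assumed descent direction
    e = math.sqrt(h * sum(cj * cj for cj in c))   # Step 5: discrete norm
    j0 = J(a0)
    k = 1.0                                       # Step 6.1
    for _ in range(max_reductions):
        trial = [aj + k * dj for aj, dj in zip(a0, d)]
        err = J(trial) - j0 + 0.5 * k * e * e     # Step 6.2
        if err <= 0.0:                            # Step 6.3: accept the step
            return trial, k
        k *= shrink                               # Step 6.3: reduce k
    return list(a0), 0.0                          # no acceptable step found


# Toy functional (hypothetical): J(a) = sum (a_j - 1)^2, gradient 2 (a_j - 1)
J_toy = lambda a: sum((aj - 1.0) ** 2 for aj in a)
a1, k = backtracking_step(J_toy, [0.0, 0.0, 0.0], [-2.0, -2.0, -2.0], 0.5)
```

In the toy run the full step \(k=1\) overshoots and is rejected, while \(k=0.5\) satisfies the test and is accepted.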

We have performed three numerical experiments to check the stability of our iteration algorithm. Since real-world data are not available, we would like to use artificial data to test the stability of the proposed numerical algorithm, that is, the extra condition is obtained by solving the direct problem. In all experiments, we used the basic parameters

$$ r=0.5, \qquad \sigma _{0}=1,\qquad \varepsilon =10^{-4}, \qquad N=10^{-5}. $$

Equation (1.4) is solved by the classical finite difference method, the Crank–Nicolson difference scheme:

$$\begin{aligned}& \frac{W^{n+1}_{j}-W^{n}_{j}}{\tau }-\frac{\sigma _{0}^{2}}{4} \biggl( \frac{W^{n}_{j+1}-2W^{n}_{j}+W^{n}_{j-1}}{h^{2}}+ \frac{W^{n+1}_{j+1}-2W^{n+1}_{j}+W^{n+1}_{j-1}}{h^{2}}- \frac{W^{n}_{j+1}-W^{n}_{j-1}}{2h} \\& \quad {}-\frac{W^{n+1}_{j+1}-W^{n+1}_{j-1}}{2h} \biggr)+\frac{a_{j}}{2} \biggl( \frac{W^{n}_{j+1}-W^{n}_{j-1}}{2h}+ \frac{W^{n+1}_{j+1}-W^{n+1}_{j-1}}{2h} \biggr) +\frac{r}{2} \bigl(W^{n}_{j}+W^{n+1}_{j} \bigr)=0. \end{aligned}$$

The discretized initial and boundary conditions are

$$\begin{aligned}& W^{n}_{0}=W^{n}_{2M}=0, \\& W^{0}_{j}=H\bigl(-(j-M)h\bigr), \quad j=0,1,2,\ldots ,2M. \end{aligned}$$

The difference scheme is absolutely stable, and its truncation error is \(O(\tau ^{2}+h^{2})\).
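To make the scheme concrete, the following sketch advances the discrete solution with dense linear algebra. It is only an illustration under assumptions: the names `cn_matrices` and `crank_nicolson` are hypothetical, the initial profile is passed in as `W0` rather than fixed, and the boundary values are overwritten with zero. Collecting terms, the scheme reads \((I+\frac{\tau }{2}\mathcal{L})W^{n+1}=(I-\frac{\tau }{2}\mathcal{L})W^{n}\) on the interior nodes, where \(\mathcal{L}\) is the discrete spatial operator built in the code.

```python
import numpy as np

def cn_matrices(a, h, tau, r=0.5, sigma0=1.0):
    """Return A, B with A @ W_new = B @ W_old on the interior nodes j = 1..2M-1."""
    m = a.size - 2                                   # number of interior nodes
    off = np.ones(m - 1)
    D2 = (np.diag(off, 1) - 2.0 * np.eye(m) + np.diag(off, -1)) / h**2  # second difference
    D1 = (np.diag(off, 1) - np.diag(off, -1)) / (2.0 * h)               # centred first difference
    # Spatial operator read off from the scheme in the text
    Lop = -0.5 * sigma0**2 * (D2 - D1) + np.diag(a[1:-1]) @ D1 + r * np.eye(m)
    I = np.eye(m)
    return I + 0.5 * tau * Lop, I - 0.5 * tau * Lop

def crank_nicolson(a, W0, h, tau, n_steps, r=0.5, sigma0=1.0):
    """Advance W0 by n_steps time steps with W = 0 at both boundary nodes."""
    A, B = cn_matrices(a, h, tau, r, sigma0)
    W = np.array(W0, dtype=float)                    # copy, so W0 is not modified
    W[0] = W[-1] = 0.0                               # boundary condition W_0 = W_2M = 0
    for _ in range(n_steps):
        W[1:-1] = np.linalg.solve(A, B @ W[1:-1])
    return W
```

A dense solve is adequate for the small grids used here. For finer grids a tridiagonal solver would be the natural replacement.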

Example 1

In the first numerical experiment, we take

$$ L=10 $$

and

$$ a(y)=\textstyle\begin{cases} \sqrt[2]{\frac{y^{2}}{2M}+\frac{1}{2}}, & -L+h\leq y< -L+h(M-6), \\ 0.8, & -L+h(M-6)\leq y< -L+h(M+5), \\ \frac{1}{\sqrt[3]{y}},& -L+h(M+5)\leq y\leq L-h. \end{cases} $$

The spatial and time step sizes are taken as

$$ h=\frac{2}{5}, \qquad \tau =0.01. $$
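For orientation, the grid and the exact drift of Example 1 can be set up as in the sketch below. It assumes the nodes are \(y_{j}=(j-M)h\) with \(M=L/h\), which is consistent with the initial condition above; the function name `exact_drift_example1` is hypothetical. The branches are selected by node index rather than by comparing floating-point values of y, and the two endpoints \(y=\pm L\), where \(a(y)\) is not specified, are left as NaN.

```python
import numpy as np

L, h = 10.0, 2.0 / 5.0
M = round(L / h)                    # M = 25, so the grid has 2M + 1 = 51 nodes
j = np.arange(2 * M + 1)
y = (j - M) * h                     # y_j = -L + j h

def exact_drift_example1(j, y, M):
    """Nodal values of the piecewise drift a(y) of Example 1."""
    a = np.full(y.shape, np.nan)    # endpoints y = -L and y = L stay undefined
    left = (j >= 1) & (j < M - 6)               # -L+h <= y < -L+h(M-6)
    mid = (j >= M - 6) & (j < M + 5)            # -L+h(M-6) <= y < -L+h(M+5)
    right = (j >= M + 5) & (j <= 2 * M - 1)     # -L+h(M+5) <= y <= L-h
    a[left] = np.sqrt(y[left] ** 2 / (2 * M) + 0.5)
    a[mid] = 0.8
    a[right] = 1.0 / np.cbrt(y[right])
    return a

a_exact = exact_drift_example1(j, y, M)
```

With these values the middle branch covers \(-2.4\leq y<2\), and the third branch starts at \(y=2\), where the argument of the cube root is positive.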

The exact solution and the reconstruction results for different observation times (denoted by \(\tau ^{*}\)) are shown in Fig. 1. The initial guess is taken to be 0.5. We can see that the drift coefficient \(a(y)\) is well recovered after 260 iterations. The iterative procedure converges quickly, and the result is satisfactory. However, since \(a(y)\) is a piecewise-defined function, its values near the right boundary are difficult to reconstruct well, and the algorithm converges rather slowly there. Also, since the prices near the strike are the most interesting for practitioners, we investigate the recovered function around \(y=0\) (\(S=K\)), where we use three different observation times \(\tau ^{*}= 0.5\), 0.8, and 1. From the figure we observe that the reconstruction is numerically good when \(\tau ^{*}=1\). For all times \(\tau ^{*}\), the reconstructed \(a(y)\) is near-perfect on the interval \(y\in (0, 2)\), as shown in Fig. 1.

Figure 1. Identified drifts for Example 1

Example 2

In the second numerical experiment, we take

$$ a(y)=\frac{1}{\sqrt[5]{e^{|y|}}}, $$

where L, h, τ are the same as in the first experiment.

The exact solution and the reconstruction results for different iteration numbers are given in Figs. 2 and 3, where the corresponding iteration numbers are 300 and 380, respectively. The initial guess is taken to be 0. Only 380 iterations were needed to achieve a satisfactory result. From the figures we observe that the reconstruction is numerically near-perfect when \(\tau ^{*}=0.5\). The iterative procedure converges quickly, and the reconstructed solution appears very satisfactory. Since the function \(a(y)\) changes drastically around zero, it is quite difficult to guarantee convergence at this point. A function of this form has a larger fluctuation, which corresponds to a drift that changes more violently in the real market. Nevertheless, our algorithm still performs well, and the shape of the cusp is reconstructed very well. In the experiment, we find that the result after 400 iterations is not as good as that after 380 iterations. Since the observation data contain errors arising from the numerical solution of the forward problem, the iteration should be stopped at a suitable point to obtain stable numerical results.

Figure 2. Identified drifts for Example 2 (iteration number 300)

Figure 3. Identified drifts for Example 2 (iteration number 380)

Example 3

In the third numerical experiment, we take

$$ a(y)=e^{\frac{-y}{10}}, $$

where L, h, τ are the same as in the first experiment. The numerical results for exact input data can be seen in Fig. 4.

Figure 4. Noiseless case for Example 3

To investigate the stability of the numerical solution, we employ the following noisy data:

$$ W^{\delta }(y,1)=W(y,1)\bigl[1+\delta \times \operatorname{random}(y) \bigr] $$

with \(\delta =0.001\) and \(\delta =0.01\). The reconstructed results are shown in Figs. 5 and 6. From these two figures we can see that the reconstruction of \(a(y)\) from the noisy data is also satisfactory. As in Fig. 1, for all times \(\tau ^{*}\), the reconstructed \(a(y)\) is near-perfect on the interval \(y\in (0, 2)\).
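The perturbation of the terminal data can be sketched as follows. The distribution of \(\operatorname{random}(y)\) is not specified in the text, so the sketch assumes independent values drawn uniformly from \([-1,1]\) at each node; the function name `add_relative_noise` is hypothetical.

```python
import numpy as np

def add_relative_noise(W, delta, rng):
    """Return W * (1 + delta * random(y)), with random(y) assumed uniform on [-1, 1]."""
    noise = rng.uniform(-1.0, 1.0, size=W.shape)
    return W * (1.0 + delta * noise)

rng = np.random.default_rng(0)                 # fixed seed for reproducibility
W_exact = np.linspace(0.1, 1.0, 51)            # placeholder for W(y, 1) on the grid
W_noisy = add_relative_noise(W_exact, 0.01, rng)
```

Under this assumption the relative error at each node is bounded by δ, so \(\delta =0.001\) and \(\delta =0.01\) correspond to at most 0.1% and 1% pointwise relative noise.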

Figure 5. Identified drifts with noise level 0.001 for Example 3

Figure 6. Identified drifts with noise level 0.01 for Example 3

From Fig. 5 we observe that the reconstruction from data with 0.1% relative random noise is almost identical to the reconstruction from noiseless data shown in Fig. 4. From Fig. 6 we observe that the reconstruction from data with 1% relative random noise deviates noticeably from the noiseless reconstruction and shows an upward trend. Nonetheless, even in this case, the shape of the noiseless reconstruction is maintained, and we think that the size of the deviation depends on the size of the error (i.e., the noise).

From the above results we conclude that our numerical algorithm for reconstructing the drift coefficient is stable for data containing 0.1% and 1% relative random noise.

Remark 6.2

In the numerical examples above, the parameters are taken as \(r=0.5\) and \(\sigma _{0}=1\). However, in the real market, the range of volatility is usually \([0.2,0.8]\), and the risk-free interest rate r hardly reaches 0.5. So, to better reflect the real market, we consider the more suitable parameters \(r=0.09\) and \(\sigma _{0}=0.5\). Moreover, to observe the effect of \(\tau ^{*}\) on the reconstruction process, we also consider different values of \(\tau ^{*}\). The numerical results are shown in Fig. 7. We can see that our algorithm still performs well for different parameters r, \(\sigma _{0}\), and \(\tau ^{*}\), and the simulated drift rate is in good agreement with the real one.

Figure 7. Noiseless case for Example 3, where \(r=0.09\) and \(\sigma _{0}=0.5\)

7 Concluding remarks

In this paper, we discuss an inverse problem of reconstructing the drift rate coefficient of stock index options from market observation data. Since the original model is posed on an unbounded domain and is nonhomogeneous, we use the artificial boundary method and a homogenization technique to transform the original problem into a terminal control problem for a homogeneous initial-boundary value problem on a bounded domain. We rigorously prove the well-posedness of the minimizer of the control problem and give an iterative calculation scheme. Numerical results show that our algorithm is fast and robust.

This paper focuses on the reconstruction of the drift coefficient. To simplify the problem, we assume that the volatility is constant. This assumption usually does not hold in practice. Therefore an interesting question is what conditions need to be imposed to reconstruct the volatility and the drift rate simultaneously, which we leave for future work.