Abstract
In this paper, we combine the operator splitting methodology for abstract evolution equations with that of stochastic methods for large-scale optimization problems. The combination results in a randomized splitting scheme, which in a given time step does not necessarily use all the parts of the split operator. This is in contrast to deterministic splitting schemes which always use every part at least once, and often several times. As a result, the computational cost can be significantly decreased in comparison to such methods. We rigorously define a randomized operator splitting scheme in an abstract setting and provide an error analysis where we prove that the temporal convergence order of the scheme is at least 1/2. We illustrate the theory by numerical experiments on both linear and quasilinear diffusion problems, using a randomized domain decomposition approach. We conclude that choosing the randomization in certain ways may improve the order to 1. This is as accurate as applying e.g. backward (implicit) Euler to the full problem, without splitting.
1 Introduction
The main objective of this paper is to combine two successful strategies from the literature: the first being operator splitting schemes for evolution equations in a general, infinite-dimensional framework and the second being stochastic optimization methods. Operator splitting schemes are an established tool in the field of numerical analysis of evolution equations and have a wide range of applications. Stochastic optimization methods have proven to be efficient at solving large-scale optimization problems, where it is infeasible to evaluate full gradients. They can drastically decrease the computational cost in e.g. machine learning settings. The link between these two seemingly disparate areas is that an iterative method applied to an optimization problem can also be seen as a time-stepping method applied to a gradient flow connected to the optimization problem. In particular, stochastic optimization methods can then be interpreted as randomized operator splitting schemes for such gradient flows. In this context, we introduce a general randomized splitting method that can be applied directly to evolution equations, and provide a rigorous convergence analysis.
Abstract evolution equations of the type \(u'(t) + A(t)u(t) = f(t)\) on \((0,T)\), equipped with an initial value \(u(0) = u_0\),
are an important building block for modeling processes in physics, biology and social sciences. Standard examples which appear in a variety of applications are fluid flow problems, where we model how a flow evolves on a given domain over time, compare [1, 26] and [37, Section 1.3]. The operator A(t) can denote, for example, a non-linear diffusion operator such as the p-Laplacian or a porous medium operator.
Deterministic operator splitting schemes as discussed in more detail in [16] are a powerful tool for this type of equation. An example is given by a domain decomposition scheme, where we split the domain into sub-domains. Instead of solving one expensive problem on the entire domain, we deal with cheaper problems on the sub-domains. This is particularly useful in modern computer architectures, as the sub-problems may often be solved in parallel.
Moreover, evolution equations are tightly connected to unconstrained optimization problems, because the solution of \(\min _u F(u)\) is a stationary point of the gradient flow \(u'(t) = -\nabla F(u(t))\). The latter is an evolution equation on an infinite time horizon with \(A = -\nabla F\) and \(f = 0\). In the large-scale case, such optimization problems benefit from stochastic optimization schemes. The most basic such method, stochastic gradient descent, was first introduced in [32] and has since been extended and generalized in many directions. See, e.g., the review article [3] and the references therein.
Via the gradient flow interpretation, we can see these optimization methods as time-stepping schemes where a randomly chosen sub-problem is considered in each time step. In essence, it is therefore a randomized operator splitting scheme. The difference between the works mentioned above and ours is that we apply these stochastic optimization techniques to solve the evolution equation itself rather than just finding its stationary state.
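To make this correspondence concrete, the following small sketch (our own illustration, not taken from the paper) compares an explicit Euler step for the full gradient flow \(u'(t) = -\nabla F(u(t))\), with \(F = \sum _{\ell } F_{\ell }\), to a stochastic gradient step that uses only one randomly chosen part, rescaled to remain unbiased:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective split into s parts: F(u) = sum_l F_l(u) with
# F_l(u) = 0.5 * u^T Q_l u, so that grad F_l(u) = Q_l u.
s, d = 4, 3
Qs = [np.diag(rng.uniform(0.5, 1.5, d)) for _ in range(s)]

def full_gradient_step(u, h):
    # Explicit Euler for the full gradient flow u' = -grad F(u).
    return u - h * sum(Q @ u for Q in Qs)

def stochastic_gradient_step(u, h):
    # Randomized splitting: use only one randomly chosen part,
    # rescaled by 1/probability = s so the step is unbiased.
    l = rng.integers(s)
    return u - h * s * (Qs[l] @ u)

u_det = np.ones(d)
u_sto = np.ones(d)
for _ in range(200):
    u_det = full_gradient_step(u_det, 0.01)
    u_sto = stochastic_gradient_step(u_sto, 0.01)

# Both iterations approach the stationary point u = 0 of the flow.
print(np.linalg.norm(u_det), np.linalg.norm(u_sto))
```

Both iterates approach the stationary point \(u = 0\); the stochastic variant touches only one part of the split operator per step, which is exactly the cost advantage exploited in this paper.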
We consider nonlinear evolution equations in an abstract framework similar to [7, 10, 11], where operators of a monotone type have been studied. Deterministic splitting schemes for such equations have been considered in e.g. [14, 15, 17, 29]. The kind of splitting scheme most closely related to our work, domain decomposition methods, has been studied in [6, 7, 13, 30, 31]. In this paper, we extend this framework of deterministic splitting schemes to a setting of randomized methods.
Outside of the context of optimization, other kinds of randomized methods have already proved themselves to be useful for solving evolution equations. Starting with [34, 35], explicit schemes for ordinary differential equations have been randomized. This approach has been further extended in [2, 4, 18, 22, 24]. In [8], it has been extended both to implicit methods and to partial differential equations, and in [23] to finite element approximations. While these works considered certain randomizations in their schemes, they are conceptually different from our approach. Their main idea is to approximate any appearing integrals through
where \(\xi _n\) is a random variable that takes on values in \([t_{n-1}, t_n]\). This ansatz coincides with a Monte Carlo integration idea. In this paper, we use a different approach where we decompose the operator in a randomized fashion. More precisely, we approximate data
by
where the batch \(B \subset \{1,\dots ,s\}\) is chosen randomly. The stochastic approximations \(f_B\) and \(A_B\) of the original data f and A are cheaper to evaluate in applications. This is less related to Monte Carlo integration and more similar to stochastic optimization methods, compare [3, 9]. Similar ideas have been considered in [19, 20, 28], where a random batch method for interacting particle systems has been studied. Moreover, very recently and during the preparation of this work, a similar approach has also been applied to the optimal control of linear time-invariant (LTI) dynamical systems in [38]. While the convergence rate provided there is essentially the same as what we establish in our main result Theorem 5.2, our setting is more general and allows for nonlinear operators on infinite dimensional spaces rather than finite dimensional matrices. We also consider the error of the time-stepping method that is used to approximate the solution to \(u'(t) + A_B(t)u(t) = f_B(t)\), while the error bounds in [38] assume that this evolution equation is solved exactly.
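The distinction between the two randomizations can be illustrated with a small sketch (our own, with hypothetical data parts \(f_{\ell }\)): the Monte Carlo approach draws a random time point inside the step, while the approach of this paper draws a random batch of parts of the data and rescales by the inclusion probability:

```python
import numpy as np

rng = np.random.default_rng(1)

# Data given as a sum of s parts: f(t) = sum_l f_l(t).
s = 3
fs = [lambda t, l=l: np.cos((l + 1) * t) for l in range(s)]
f = lambda t: sum(fl(t) for fl in fs)

t0, t1 = 0.0, 0.1
h = t1 - t0

# (a) Monte Carlo randomization of the time integral:
xi = rng.uniform(t0, t1)
mc_approx = h * f(xi)        # E[h f(xi)] = integral of f over [t0, t1]

# (b) Randomized decomposition of the data (this paper's approach):
B = rng.choice(s, size=2, replace=False)   # random batch of parts
tau = 2 / s                                # P(l in B) for uniform batches
f_B = lambda t: sum(fs[l](t) for l in B) / tau
batch_approx = h * f_B(t1)   # E[f_B(t)] = f(t) pointwise in t

print(mc_approx, batch_approx)
```

Both quantities approximate \(\int _{t_0}^{t_1} f(t) \, \mathrm{d}t\) in expectation, but only the second one replaces the data itself by a cheaper, unbiased surrogate.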
This paper is organized as follows. In Sect. 2, we begin by explaining our abstract framework. This includes both the precise assumptions that we make and the definition of our time-stepping scheme. We give a more concrete application of the abstract framework in Sect. 3. With the setting fixed, we first prove in Sect. 4 that the scheme and its solution are indeed well-defined. We prove the convergence of the scheme in expectation in Sect. 5. These theoretical convergence results are illustrated by numerical experiments on two-dimensional linear and quasilinear diffusion problems in Sect. 6. Finally, we collect some more technical auxiliary results in Appendix A.
2 Setting
In the following, we introduce a theoretical framework for the randomized operator splitting. This setting is similar to the one in [7].
Assumption 2.1
Let \((H,( \cdot , \cdot )_{H},\Vert \cdot \Vert _H)\) be a real, separable Hilbert space and let \((V, \Vert \cdot \Vert _V)\) be a real, separable, reflexive Banach space, which is continuously and densely embedded into H. Moreover, there exists a semi-norm \(|\cdot |_V\) on V and a \(C_V \in (0,\infty )\) such that \(|\cdot |_V \le C_V \Vert \cdot \Vert _V\).
Denoting the dual space of V by \(V^*\) and identifying the Hilbert space H with its dual space, the spaces from Assumption 2.1 form a Gelfand triple and fulfill, in particular, \(V {\mathop {\hookrightarrow }\limits ^{d}} H \cong H^*{\mathop {\hookrightarrow }\limits ^{d}} V^*\) with continuous and dense embeddings.
Assumption 2.2
Let the spaces H and V be given as stated in Assumption 2.1. Furthermore, for \(T \in (0,\infty )\) as well as \(p \in [2,\infty )\), let \(\{A(t)\}_{t \in [0,T]}\) be a family of operators \(A(t) :V \rightarrow V^*\) that satisfy the following conditions:
-
(i)
The mapping \(Av :[0,T] \rightarrow V^*\) given by \(t \mapsto A(t)v\) is continuous almost everywhere in (0, T) for all \(v \in V\).
-
(ii)
The operator \(A(t) :V \rightarrow V^*\), \(t \in [0,T]\), is radially continuous, i.e., the mapping \(s \mapsto \langle A(t)(v+s w), w \rangle _{V^*\times V}\) is continuous on [0, 1] for all \(v,w \in V\).
-
(iii)
There exist \(\kappa _A \in [0,\infty )\) and \(\eta _A \in [0,\infty )\), which do not depend on t, such that the operator \(A(t) + \kappa _A I :V \rightarrow V^*\), \(t \in [0,T]\), fulfills the monotonicity-type condition
$$\begin{aligned} \langle A(t)v - A(t)w, v - w \rangle _{V^*\times V} + \kappa _A \Vert v - w\Vert _H^2 \ge \eta _A |v - w |_V^p \end{aligned}$$for all \(v,w \in V\).
-
(iv)
The operator \(A(t) :V \rightarrow V^*\), \(t \in [0,T]\), is uniformly bounded such that there exists \(\beta _A \in [0,\infty )\), which does not depend on t, with
$$\begin{aligned} \Vert A(t) v \Vert _{V^*} \le \beta _A \big (1 + \Vert v\Vert _V^{p-1}\big ) \end{aligned}$$for all \(v \in V\).
Assumption 2.3
The function f is an element of the Bochner space \(L^2(0,T;H)\), and the initial value \(u_0 \in H\), where H is the Hilbert space from Assumption 2.1.
Remark 1
We note that Assumption 2.2 (iii) implies that the operator \(A(t) + \kappa _A I :V \rightarrow V^*\), \(t \in [0,T]\), fulfills a uniform semi-coercivity condition. That is, there exist constants \(\mu _A, \lambda _A \in [0,\infty )\), which do not depend on t, such that
for all \(v \in V\). This follows by taking \(w = 0\) in (iii), since then
and by the Cauchy-Schwarz inequality and the weighted Young’s inequality (Lemma A.2),
with \(\frac{1}{p} + \frac{1}{q} = 1\) and \(\varepsilon > 0\). Since \(|v |_V \le C_V\Vert v\Vert _V\), we can absorb the second term and take \(\lambda _A = \varepsilon ^{-\frac{q}{p}} q^{-1} \Vert A(t)0\Vert _{V^*}^q\) and \(\mu _A = \eta _A - \varepsilon \) after choosing an \(\varepsilon \) such that \(\mu _A \ge 0\). This also shows that the constants \(\lambda _A\) and \(\mu _A\) are not unique. We can, e.g., increase the coercivity constant at the cost of a larger constant term \(\lambda _A\). Both these terms enter into our error bounds, which can thus be tuned slightly.
In the case that \(A(t)0 = 0\), the constant term disappears and we have \(\mu _A = \eta _A\). If \(A(t)0 \ne 0\), one could recover this situation by the transformation \((A, f) \rightarrow (\tilde{A}, \tilde{f})\) with \(\tilde{A}(t)u = A(t)u - A(t)0\), \(\tilde{f}(t) = f(t) - A(t)0\). But in the case that \(A(t)0 \in V^* {\setminus } H\), this can cause issues since we require that \(f(t) \in H\). Moreover, it might lead to difficulties in solving the nonlinear equations of the form \((I + h_n\tilde{A}(t_n)) u^{n} = u^{n-1} + h_n\tilde{f}(t_n)\). We therefore do not apply such a transformation in this paper.
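The weighted Young inequality from Lemma A.2 enters above through the constant \(\varepsilon ^{-q/p} q^{-1}\); assuming it takes the common form \(ab \le \frac{\varepsilon }{p} a^p + \frac{\varepsilon ^{-q/p}}{q} b^q\) for conjugate exponents \(\frac{1}{p} + \frac{1}{q} = 1\) (our assumption about the statement of Lemma A.2), a quick numerical sanity check reads:

```python
import itertools

# Weighted Young inequality (assumed form of the paper's Lemma A.2):
# for a, b >= 0, eps > 0 and conjugate exponents 1/p + 1/q = 1,
#     a * b <= (eps / p) * a**p + (eps**(-q / p) / q) * b**q.
def young_rhs(a, b, p, eps):
    q = p / (p - 1)
    return (eps / p) * a**p + (eps**(-q / p) / q) * b**q

checks = []
for a, b, p, eps in itertools.product([0.0, 0.5, 2.0, 7.0],
                                      [0.1, 1.0, 3.0],
                                      [2.0, 3.0, 5.0],
                                      [0.25, 1.0, 4.0]):
    checks.append(a * b <= young_rhs(a, b, p, eps) + 1e-12)

print(all(checks))   # True
```

The inequality follows from the classical Young inequality \(xy \le \frac{x^p}{p} + \frac{y^q}{q}\) with \(x = \varepsilon ^{1/p} a\) and \(y = \varepsilon ^{-1/p} b\).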
Assumptions 2.1–2.3 are requirements on the problem that we want to solve. The following Assumptions 2.4–2.5 are needed to state the approximation scheme for the given problem.
Assumption 2.4
Let \((\Omega , \mathcal {F}, \mathcal {P})\) be a complete probability space and let \(\{\xi _n\}_{n \in \mathbb {N}}\) be a family of mutually independent random variables. Further, let the filtration \(\{\mathcal {F}_n\}_{n \in \mathbb {N}}\) be given by
where \(\sigma \) denotes the generated \(\sigma \)-algebra.
In the following, we denote the expectation with respect to the probability distribution of \(\xi \) for a random variable X in the Bochner space \(L^1(\Omega ; H)\) by \(\mathbb {E}_{\xi }[X]\). Moreover, we abbreviate the total expectation by
We denote the space of Hölder continuous functions on [0, T] with Hölder exponent \(\gamma \in (0,1)\) and values in H by \(C^{\gamma }([0,T];H)\). For notational convenience, we include the case \(\gamma = 1\) and denote the space of Lipschitz continuous functions by \(C^{1}([0,T];H)\).
Assumption 2.5
Let Assumptions 2.1–2.4 be fulfilled. Assume that for almost every \(\omega \in \Omega \) there exists a real Banach space \(V_{\xi (\omega )}\) such that \(V {\mathop {\hookrightarrow }\limits ^{d}} V_{\xi (\omega )} {\mathop {\hookrightarrow }\limits ^{d}} H\), \(\bigcap _{\omega \in \Omega } V_{\xi (\omega )} = V\) and there exists a semi-norm \(|\cdot |_{V_{\xi (\omega )}}\) on \(V_{\xi (\omega )}\) and a \(C_{V_{\xi (\omega )}} \in (0,\infty )\) such that \(|\cdot |_{V_{\xi (\omega )}} \le C_{V_{\xi (\omega )}} \Vert \cdot \Vert _{V_{\xi (\omega )}}\). Moreover, the mapping \(\omega \mapsto V_{\xi (\omega )}\) is measurable in the sense that for every \(v \in H\) the set \(\{ \omega \in \Omega : v \in V_{\xi (\omega )}\}\) is an element of the complete generated \(\sigma \)-algebra
Further, let the family of operators \(\{A_{\xi (\omega )}(t)\}_{\omega \in \Omega , t \in [0,T]}\) be such that for almost every \(\omega \in \Omega \), \(\{A_{\xi (\omega )}(t)\}_{t \in [0,T]}\) fulfills Assumption 2.2 with the spaces \(V_{\xi (\omega )}\), H and \(V_{\xi (\omega )}^*\) and corresponding constants \(\kappa _{\xi (\omega )}\), \(\eta _{\xi (\omega )}\), \(\beta _{\xi (\omega )}\). These give rise to the semi-coercivity constants \(\mu _{\xi (\omega )}\) and \(\lambda _{\xi (\omega )}\) as in Remark 1. Moreover, the mapping \(A_{\xi }(t) v :\Omega \rightarrow V^*\) is \(\mathcal {F}_{\xi }\)-measurable and the equality \(\mathbb {E}_{\xi } [ A_{\xi }(t) v ] = A(t) v\) is fulfilled in \(V^*\) for \(v \in V\). The mappings \(\kappa _{\xi }, \eta _{\xi }, \mu _{\xi }, \beta _{\xi }, \lambda _{\xi } :\Omega \rightarrow [0,\infty )\) are measurable and there exist \(\kappa , \lambda \in [0,\infty )\) which fulfill \(\kappa _{\xi } \le \kappa \) almost surely and \(\mathbb {E}_{\xi } \big [\lambda _{\xi } \big ] \le \lambda \).
Further, let the family \(\{f_{\xi (\omega )}\}_{\omega \in \Omega }\) be given such that \(f_{\xi (\omega )} \in L^2(0,T; H)\). Moreover, the mapping \(f_{\xi }(t) :\Omega \rightarrow H\) is \(\mathcal {F}_{\xi }\)-measurable and \(\mathbb {E}_{\xi } [ f_{\xi }(t) ] = f(t)\) is fulfilled in H for almost all \(t \in (0,T)\).
Under the setting explained in the above assumptions, we consider the initial value problem \(u'(t) + A(t) u(t) = f(t)\) for \(t \in (0,T)\) with \(u(0) = u_0\).
For a non-uniform temporal grid \(0 = t_0<t_1< \dots < t_N = T\), a step size \(h_n = t_n - t_{n-1}\), \(h = \max _{n \in \{1,\dots ,N\}} h_n\), and a family of random variables \(\{f^n\}_{n \in \{1,\dots ,N\}}\) such that \(f^n:\Omega \rightarrow H\) is \(\mathcal {F}_{\xi _n}\)-measurable, we consider the scheme \(\big (I + h_n A_{\xi _n}(t_n)\big ) U^n = U^{n-1} + h_n f^n\) with \(U^0 = u_0\).
Note that \(U^n :\Omega \rightarrow H\) is a random variable and therefore some statements involving it below only hold almost surely. Whenever there is no risk of misinterpretation, we omit writing almost surely for the sake of brevity.
When proving that the scheme is well-defined and establishing an a priori bound, it is sufficient to assume that \(\{f_{\xi _n}\}_{n \in \{1,\dots ,N\}}\) are integrable with respect to the temporal parameter. In that case, we can choose for example \(f^n = \frac{1}{h_n} \int _{t_{n-1}}^{t_n} f_{\xi _n}(t) \, \mathrm{d}t\).
When considering our error bounds, we assume more regularity for the functions \(\{f_{\xi _n}\}_{n \in \{1,\dots ,N\}}\) and demand continuity with respect to the temporal parameter. In this case, we may also use \(f^n = f_{\xi _n}(t_n)\).
We will focus on this second choice for the error bounds in Sect. 5.
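A finite-dimensional sketch may clarify the mechanics of the randomized scheme. The following toy computation (our own illustration with hypothetical matrix data, assuming single-part batches chosen uniformly, so that \(1/\tau _{\ell } = s\)) performs backward Euler steps in which a single randomly chosen, rescaled part of a split matrix replaces the full operator:

```python
import numpy as np

rng = np.random.default_rng(2)

# Linear toy problem u' + A u = f with A split as A = A_1 + ... + A_s.
d, s, T, N = 4, 3, 1.0, 200
h = T / N
parts = [np.diag(rng.uniform(0.2, 1.0, d)) for _ in range(s)]
A = sum(parts)
f = np.ones(d)
u0 = rng.standard_normal(d)

# Reference: backward Euler on the full problem,
# (I + h A) u^n = u^{n-1} + h f.
u_ref = u0.copy()
for _ in range(N):
    u_ref = np.linalg.solve(np.eye(d) + h * A, u_ref + h * f)

# Randomized splitting: in each step, replace A by one random part
# rescaled by 1/tau_l = s, so that E[A_xi] = A (f is left unsplit here).
U = u0.copy()
for _ in range(N):
    l = rng.integers(s)
    A_xi = s * parts[l]
    U = np.linalg.solve(np.eye(d) + h * A_xi, U + h * f)

print(np.linalg.norm(U - u_ref))
```

For this toy problem, the randomized iterates stay close to the fully implicit reference solution, in line with the convergence in expectation proved in Sect. 5; each randomized step only requires the (cheaper) part of the operator that was drawn.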
3 Application: Domain decomposition
One main application allowed by our abstract framework is a domain decomposition scheme for a nonlinear fluid flow problem. Domain decomposition schemes are well-known for deterministic operator splittings. However, to the best of our knowledge, they have not been studied in the context of a randomized operator splitting scheme.
3.1 Deterministic domain decomposition
To exemplify our abstract Eq. (1), we consider a (nonlinear) parabolic differential equation. In the following, let \(\mathcal {D}\subset \mathbb {R}^d\), \(d \in \mathbb {N}\), be a bounded domain with a Lipschitz boundary \(\partial \mathcal {D}\). For \(p \in [2, \infty )\), we consider the parabolic p-Laplacian with homogeneous Dirichlet boundary conditions
for \(\alpha :[0,T] \rightarrow \mathbb {R}\) and \(u_0 :\mathcal {D}\rightarrow \mathbb {R}\). The notation \(\tilde{f}\) is used to differentiate between the function \(\tilde{f} :(0,T) \times \mathcal {D}\rightarrow \mathbb {R}\) and the abstract function f on (0, T) that it gives rise to through \([f(t)](x) = \tilde{f}(t,x)\). We consider a domain decomposition scheme similar to [13] for \(p = 2\) and to [6, 7] for \(p \in [2,\infty )\). For the sake of completeness, we recapitulate the setting here, albeit with a different boundary condition.
For \(s \in \mathbb {N}\), let \(\{ \mathcal {D}_{\ell } \}_{\ell =1}^{s}\) be a family of overlapping subsets of \(\mathcal {D}\). Let each subset have a Lipschitz boundary and let the union of them fulfill \(\bigcup _{\ell =1}^s \mathcal {D}_{\ell } = \mathcal {D}\). On the sub-domains \(\{ \mathcal {D}_{\ell } \}_{\ell =1}^{s}\), let the partition of unity \(\{\chi _{\ell } \}_{\ell =1}^{s}\subset W^{1,\infty }(\mathcal {D})\) be given such that the following criteria are fulfilled
for \(\ell \in \{1,\dots ,s\}\). With the help of the functions \(\{\chi _{\ell }\}_{\ell \in \{1,\dots ,s\}}\), it is now possible to introduce suitable functional spaces \(\{V_{\ell }\}_{\ell \in \{1,\dots ,s\}}\). We use the weighted Lebesgue space \(L^p(\mathcal {D}_{\ell },\chi _{\ell })^d\) that consists of all measurable functions \(v = (v_1,\dots ,v_d) :\mathcal {D}_{\ell } \rightarrow \mathbb {R}^d\) such that
is finite. In the following, let the pivot space \(\left( H, ( \cdot , \cdot )_{H}, \Vert \cdot \Vert _H \right) \) be the space \(L^2(\mathcal {D})\) of square integrable functions on \(\mathcal {D}\) with the usual norm and inner product. The spaces V and \(V_{\ell }\), \(\ell \in \{1,\dots ,s\}\), are given by
with respect to the norms
and semi-norms
Note that a bootstrap argument involving the Sobolev embedding theorem shows that the norm given in (6) is equivalent to the standard norm in the space. We can now introduce the operators \(A(t) :V \rightarrow V^*\), \(A_{\ell }(t) :V_{\ell } \rightarrow V^*_{\ell }\), \(\ell \in \{ 1,\dots ,s\}\), \(t\in [0,T]\), given by
Similarly, we define the right-hand sides \(f_{\ell } :[0,T] \rightarrow H\), \(\ell \in \{1,\dots ,s\}\), where \(f_{\ell }(t) = \chi _{\ell } f(t)\) in H for almost every \(t \in (0,T)\).
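As a concrete illustration (our own, not taken from the paper), a one-dimensional analogue of this construction with two overlapping subdomains and piecewise linear weight functions can be sketched as follows:

```python
import numpy as np

# 1D domain D = (0, 1) covered by two overlapping subdomains
# D_1 = (0, 0.6) and D_2 = (0.4, 1) with overlap (0.4, 0.6).
x = np.linspace(0.0, 1.0, 101)

# Piecewise linear weights chi_l: equal to 1 away from the overlap
# and decaying linearly to 0 across it.
chi1 = np.clip((0.6 - x) / 0.2, 0.0, 1.0)
chi2 = np.clip((x - 0.4) / 0.2, 0.0, 1.0)

# Partition of unity on D: chi_1 + chi_2 = 1, each chi_l vanishing
# outside its own subdomain.
assert np.allclose(chi1 + chi2, 1.0)
assert np.all(chi1[x >= 0.6] == 0.0) and np.all(chi2[x <= 0.4] == 0.0)

# Weighted L^p-type quantity on D_1 for a sample function v,
# using a simple quadrature (grid mean times |D| = 1).
p = 2
v = np.sin(np.pi * x)
weighted = np.mean(chi1 * np.abs(v) ** p) ** (1 / p)
print(weighted)
```

The weight \(\chi _{\ell }\) damps contributions near the inner boundary of the subdomain, which is exactly what the weighted Lebesgue spaces \(L^p(\mathcal {D}_{\ell },\chi _{\ell })^d\) encode.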
Lemma 3.1
Let the parameters of Eq. (5) be given such that \(\alpha \in C([0,T];\mathbb {R})\), \(u_0 \in L^2(\mathcal {D})\) and \(\tilde{f} \in L^2((0,T) \times \mathcal {D})\). Then the setting described above fulfills Assumptions 2.1–2.3.
Let the partition of unity \(\{\chi _{\ell } \}_{\ell =1}^{s}\subset W^{1,\infty }(\mathcal {D})\) fulfill that for every function \(\chi _{\ell }\) there exists \(\varepsilon _0 \in (0,\infty )\) such that \(\mathcal {D}_{\ell }^{\varepsilon } = \{ x\in \mathcal {D}_{\ell }: \chi _{\ell }(x) \ge \varepsilon \}\) is a Lipschitz domain for all \(\varepsilon \in (0,\varepsilon _0)\). Then V and \(V_{\ell }\), \(\ell \in \{1,\dots ,s\}\), are reflexive Banach spaces and \(V = \bigcap _{\ell = 1}^s V_{\ell }\). Further, the family of operators \(\{A_{\ell }(t)\}_{t \in [0,T]}\), \(\ell \in \{1,\dots ,s\}\), fulfills Assumption 2.2 with the spaces \(V_{\ell }\), H and \(V_{\ell }^*\). Moreover, \(\sum _{\ell = 1}^{s} A_{\ell }(t) v = A(t) v\) is fulfilled in \(V^*\) for \(v \in V\) and almost every \(t \in (0,T)\), with corresponding constants \(\kappa _A = \kappa _{\ell } = \lambda _A = \lambda _{\ell } = 0\), \(\mu _A = \mu _{\ell } = \eta _A = \eta _{\ell } = 1\).
Finally, the family \(\{f_{\ell }\}_{\ell \in \{1,\dots ,s\}}\) fulfills \(f_{\ell } \in L^2(0,T; H)\) and \(\sum _{\ell = 1}^{s} f_{\ell }(t) = f(t)\) in H for almost all \(t \in (0,T)\).
Proof
The space \(H = L^2(\mathcal {D})\) is a real, separable Hilbert space, while \(V = W_0^{1,p}(\mathcal {D})\) is a real, separable Banach space that is densely embedded into H. Thus, they fulfill Assumption 2.1. Analogously to [6, Lemma 3], the spaces V and \(V_{\ell }\), \(\ell \in \{1,\dots ,s\}\), are reflexive Banach spaces and since \(C_0^{\infty }(\mathcal {D})\) is dense in H and \(C_0^{\infty }(\mathcal {D}) \subseteq V \subset V_{\ell }\) it follows that V and \(V_{\ell }\) are dense in H. It remains to prove that \(\bigcap _{\ell = 1}^s V_{\ell } = V\) is fulfilled. First, we notice that \(\Vert w\Vert _{L^p( \mathcal {D}_{\ell },\chi _{\ell })^d} \le \Vert w\Vert _{L^p(\mathcal {D})^d}\) for every \(w \in L^p(\mathcal {D})^d\). Thus, it follows that \(V \subseteq V_{\ell }\) for every \(\ell \in \{1,\dots ,s\}\) and in particular \(V \subseteq \bigcap _{\ell = 1}^s V_{\ell }\). The other inclusion \(\bigcap _{\ell = 1}^s V_{\ell } \subseteq V\) requires more attention. For \(\varepsilon \in (0,\infty )\), we introduce the set \(\mathcal {D}_{\ell }^{\varepsilon } = \{ x \in \mathcal {D}: \chi _{\ell }(x) \ge \varepsilon \}\). By assumption the sets \(\mathcal {D}_{\ell }^{\varepsilon }\) have Lipschitz boundary for \(\varepsilon \) small enough. We consider the spaces of restricted functions
If a weight function \(\chi _{\ell }\) fulfills \(0< \varepsilon< \chi _{\ell } \le 1 <\infty \) on the whole domain \(\mathcal {D}\), it follows that the weighted Lebesgue space \(L^p(\mathcal {D}_{\ell }^{\varepsilon },\chi _{\ell })^d\) coincides with the space \(L^p(\mathcal {D}_{\ell }^{\varepsilon })^d\) (see, e.g., [25, Chapter 3]). Thus, we obtain \(V_{\ell }^{\varepsilon }= W^{1,p}(\mathcal {D}_{\ell }^{\varepsilon })\). The continuity of the trace operator (see, e.g., [27, Theorem 15.23]), implies that
This shows that \(u \in V_{\ell }\) is zero on \(\partial \mathcal {D}_{\ell }^{\varepsilon } \cap \partial \mathcal {D}\) for every \(\varepsilon \in (0,\infty )\) small enough. As \(\varepsilon \) can be chosen arbitrarily small, it follows that \(u \in V_{\ell }\) fulfills \(u\vert _{\partial \mathcal {D}\cap \partial \mathcal {D}_{\ell }} = 0\). In combination with [6, Lemma 1], we obtain that \(\bigcap _{\ell = 1}^{s} V_{\ell } = W^{1,p}_0(\mathcal {D}) = V\).
Similarly to the argumentation in [6, Lemma 4], it follows that the families of operators \(\{A(t)\}_{t \in [0,T]}\) and \(\{A_{\ell }(t)\}_{t \in [0,T]}\), \(\ell \in \{1,\dots ,s\}\), fulfill Assumption 2.2 with respect to the corresponding spaces with \(\kappa _A = \kappa _{\ell } = \lambda _A = \lambda _{\ell } = 0\), \(\mu _A = \mu _{\ell } = \eta _A = \eta _{\ell } = 1\).
Assumption 2.3 is fulfilled as \(\tilde{f} \in L^2((0,T) \times \mathcal {D})\) means that the abstract function f belongs to \(L^2(0,T;L^2(\mathcal {D}))\). Thus, as \(\chi _{\ell } \in W^{1,\infty }(\mathcal {D})\), it follows that \(f_{\ell } = \chi _{\ell } f \in L^2(0,T;H)\) and \(\sum _{\ell = 1}^{s} f_{\ell }(t) = f(t)\) in H for almost every \(t \in (0,T)\). \(\square \)
3.2 Randomized scheme
For a randomized splitting in combination with a domain decomposition, different approaches can be applied. One possibility is to choose a random support of the weight functions \(\{\chi _{\ell }\}_{\ell \in \{1,\dots ,s\}}\). This could possibly be done efficiently using priority queue techniques similar to those in [36]. In this paper, we instead fix the weight functions, but choose a random part of the operator in every time step. For the operator \(A(t) = \sum _{\ell = 1}^{s} A_{\ell }(t)\) and a right-hand side \(f(t) = \sum _{\ell = 1}^{s} f_{\ell }(t)\), we introduce a random variable \(\xi :\Omega \rightarrow 2^{\{1, \dots , s\}}\) such that \([A_{\xi }(t)](\omega ) = \sum _{\ell \in \xi (\omega )} A_{\ell }(t) / \tau _{\ell }\) and \([f_{\xi }(t)](\omega ) = \sum _{\ell \in \xi (\omega )} f_{\ell }(t) / \tau _{\ell }\) with \(\tau _{\ell } = \mathcal {P}\big ( \{ \omega \in \Omega : \ell \in \xi (\omega ) \} \big )\).
The value \(\tau _{\ell }\) is the proper scaling factor which ensures that \(\mathbb {E}_{\xi } [A_{\xi }(t)] = A(t)\) and \(\mathbb {E}_{\xi } [f_{\xi }(t)] = f(t)\). We tacitly assume that \(\tau _{\ell } > 0\), because otherwise we would be in a situation where at least one \(A_{\ell }(t)\) is never chosen. Such a strategy would obviously not work. We set \(V_{\xi (\omega )} = \bigcap _{\ell \in \xi (\omega )} V_{\ell }\).
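The role of \(\tau _{\ell }\) can be checked by enumerating a concrete batch distribution; the following sketch (our own) uses all subsets of size two chosen uniformly, for which \(\tau _{\ell } = 2/s\):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)

# Operator parts as matrices; A = sum of the parts.
s, d = 4, 3
parts = [rng.standard_normal((d, d)) for _ in range(s)]
A = sum(parts)

# Batch distribution: all subsets of size 2, chosen uniformly.
batches = list(combinations(range(s), 2))
prob = 1.0 / len(batches)

# tau_l = P(l in xi); for uniform size-2 batches this equals 2/s.
tau = [sum(prob for B in batches if l in B) for l in range(s)]

# Unbiasedness: E[A_xi] = sum_B P(B) sum_{l in B} A_l / tau_l = A.
EA = sum(prob * sum(parts[l] / tau[l] for l in B) for B in batches)
print(np.allclose(EA, A))   # True
```

The same enumeration applies verbatim to the right-hand side parts \(f_{\ell }\).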
Lemma 3.2
Let \(\{\xi _n\}_{n \in \{1,\dots ,N\}}\) fulfill Assumption 2.4 such that \(\xi _n :\Omega \rightarrow 2^{\{1,\dots ,s\}}\) and \(\xi _n^{-1}(B) \in \mathcal {F}_{\xi _n}\) for all \(B \subset 2^{\{1,\dots ,s\}}\) and \(n \in \{1,\dots ,N\}\). Under the setting above, Assumption 2.5 is fulfilled.
Proof
In the following proof, we drop the index n to keep the notation simpler. The embedding and norm properties are fulfilled as verified in the previous lemma. It remains to verify the measurability condition. We need to verify that for every \(v \in H\), the set \(\{\omega \in \Omega : v \in V_{\xi (\omega )}\} \in \mathcal {F}_{\xi } = \sigma \big ( \sigma (\xi ) \cup \sigma (\mathcal {N} \in \mathcal {F}: \mathcal {P}(\mathcal {N}) = 0)\big )\). For fixed \(v \in H\), we set \(B_v = \{\ell \in \{1,\dots , s\}: v \in V_{\ell }\} \in 2^{\{1,\dots ,s\}}\). Then it follows that
Moreover, we need to verify that the mapping \(\omega \mapsto A_{\xi (\omega )}(t)v\) is measurable for every \(v \in H\). This can be seen from the decomposition \(A_{\xi }(t)v = S_{A(t)v} \circ \xi \), where \(S_{A(t)v} :2^{\{1,\dots ,s\}} \rightarrow V^*\) is given through \(S_{A(t)v} (B) = \sum _{\ell \in B} A_{\ell }(t)v\). As \(\xi ^{-1}(B) \in \mathcal {F}_{\xi }\) for all \(B \subset 2^{\{1,\dots ,s\}}\) and \(S_{A(t)v}^{-1}(X) \subset 2^{\{1,\dots ,s\}}\) for any open set \(X \subset V^*\), the mapping \(\omega \mapsto A_{\xi (\omega )}(t)v\) is measurable. Analogously, it can be proved that the mapping \(\omega \mapsto f_{\xi (\omega )}(t)\) is measurable. In Lemma 3.1, we already verified that an operator \(A_{\xi (\omega )}\) fulfills the conditions from Assumption 2.2. Thus, it only remains to prove the expectation property from Assumption 2.5. This is fulfilled as
holds true for \(v \in V\) and for almost every \(t \in [0,T]\). The same algebraic manipulation in H instead of \(V^*\) shows that \(\mathbb {E}_{\xi } [f_{\xi }(t)] = f(t)\). \(\square \)
4 Solution is well-defined
In the coming section, we show that our scheme (2) is well-defined. First of all, this means that the scheme possesses a unique solution. Moreover, while Eq. (1) itself is purely deterministic, the numerical scheme is randomized, so the solution \(U^n\) of (2) is a mapping of the type \(U^n :\Omega \rightarrow H\). Thus, we also need to make sure that it is a measurable function. These facts are verified in Lemma 4.1. Moreover, we provide an integrability result in the form of an a priori bound in Lemma 4.2.
Lemma 4.1
Let Assumptions 2.1–2.5 be fulfilled. Further, let the random variables \(f^n :\Omega \rightarrow H\) be given such that they are \(\mathcal {F}_{\xi _n}\)-measurable for every \(n \in \{1,\dots , N\}\). Then for \(\kappa h_n \le \kappa h < 1\) there exists a unique \(\mathcal {F}_n\)-measurable function \(U^n :\Omega \rightarrow H\) such that \(U^n(\omega ) \in V_{\xi _n(\omega )}\) and (2) is fulfilled for every \(n \in \{1,\dots ,N\}\).
Proof
For \(\omega \in \Omega \), we find that the operator \(I + h_n A_{\xi _n(\omega ) }(t_n) :V_{\xi _n(\omega )}\rightarrow V_{\xi _n(\omega )}^*\) is monotone, radially continuous and coercive. Thus, it is surjective, compare [33, Theorem 2.18]. Moreover, for \(U_1, U_2 \in V_{\xi _n(\omega )}\) with \(\big (I + h_n A_{\xi _n(\omega ) }(t_n) \big )U_1 = \big (I + h_n A_{\xi _n(\omega ) }(t_n) \big )U_2\), it follows that
Thus, it follows that \(\Vert U_1 - U_2 \Vert _H = 0\) and \(I + h_n A_{\xi _n(\omega ) }(t_n) \) is injective for \(\kappa h_n < 1\) and, in particular, bijective.
It remains to verify that \(U^n :\Omega \rightarrow H\) is well-defined. We define the auxiliary function \(g :\Omega \times H \rightarrow V^*\) such that
where \(e \in V^*\) with \(\Vert e\Vert _{V^*} = 1\). In the following, we want to apply Lemma A.3 to the function g to prove that \(U^n\) is measurable. Applying [33, Lemma 2.16], it follows that for fixed \(\omega \in \Omega \), the function \(v \mapsto \langle g(\omega , v), w \rangle _{V^*\times V}\) is continuous for all \(v, w \in V_{\xi _n(\omega )}\). It remains to verify that for fixed \(v \in H\) and \(w \in V\), the function \(\omega \mapsto \langle g(\omega , v), w \rangle _{V^*\times V}\) is measurable. Let B be an open set in \(V^*\). It then follows that
As the function \(\omega \mapsto h_n f^n (\omega ) + U^{n-1} - \big (I + h_n A_{\xi _n(\omega ) }(t_n) \big )v\) is measurable, it follows that \(T_2 \subset \Omega \) is measurable. The sets \(T_1\) and \(T_3\) are measurable by assumption. Thus, it follows that \(\omega \mapsto g(\omega , v)\) and therefore \(\omega \mapsto \langle g(\omega , v), w \rangle _{V^*\times V}\) is measurable.
As argued above for every \(\omega \in \Omega \), there exists a unique element \(U^n(\omega )\) such that \(g(\omega , U^n(\omega )) = 0\). Thus, we can now apply Lemma A.3 to prove that \(U^n :\Omega \rightarrow H\) is \(\mathcal {F}_n\)-measurable. \(\square \)
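A scalar sketch (our own, using the hypothetical monotone nonlinearity \(A(u) = |u|^{p-2}u\), which mimics the growth of the p-Laplacian) illustrates how the monotonicity of \(I + h_n A\) yields a unique solution of one implicit step, here found by bisection:

```python
# One implicit step: solve u + h * A(u) = r for the monotone
# nonlinearity A(u) = |u|**(p-2) * u. Since u -> u + h * A(u) is
# strictly increasing, the root is unique and bisection converges.
def implicit_step(r, h=0.1, p=4.0, lo=-100.0, hi=100.0, tol=1e-12):
    g = lambda u: u + h * abs(u) ** (p - 2) * u - r
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

u = implicit_step(2.0)
residual = u + 0.1 * abs(u) ** 2 * u - 2.0
print(u, abs(residual))   # residual should be ~0
```

In the abstract setting, the same conclusion is drawn from the surjectivity of monotone, radially continuous, coercive operators rather than from a one-dimensional root search.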
Lemma 4.2
Let Assumptions 2.1–2.5 be fulfilled. Further, let the random variables \(f^n :\Omega \rightarrow H\) be given such that they are \(\mathcal {F}_{\xi _n}\)-measurable and \(\mathbb {E}_{\xi _n} \big [ \Vert f^n \Vert _{H}^2 \big ] < \infty \) for every \(n \in \{1,\dots , N\}\). Then for \(2\kappa h_n \le 2\kappa h < 1\) the solution \(\{U^n\}_{n \in \{1,\dots ,N\}}\) of (2) fulfills the a priori bound
where \(C = \frac{1}{1- 2h \kappa } \exp \big (\frac{2\kappa T}{1- 2h \kappa }\big )\) for all \(n \in \{1,\dots ,N\}\).
The proof of this lemma is very similar to the proof of our main result Theorem 5.1 and therefore omitted. The main necessary modification is to directly test (2) with \(U^n\) and use the semi-coercivity from Remark 1.
5 Stability and convergence in expectation
With the previous sections in mind, we can now turn our attention to the main results of this paper. We provide error bounds for the scheme (2) measured in expectation. First, we give a stability result in Theorem 5.1. The aim of this bound is to show how two solutions of the same scheme with respect to different right-hand sides and initial values differ. This stability result can then be used to prove the desired error bounds in Theorem 5.2 by using well-chosen data that agrees with the exact solution at the grid points. Note that in contrast to other works (e.g. [10, 11]), we measure \(f(t) - A(t)u(t)\) in the H-norm. This can be interpreted as a stricter regularity assumption. The advantage is that certain error terms disappear in expectation, compare the second bound in Lemma A.4.
Theorem 5.1
Let Assumptions 2.1–2.5 be fulfilled. Further, let the random variable \(f^n :\Omega \rightarrow H\) be given such that it is \(\mathcal {F}_{\xi _n}\)-measurable and \(\mathbb {E}_{\xi _n} \big [ \Vert f^n \Vert _H^2 \big ] < \infty \) for every \(n \in \{1,\dots , N\}\). Let \(\{U^n\}_{n \in \{1,\dots ,N\}}\) be the solution of (2) and let \(\{V^n\}_{n \in \{1,\dots ,N\}}\) be the solution of
for \(v_0 \in H\) and \(g^n :\Omega \rightarrow H\) such that it is \(\mathcal {F}_{\xi _n}\)-measurable and \(\mathbb {E}_{\xi _n} \big [ \Vert g^n\Vert _H^2 \big ] < \infty \) for every \(n \in \{1,\dots , N\}\). Then for \(2\kappa h_n \le 2\kappa h < 1\), it follows that
for \(C = \frac{1}{1 - 2h \kappa } \exp \big (\frac{2\kappa T}{1 - 2h \kappa }\big )\) and \(n \in \{1,\dots ,N\}\).
Proof
We start by subtracting (7) from (2) and testing with \(U^i - V^i\) to get
For the first term of this equality, we use the identity \(( a - b , a )_{} = \frac{1}{2} (\Vert a\Vert ^2 - \Vert b\Vert ^2 + \Vert a-b\Vert ^2 )\) for \(a, b \in H\) to find that
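For completeness, this standard identity in a real Hilbert space follows by expanding the inner products on both sides:

```latex
(a-b, a) = \|a\|^2 - (b,a), \qquad
\tfrac{1}{2}\big(\|a\|^2 - \|b\|^2 + \|a-b\|^2\big)
= \tfrac{1}{2}\big(\|a\|^2 - \|b\|^2 + \|a\|^2 - 2(a,b) + \|b\|^2\big)
= \|a\|^2 - (a,b).
```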
Due to the monotonicity condition from Assumption 2.2 (iii), we obtain
It remains to find a bound for the right-hand side of (8). Applying Cauchy-Schwarz’s inequality and the weighted Young’s inequality for products (Lemma A.2 with \(\varepsilon = 1\)), it follows that
Combining the previous statements, we find
After rearranging the terms and multiplying both sides of the inequality with the factor 2, we obtain the following bound
By first taking the \(\mathbb {E}_{\xi _i}\)-expectation of this inequality and then applying also the \(\mathbb {E}_{i-1}\)-expectation, we find that
After combining the previous two inequalities and summing up from \(i = 1\) to \(n \in \{1,\dots ,N\}\), we obtain
where we only made the right-hand side bigger by summing to the final value N. In the following, denote \(i_{\max } \in \{1,\dots ,N\}\) such that \(\max _{i \in \{1,\dots ,N\}} \mathbb {E}_i \big [\Vert U^i - V^i \Vert _H^2 \big ] = \mathbb {E}_{i_{\max }} \big [\Vert U^{i_{\max }} - V^{i_{\max }}\Vert _H^2 \big ]\). By Lemma A.3, it follows that \(U^{i-1} - V^{i-1}\) is \(\mathcal {F}_{i-1}\)-measurable and thus independent of the \(\mathcal {F}_{\xi _i}\)-measurable random variable \(f^i - g^i\). Therefore, we find that
To keep the presentation compact, we abbreviate
Setting
we have \(2\kappa \sum _{i=1}^{n}{ h_i \mathbb {E}_i \big [ \Vert U^i - V^i \Vert _{H}^2\big ]} \le 2\kappa \sum _{i=1}^{n}{ h_i x_i}\). We can now apply Grönwall’s inequality (Lemma A.1) to (9). It follows that
for \(C = \frac{1}{1- 2\,h \kappa } \exp \big (\frac{2\kappa T}{1- 2\,h \kappa }\big )\). As this inequality holds for every \(n \in \{1,\dots ,N\}\), it is also fulfilled for \(i_{\max }\). Thus, it follows that
We can now use that \(x^2 \le 2ax + b^2\) implies that \(x \le 2a +b\) for \(a,b,x \in [0,\infty )\) and find
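This elementary implication can be checked by completing the square:

```latex
x^2 \le 2ax + b^2
\;\Longrightarrow\; (x-a)^2 \le a^2 + b^2
\;\Longrightarrow\; x \le a + \sqrt{a^2+b^2} \le a + (a+b) = 2a + b,
```

where the last step uses \(\sqrt{a^2+b^2} \le a + b\) for \(a, b \in [0,\infty )\).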
Inserting this bound in (10) and applying Young’s inequality (Lemma A.2 for \(\varepsilon = 1\)), we then obtain
It only remains to insert
to finish the proof. \(\square \)
Theorem 5.2
Let Assumptions 2.1–2.5 be fulfilled. Further, let \(f_{\xi _n} \in C([0,T]; H)\) almost surely and \(f^n = f_{\xi _n}(t_n) \in L^2(\Omega ;H)\) for all \(n \in \{1,\dots ,N\}\). Let \(\{U^n\}_{n \in \{1,\dots ,N\}}\) be the solution of (2) and u be the solution of (1) that fulfills \(u' \in C^{\gamma } ([0,T]; H)\), \(\gamma \in (0,1]\). Moreover, let \(A_{\xi _n}(t_n) u(t_n) \in L^2(\Omega ; H)\) be fulfilled.
Then for \(2\kappa h_n \le 2\kappa h < 1\) and \(e^n = U^n - u(t_n)\), it follows that
where \(C = \frac{1}{1- 2\,h \kappa } \exp \big (\frac{2\kappa T}{1- 2\,h \kappa }\big )\) for all \(n \in \{1,\dots ,N\}\).
Proof
We use \(\{V^n\}_{n \in \{1,\dots ,N\}}\) given by
where
With this particular choice of \(g^n\), we can now show that \(V^n = u(t_n)\) for every \(n \in \{1,\dots ,N\}\). Given the initial value \(u_0\), the solution \(V^1\) is then given by
Therefore, it follows that
Since \(I + h_1 A_{\xi _1}(t_1)\) is injective, we find \(V^1 = u(t_1)\) in \(V_{\xi _1}\). Recursively, it follows that \(V^n = u(t_n)\) in \(V_{\xi _n}\) for all other \(n \in \{1,\dots , N\}\). Together with the stability estimate from Theorem 5.1 we find for \(e^n = U^n - V^n = U^n - u(t_n)\) that
where
Applying Lemma A.4 for \(u' \in C^{\gamma }([0,T];H)\), it follows that
and
Altogether, we obtain
\(\square \)
Remark 2
The main results can all be modified to a slightly different setting, where the right-hand side f(t) takes values in \(V^*\) and where the family \(\{\xi _n\}_{n \in \mathbb {N}}\) of random variables does not have to be mutually independent. In return, this setting requires slightly stronger assumptions on the operator A(t). First, we assume additionally that there exists a constant \(c_V \in (0,\infty )\) such that \(\Vert \cdot \Vert _V \le c_V \big ( \Vert \cdot \Vert _H + |\cdot |_V\big )\) is fulfilled. To generalize the a priori bound from Lemma 4.2 and the stability results from Theorem 5.1, we need to assume that \(\mu _A\) from Assumption 2.2 (v) and \(\eta _A\) from Assumption 2.2 (iii), respectively, are strictly positive. Moreover, if there exist \(\gamma \in (0,1]\) and \(C \in [0,\infty )\) such that
is fulfilled and \(u' \in C^{\gamma }([0,T];H)\), we obtain similar error bounds. We omit the proofs, which are very similar to the ones presented above.
6 Numerical experiments
To illustrate the theoretical convergence results for the randomized scheme in practice, we apply it to Eq. (5) as discussed in Sect. 3. This initial-boundary value problem fits our setting, as already explained there. We also consider what happens when we replace the nonlinear diffusion term with linear diffusion and use a smoother exact solution.
In both cases, we consider the problem on the spatial domain \(\mathcal {D}= [-1,1] \times [-1,1]\) which we split into rectangular sub-domains \(\mathcal {D}_{\ell }\), \(\ell \in \{1,\ldots , s\}\), with \(M_x\) rectangles along the x-axis and \(M_y\) rectangles along the y-axis. We choose \(\mathcal {D}_{\ell }\) such that they have an overlap of 0.2 on all internal sides. This means that, for example, with \(M_x = M_y = 3\), we have \(s = M_x M_y = 9\) sub-domains with, e.g., \(\mathcal {D}_{1} = [-1, -0.267] \times [-1, -0.267]\), \(\mathcal {D}_{2} = [-0.467, 0.467] \times [-1, -0.267]\) and \(\mathcal {D}_{5} = [-0.467, 0.467] \times [-0.467, 0.467]\). Note that they are not uniform in size, because the sub-domains adjacent to the outer edge of \(\mathcal {D}\) have no overlap on one or two sides.
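Since the construction of the overlapping rectangles is only described in words, the following sketch reproduces the one-dimensional bounds behind the example coordinates above. The helper `subdomain_bounds` is our own and not from the paper; it is one construction consistent with the stated overlap of 0.2 and the quoted corner values.

```python
import numpy as np

def subdomain_bounds(a, b, M, overlap):
    """1D sub-domain bounds on [a, b]: M core cells of equal width,
    with each sub-domain extended by `overlap` across every internal
    interface. Boundary sub-domains are smaller, as in the paper."""
    # width of each non-overlapping core cell
    core = (b - a - (M - 1) * overlap) / M
    lo = np.array([a + i * (core + overlap) for i in range(M)])
    hi = lo + core
    # extend each sub-domain into the neighbouring overlap regions
    lo[1:] -= overlap
    hi[:-1] += overlap
    return list(zip(lo, hi))

# with M = 3 on [-1, 1] this yields the intervals
# [-1, -0.267], [-0.467, 0.467], [0.267, 1] from the example
bounds = subdomain_bounds(-1.0, 1.0, 3, 0.2)
```

Taking tensor products of the x- and y-intervals then gives the \(s = M_x M_y\) rectangles \(\mathcal {D}_{\ell }\).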
We have to choose a strategy for which sub-problems to select in each time step, i.e. specify the probabilities \(\mathcal {P}(\Omega _{\xi = B})\) for \(B \in 2^{\{1,\dots ,s\}}\). We consider two strategies. In the first, we simply use \(\mathcal {P}(\Omega _{\xi = \{\ell \}}) = 1/s\). Thus every sub-domain is equally likely to be chosen. As a minor variation, we instead select a set of k sub-domains by drawing with replacement according to the uniform probabilities.
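The first strategy amounts to nothing more than uniform sampling with replacement. A minimal sketch (the function name and use of NumPy's generator are our own choices, not from the paper):

```python
import numpy as np

def draw_subdomains(rng, s, k):
    """First strategy: draw k sub-domain indices uniformly with
    replacement; the selected set is the union of the draws, so it
    may contain fewer than k distinct sub-domains."""
    return set(rng.integers(0, s, size=k).tolist())

rng = np.random.default_rng(0)
selection = draw_subdomains(rng, s=9, k=2)  # e.g. {6, 7} for some seeds
```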
In the second strategy, we make use of a predictor. In addition to the stochastic approximation, we compute a deterministic approximation \(Z^n\) using the backward Euler method, but on a coarser spatial mesh. The idea is that while this approximation is less accurate, it should be significantly cheaper to compute and still resemble the true solution. In the \(n^{\text {th}}\) time step, we compute \(\Psi _n = |Z^{n-1} |+ |Z^n |+ |\tilde{f}(t_n, \cdot ) |> 10^{-3}\). This function is either 0 or 1 and indicates where in the domain something is actually happening. For each sub-domain, we then check whether it is “sufficiently active” or not by evaluating \(\Vert \Psi _n \chi _{\ell }\Vert \ge \rho \Vert \Psi _n\Vert \) for a parameter \(\rho \in (0,1)\). We select the set of those sub-domains which pass the test with probability \(1-\rho \) and the set of all the other sub-domains with probability \(\rho \).
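The predictor-based strategy can be sketched as follows. This is an illustration under our own assumptions about the discrete setting: `chi[l]` is the characteristic function of sub-domain \(\ell \) evaluated on the grid, the norm is the Euclidean norm of the grid values, and the function name `select_active` is hypothetical.

```python
import numpy as np

def select_active(Z_prev, Z_curr, f_vals, chi, rho, rng, tol=1e-3):
    """Second strategy (sketch): build the 0/1 activity indicator
    Psi_n from the coarse predictor and the source term, then pick
    the active set with probability 1 - rho, otherwise its complement."""
    psi = (np.abs(Z_prev) + np.abs(Z_curr) + np.abs(f_vals)) > tol
    norm_psi = np.linalg.norm(psi.astype(float))
    active = [l for l in range(len(chi))
              if np.linalg.norm((psi * chi[l]).astype(float)) >= rho * norm_psi]
    inactive = [l for l in range(len(chi)) if l not in active]
    return active if rng.random() < 1 - rho else inactive
```

For small \(\rho \), the active set is chosen in most steps, so the scheme mostly updates where the predictor indicates that the solution is changing.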
We note that the errors for the first strategy are noticeably larger than those of the second strategy. In the following, we will use fewer sub-domains for the first strategy for that reason. More precisely, we use \(M_x = 3\) and \(M_y = 1\) for the first strategy and \(M_x = 3\) and \(M_y = 3\) for the second strategy. Furthermore, we can observe that the second strategy works better with more sub-domains, since it essentially adaptively groups them into only two larger sub-domains: the active set and the inactive set. Increasing the number of sub-domains increases the fidelity such that the choice of whether each sub-domain is active or not becomes easier, albeit at a higher computational cost. If the spatial discretization is using finite elements, the limit case would be when every element is its own sub-domain. This is what is considered in [36] for a deterministic scheme, where it is, indeed, observed that the overhead costs can be prohibitive even when using very efficient data structures.
We only report errors here, since this is the focus of the paper. A natural next step would be to investigate also the computation times and the efficiency of the schemes compared to deterministic schemes. Since the randomized methods need to solve equation systems of smaller size, they are expected to outperform the deterministic schemes. However, this depends on many factors, such as the problem size, the number of subdomains, the behaviour of the exact solution and the random strategy used. Further, for such a comparison to be useful, it has to be performed with equally optimized and parallelized code for both the randomized and deterministic cases. Such advanced software engineering is out of the scope of this article. Nevertheless, when applying our non-parallelized and not fully optimized code to the linear diffusion problem using the first strategy, we observed a factor 2 speed-up that was independent of the number of time steps.
6.1 A nonlinear example
In our first experiment, we use the problem parameters \(T = 1\), \(p = 4\) and \(\alpha (t) \equiv 1\). Further, we choose the source term \(\tilde{f}\) such that the exact solution is given by \(u(t, x, y) = \tilde{u}(x - r \cos (2\pi t), y - r \sin (2\pi t))\) with \(r = 1/2\),
and \([\cdot ]_+ = \max \{\cdot , 0\}\). This describes a localized pulse that starts centered at (0.5, 0) and which then rotates around the origin at the constant distance r. The shape of the pulse is inspired by the closed-form Barenblatt solution to \(\partial _t u = \nabla \cdot (|\nabla u(t,x) |^{p-2}\nabla u)\), see e.g. [21]. At \(t = 0\), this solution is a Dirac delta, which then expands into a cone-shaped peak for \(t>0\). Our pulse is this solution frozen at the time \(t = 0.001\). We note that due to the sharp interface where the pulse meets the x-y-plane and to the sharp peak, u is of low regularity.
We discretize the problem in space using central finite differences, such that the approximation of the \(p\)-Laplacian is second-order accurate. We use 100 computational nodes in each spatial dimension, for a total of \(10\, 000\) degrees of freedom. Thus, the temporal error dominates the spatial error when considering the full error in the following. For the temporal discretization, we use the scheme (2), along with one of the two strategies outlined above. For the first strategy, we try \(k = 1\) and \(k = 2\). For the second, we evaluate the different parameters \(\rho = 0.01, 0.05, 0.1, 0.2\). We compute approximations for the different (constant) time steps \(h_n = 2^{-5}, 2^{-6}, \ldots , 2^{-13}\) and estimate their corresponding errors at the final time by running the method over 50 independent random realizations and averaging. That is, we approximate
where \(U_j^N\) is the numerical approximation on the j-th path and \(U_{\text {ref}}\) is the exact solution \(u(t_N, \cdot , \cdot )\) evaluated at the spatial grid.
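The error estimator is a plain Monte Carlo approximation of the root-mean-square error, normalized by the reference solution. A minimal sketch (the function name is our own):

```python
import numpy as np

def rms_relative_error(paths, ref):
    """Monte Carlo estimate of (E[||U^N - U_ref||^2])^{1/2} / ||U_ref||,
    averaging the squared errors over independent realizations of the
    randomized scheme before taking the square root."""
    errs = [np.linalg.norm(U - ref) ** 2 for U in paths]
    return np.sqrt(np.mean(errs)) / np.linalg.norm(ref)
```

Note that the expectation is taken inside the square root, matching the \(\big (\mathbb {E}_N\big [ \Vert U^N - U_{\text {ref}}\Vert ^2 \big ] \big )^{1/2}\) quantity reported in the figures.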
Figure 1 shows the resulting relative errors vs. the time steps, with the first strategy in the upper plot and the second strategy in the lower. We observe that both strategies result in errors that decrease as \(\mathcal {O}(h^{1/2})\), in line with Theorem 5.2.
Fig. 1 The relative errors \(\big (\mathbb {E}_N\big [ \Vert U^N - U_{\text {ref}}\Vert ^2 \big ] \big )^{1/2} / \; \Vert U_{\text {ref}}\Vert \) for the nonlinear setting described in Sect. 6.1. The upper plot uses the first randomized strategy and the lower plot uses the second strategy. We observe that the errors decay as \(\mathcal {O}(h^{1/2})\), in line with Theorem 5.2, irrespective of the choice of \(\rho \) or k. A smaller \(\rho \) or larger k decreases the error, but of course also incurs a higher computational cost
6.2 A linear example
As a second experiment, we consider a linear version of the previous problem. We use the same parameters as in the previous section, except that we set \(p = 2\) and \(\alpha (t) = 0.1\), and that the rotating pulse is now Gaussian rather than a sharp peak. More precisely, the exact solution is given by
The resulting errors are shown in Fig. 2. Again, we note that the first, uniform, strategy converges as \(\mathcal {O}(h^{1/2})\), in line with Theorem 5.2. The second strategy with \(\rho = 0.01\) performs significantly better and the error behaves like \(\mathcal {O}(h)\) in the first part of the plot. This is essentially the same behaviour as if we would apply backward Euler to the full problem, but the method only updates the approximation on the most relevant sub-domains and is therefore cheaper to evaluate. This improved convergence order is possible due to the extra smoothness present in this linear problem. In the error bound of Theorem 5.2, the first term becomes small due to the used strategy, and because the solution is smooth the remaining terms are of size \(h^3\) and \(h^2\), respectively.
Increasing the parameter \(\rho \) means that we disregard more of the information from the predictor, and as seen in Fig. 2 this causes the convergence order to decrease towards 1/2. On the other hand, setting \(\rho = 0\) means that we always choose all the sub-domains and thereby do more computations than if we would simply solve the full problem directly. The parameter \(\rho \) is therefore a design parameter, and further research is required on how to choose it optimally for specific problem classes. Regardless of the choice, however, we still have \(\mathcal {O}(h^{1/2})\)-convergence.
Fig. 2 The relative errors \(\big (\mathbb {E}_N\big [ \Vert U^N - U_{\text {ref}}\Vert ^2 \big ] \big )^{1/2} / \; \Vert U_{\text {ref}}\Vert \) for the linear setting described in Sect. 6.2. The upper plot uses the first randomized strategy and the lower plot uses the second strategy. We observe that the errors for the first strategy decay as \(\mathcal {O}(h^{1/2})\), similarly to the nonlinear case. For the second strategy, large \(\rho \) also leads to convergence of order 1/2, while sufficiently small \(\rho \) leads to faster convergence of order 1
References
Aronsson, G., Evans, L.C., Wu, Y.: Fast/slow diffusion and growing sandpiles. J. Differential Equations 131(2), 304–335 (1996)
Bochacik, T., Goćwin, M., Morkisz, P.M., Przybyłowicz, P.: Randomized Runge-Kutta method-stability and convergence under inexact information. J. Complexity 65, 101554 (2021)
Bottou, L., Curtis, F.E., Nocedal, J.: Optimization methods for large-scale machine learning. SIAM Rev. 60(2), 223–311 (2018)
Daun, T.: On the randomized solution of initial value problems. J. Complexity 27(3–4), 300–311 (2011)
Eisenmann, M.: Methods for the temporal approximation of nonlinear, nonautonomous evolution equations. PhD thesis, TU Berlin (2019)
Eisenmann, M., Hansen, E.: Convergence analysis of domain decomposition based time integrators for degenerate parabolic equations. Numer. Math. 140(4), 913–938 (2018)
Eisenmann, M., Hansen, E.: A variational approach to the sum splitting scheme. IMA J. Numer. Anal. 42(1), 923–950 (2022)
Eisenmann, M., Kovács, M., Kruse, R., Larsson, S.: On a randomized backward Euler method for nonlinear evolution equations with time-irregular coefficients. Found. Comput. Math. 19(6), 1387–1430 (2019)
Eisenmann, M., Stillfjord, T., Williamson, M.: Sub-linear convergence of a stochastic proximal iteration method in Hilbert space. Comput. Optim. Appl. 83(1), 181–210 (2022)
Emmrich, E.: Two-step BDF time discretisation of nonlinear evolution problems governed by monotone operators with strongly continuous perturbations. Comput. Methods Appl. Math. 9(1), 37–62 (2009)
Emmrich, E., Thalhammer, M.: Stiffly accurate Runge-Kutta methods for nonlinear evolution problems governed by a monotone operator. Math. Comp. 79(270), 785–806 (2010)
Evans, L.C.: Partial differential equations. American Mathematical Society, Providence, RI (1998)
Hansen, E., Henningsson, E.: Additive domain decomposition operator splittings–convergence analyses in a dissipative framework. IMA J. Numer. Anal. 37(3), 1496–1519 (2017)
Hansen, E., Ostermann, A.: Dimension splitting for quasilinear parabolic equations. IMA J. Numer. Anal. 30(3), 857–869 (2010)
Hansen, E., Stillfjord, T.: Convergence of the implicit-explicit Euler scheme applied to perturbed dissipative evolution equations. Math. Comp. 82(284), 1975–1985 (2013)
Hundsdorfer, W., Verwer, J.: Numerical solution of time-dependent advection-diffusion-reaction equations. Springer, Berlin (2003)
Jakobsen, E.R., Karlsen, K.H.: Convergence rates for semi-discrete splitting approximations for degenerate parabolic equations with source terms. BIT 45(1), 37–67 (2005)
Jentzen, A., Neuenkirch, A.: A random Euler scheme for Carathéodory differential equations. J. Comput. Appl. Math. 224(1), 346–359 (2009)
Jin, S., Li, L., Liu, J.-G.: Random batch methods (RBM) for interacting particle systems. J. Comput. Phys. 400, 108877 (2020)
Jin, S., Li, L., Liu, J.-G.: Convergence of the random batch method for interacting particles with disparate species and weights. SIAM J. Numer. Anal. 59(2), 746–768 (2021)
Kamin, S., Vázquez, J.L.: Fundamental solutions and asymptotic behaviour for the \(p\)-Laplacian equation. Rev. Mat. Iberoam. 4(2), 339–354 (1988)
Kruse, R., Wu, Y.: Error analysis of randomized Runge-Kutta methods for differential equations with time-irregular coefficients. Comput. Methods Appl. Math. 17(3), 479–498 (2017)
Kruse, R., Wu, Y.: A randomized and fully discrete Galerkin finite element method for semilinear stochastic evolution equations. Math. Comp. 88(320), 2793–2825 (2019)
Kruse, R., Wu, Y.: A randomized Milstein method for stochastic differential equations with non-differentiable drift coefficients. Discrete Contin. Dyn. Syst. Ser. B 24(8), 3475–3502 (2019)
Kufner, A.: Weighted Sobolev Spaces. Teubner, Leipzig (1980)
Kuijper, A.: \(p\)-Laplacian driven image processing. In: 2007 IEEE International conference on image processing, Vol. 5, pp. V-257–V-260 (2007)
Leoni, G.: A First Course in Sobolev Spaces. American Mathematical Society, Providence, RI (2009)
Li, L., Xu, Z., Zhao, Y.: A random-batch Monte Carlo method for many-body systems with singular kernels. SIAM J. Sci. Comput. 42(3), A1486–A1509 (2020)
Lions, P.-L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)
Mathew, T.: Domain decomposition methods for the numerical solution of partial differential equations. Springer, Berlin (2008)
Mathew, T.P., Polyakov, P.L., Russo, G., Wang, J.: Domain decomposition operator splittings for the solution of parabolic equations. SIAM J. Sci. Comput. 19(3), 912–932 (1998)
Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Statistics 22, 400–407 (1951)
Roubíček, T.: Nonlinear partial differential equations with applications, 2nd edn. Birkhäuser, Basel (2013)
Stengle, G.: Numerical methods for systems with measurable coefficients. Appl. Math. Lett. 3(4), 25–29 (1990)
Stengle, G.: Error analysis of a randomized numerical method. Numer. Math. 70(1), 119–128 (1995)
Stone, D., Geiger, S., Lord, G.J.: Asynchronous discrete event schemes for PDEs. J. Comput. Phys. 342, 161–176 (2017)
Vázquez, J.L.: The porous medium equation. Oxford mathematical monographs. The Clarendon Press, Oxford University Press, Oxford (2007)
Veldman, D.W.M., Zuazua, E.: A framework for randomized time-splitting in linear-quadratic optimal control. Numer. Math. 151(2), 495–549 (2022)
Acknowledgements
This work was partially supported by the Crafoord foundation through the grant number 20220657 and by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation. The computations were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at LUNARC partially funded by the Swedish Research Council through grant agreement no. 2018-05973.
Funding
Open access funding provided by Lund University.
Auxiliary results
In this appendix, we collect a few useful inequalities and technical results that are needed in the paper.
Lemma A.1
(Discrete Grönwall inequality) Let \((u_n)_{n \in \mathbb {N}}\) and \((b_n)_{n \in \mathbb {N}}\) be two nonnegative sequences that satisfy, for given \(a \in [0,\infty )\) and \(n \in \mathbb {N}\), that \(u_n \le a + \sum _{i=1}^{n} b_i u_i\). For \(b_n \in [0,1)\), it then follows that
Lemma A.2
(Weighted Young’s inequality) For \(a, b \in [0,\infty )\), \(\varepsilon \in (0,\infty )\), and \(p, q \in (1,\infty )\) such that \(\frac{1}{p} + \frac{1}{q} = 1\), it follows that \(a b \le \varepsilon a^p + (\varepsilon p)^{- \frac{q}{p}} q^{-1} b^q\).
A proof can be found in [12, Appendix B.2 d].
Lemma A.3
Let Assumptions 2.1–2.5 be fulfilled. Let \(Q \subseteq V\) be a countable, dense subset of V, \(V_{\xi }\) and H. Let the function \(g :\Omega \times H \rightarrow V^*\) be given. Further, assume that the mapping \(\omega \mapsto \langle g(\omega , v), w \rangle _{V^*\times V}\) is measurable for every \(v \in H\) and \(w \in Q\), and that for almost every \(\omega \in \Omega \) the mapping \(v \mapsto \langle g(\omega , v), w \rangle _{V^*\times V}\) is continuous for every \(v, w \in V_{\xi (\omega )}\). For every \(\omega \in \Omega \), the function g has a unique root which lies in \(V_{\xi (\omega )}\). We denote this root by \(r(\omega ) \in V_{\xi (\omega )} \), i.e. \(g(\omega , r(\omega )) = 0\). Then the function \(r :\Omega \rightarrow H\) is measurable.
A similar proof can be found in [5, Lemma 2.1.4] and [8, Lemma 4.3]. The main difference in this version is that the function g maps from \(\Omega \times H\) instead of \(\Omega \times V\) and therefore some small technical alterations have to be considered.
Proof of Lemma A.3
To prove that r is measurable, we show that \(r^{-1}(B) \in \mathcal {F}\) for every open set B in H. First, we notice that
Since \(\omega \mapsto \langle g(\omega , v), w \rangle _{V^*\times V}\) is measurable for \(v \in H\) and \(w \in Q\), the set
is an element of \(\mathcal {F}_{\xi }\) for \(v \in Q\) and \(u \in H\). If the set B only contains a countable amount of elements, it follows directly that \(r^{-1} (B) \in \mathcal {F}_{\xi }\).
In the following, it remains to address the cases where B is not countable. For \(\varepsilon \in (0,\infty )\) small enough and a fixed \(v \in Q\), we introduce the multi-valued map**
For \(B \subseteq H\) open, it follows that
In the following, we will show that
Since \(B \cap Q \subseteq B\), it directly follows that \(\big (r_{\varepsilon }^v\big )^{-1}(B \cap Q) \subseteq \big (r_{\varepsilon }^v\big )^{-1}(B)\). It remains to verify that \(\big (r_{\varepsilon }^v\big )^{-1}(B) \subseteq \big (r_{\varepsilon }^v\big )^{-1}(B \cap Q)\). Let \(\omega \in \big (r_{\varepsilon }^v\big )^{-1}(B)\), i.e. there exists \(u \in B\) such that
Since \(v \mapsto \langle g(\omega , v), w \rangle _{V^*\times V}\) is continuous for every \(v, w \in V_{\xi (\omega )}\) and Q is dense in H, there exists \(u_Q \in B \cap Q\) such that \(|\langle g(\omega , u_Q), v \rangle _{V^*\times V} |< \varepsilon \). Thus, \(u_Q \in r_{\varepsilon }^v(\omega )\) and in particular \(\omega \in \big (r_{\varepsilon }^v\big )^{-1}(B \cap Q)\). This shows altogether that \(\big (r_{\varepsilon }^v\big )^{-1}(B) = \big (r_{\varepsilon }^v\big )^{-1}(B \cap Q)\).
We can now finish the proof as
is fulfilled. \(\square \)
Lemma A.4
Let Assumptions 2.1–2.5 be fulfilled. Further, let \(f_{\xi _n}\) be an element of \(C([0,T]; V_{\xi _n}^*)\) almost surely for every \(n \in \{1,\dots ,N\}\). For \(u' \in C^{\gamma }([0,T];H)\), \(\gamma \in (0,1]\), and a maximal step size \(h = \max _{i \in \{1,\dots ,N\}} h_i\), it then follows that
and
where \(|u' |_{C^{\gamma }([0,T];H)}\) is the Hölder semi-norm with values in H of the function \(u'\).
Proof
To prove the first bound, we find that
To further bound the term in the last row, we apply Hölder’s inequality and the regularity condition \(u' \in C^{\gamma }([0,T];H)\). We then find that
It remains to prove the second estimate of the lemma. Recall that \(\mathbb {E}_{\xi _i} \big [ f_{\xi _i}(t_i) \big ] = f(t_i)\) and \(\mathbb {E}_{\xi _i} \big [ A_{\xi _i}(t_i) u(t_i)\big ] = A(t_i) u(t_i)\) is fulfilled by Assumption 2.5. Using these equalities, it follows that
\(\square \)
About this article
Cite this article
Eisenmann, M., Stillfjord, T. A randomized operator splitting scheme inspired by stochastic optimization methods. Numer. Math. 156, 435–461 (2024). https://doi.org/10.1007/s00211-024-01396-w