Abstract
Dynamical low-rank approximation has become a valuable tool to perform an on-the-fly model order reduction for prohibitively large matrix differential equations. A core ingredient is the construction of integrators that are robust to the presence of small singular values and the resulting large time derivatives of the orthogonal factors in the low-rank matrix representation. Recently, the robust basis-update & Galerkin (BUG) class of integrators has been introduced. These methods require no steps that evolve the solution backward in time, often have favourable structure-preserving properties, and allow for parallel time-updates of the low-rank factors. The BUG framework is flexible enough to allow for adaptations to these and further requirements. However, the BUG methods presented so far have only first-order robust error bounds. This work proposes a second-order BUG integrator for dynamical low-rank approximation based on the midpoint quadrature rule. The integrator first performs a half-step with a first-order BUG integrator, followed by a Galerkin update with a suitably augmented basis. We prove a robust second-order error bound which in addition shows an improved dependence on the normal component of the vector field. These rigorous results are illustrated and complemented by a number of numerical experiments.
1 Introduction
Dynamical low-rank approximation of time-dependent matrices [37] has proven to be an efficient model order reduction technique for applications from widely varying fields including plasma physics [8, 14, 15, 20, 22,23,24, 26, 27, 58], kinetic shallow water models [38], uncertainty quantification [2, 3, 17, 28, 33, 39, 43, 44, 46, 51], and machine learning [52,53,54, 59]. These problems can be written as a prohibitively large matrix differential equation for \({{{\textbf {A}}}}(t)\in {{\mathbb {R}}}^{m\times n}\),
In dynamical low-rank approximation, the solution \({{{\textbf {A}}}}(t)\) is approximated by evolving matrices \({{{\textbf {Y}}}}(t)\in {{\mathbb {R}}}^{m\times n}\) of low rank, which are computed directly without first computing the solution \({{{\textbf {A}}}}(t)\). Rank-r matrices are represented in a non-unique factorized SVD-like form
where the slim matrices \({{{\textbf {U}}}}\in {{\mathbb {R}}}^{m\times r}\) and \({{{\textbf {V}}}}\in {{\mathbb {R}}}^{n\times r}\) each have r orthonormal columns, and the small matrix \({{{\textbf {S}}}}\in {{\mathbb {R}}}^{r\times r}\) is invertible (but not necessarily diagonal).
To preserve the low-rank format of \({{{\textbf {Y}}}}\) over time, dynamical low-rank approximation projects the right-hand side of the differential equation onto the tangent space at the current approximation of the manifold of rank-r matrices:
The orthogonal projection \(P_r({{{\textbf {Y}}}})\) onto the tangent space at \({{{\textbf {Y}}}}= {{{\textbf {U}}}}{{{\textbf {S}}}}{{{\textbf {V}}}}^\top \) is an alternating sum of three subprojections [37, Lemma 4.1]:
The projected differential Eq. (3) can be equivalently written as a system of differential equations for the factors \({{{\textbf {U}}}}(t)\), \({{{\textbf {S}}}}(t)\), \({{{\textbf {V}}}}(t)\) [37, Proposition 2.1], which contain, however, the inverse of \({{{\textbf {S}}}}\) as a factor on the right-hand side of the differential equations for \({{{\textbf {U}}}}\) and \({{{\textbf {V}}}}\). This causes problems since \({{{\textbf {S}}}}\) typically has small singular values: to obtain good accuracy, only small singular values can be discarded in the approximation, and the smallest retained singular values are typically not much larger than the largest discarded singular values. As a consequence, standard time integrators need to use small stepsizes that are proportional to the smallest nonzero singular value; see e.g. [35].
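To make the alternating-sum structure of the tangent-space projection concrete, the following short sketch (Python/NumPy; not from the paper, which works throughout in factored form) applies \(P_r({{{\textbf {Y}}}}){{{\textbf {Z}}}}= {{{\textbf {Z}}}}{{{\textbf {V}}}}{{{\textbf {V}}}}^\top - {{{\textbf {U}}}}{{{\textbf {U}}}}^\top {{{\textbf {Z}}}}{{{\textbf {V}}}}{{{\textbf {V}}}}^\top + {{{\textbf {U}}}}{{{\textbf {U}}}}^\top {{{\textbf {Z}}}}\) and checks that it is idempotent and leaves matrices of the form \({{{\textbf {U}}}}{{{\textbf {S}}}}{{{\textbf {V}}}}^\top \) unchanged.

```python
import numpy as np

def tangent_projection(U, V, Z):
    """Tangent-space projection at Y = U S V^T applied to Z:
    P_r(Y) Z = Z V V^T - U U^T Z V V^T + U U^T Z."""
    ZV = Z @ V
    return ZV @ V.T - U @ (U.T @ ZV) @ V.T + U @ (U.T @ Z)

rng = np.random.default_rng(0)
m, n, r = 12, 9, 3
U, _ = np.linalg.qr(rng.standard_normal((m, r)))  # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
Z = rng.standard_normal((m, n))

PZ = tangent_projection(U, V, Z)
# P_r(Y) is a projection: applying it twice changes nothing.
assert np.allclose(tangent_projection(U, V, PZ), PZ)
# Matrices U S V^T lie in the tangent space and are left unchanged.
S = rng.standard_normal((r, r))
assert np.allclose(tangent_projection(U, V, U @ S @ V.T), U @ S @ V.T)
```

In practice the projection is never formed as a full \(m\times n\) operation; the integrators below work with the slim factors only.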
There exist dynamical low-rank integrators that are robust to the presence of small singular values and (as a likely consequence) rapidly changing orthonormal factors \({{{\textbf {U}}}}\) and \({{{\textbf {V}}}}\). These robust integrators allow for much larger stepsizes irrespective of the singular values and of derivatives of \({{{\textbf {U}}}}\) and \({{{\textbf {V}}}}\):
-
the projector-splitting integrator [35, 42], which uses a Lie–Trotter or Strang splitting of the tangent-space projection (4) in (3);
-
the Basis Update & Galerkin (BUG) integrators of [9, 11], which first update (and possibly augment) the basis matrices \({{{\textbf {U}}}}\) and \({{{\textbf {V}}}}\) and then update \({{{\textbf {S}}}}\) by a Galerkin approximation to the differential Eq. (1) in the updated/augmented bases; in the augmented case, this is followed by a truncation back to lower rank via an SVD of the augmented matrix \({{{\textbf {S}}}}\). There is also a robust fully parallel version in \({{{\textbf {U}}}},{{{\textbf {V}}}},{{{\textbf {S}}}}\) [10].
-
the projection methods of [36], where a Runge–Kutta method is directly applied to the projected differential Eq. (3) and the internal stages are truncated back to lower rank by an SVD; see also the related retraction-based methods in [13, 55]. Moreover, we include the projected exponential methods introduced in [7] within this class of methods.
Additional integrators, expected to be robust to small singular values based on numerical evidence but without proof, are discussed in [5, 17, 30] and [45].
Although second order is widely observed for the Strang projector splitting (see, e.g., [8, 31]), the known proof of robust convergence only yields order 1 [35]. The different variants of the BUG integrators also have robust first-order error bounds [9,10,11]. In some situations, however, second order can be observed numerically (a phenomenon that is not well understood; see also Sect. 5.2). The projected Runge–Kutta methods of [36] are so far the only integrators that are known to have robust second-order (or higher-order) error bounds.
In this paper, we propose a BUG method based on the midpoint quadrature rule. This method is proved to admit second-order error bounds that are robust to small singular values. The rigorous convergence analysis is complemented by numerical experiments for a heat equation, a non-stiff discrete Schrödinger equation, and the Vlasov equation. We also compare the error behaviour of the new midpoint BUG method with that of the projected Runge method of [36], which is likewise based on the midpoint quadrature rule and is known to be of robust second order.
We expect that the second-order method of this paper extends from low-rank matrices to tree tensor networks in a similar way as was done for the rank-adaptive BUG method in [12], with applications in quantum dynamics. This extension to tensor differential equations is, however, beyond the scope of the present paper.
2 Recap: the augmented BUG integrator of [9]
One time step of integration from time \(t_0\) to \(t_1=t_0+h\), starting from a factored rank-\(r_0\) matrix \({{{\textbf {Y}}}}_0={{{\textbf {U}}}}_0{{{\textbf {S}}}}_0{{{\textbf {V}}}}_0^\top \), computes an updated factorization \({{{\textbf {Y}}}}_1={{{\textbf {U}}}}_1{{{\textbf {S}}}}_1 {{{\textbf {V}}}}_1^\top \) of rank \(r_1 \le 2r_0\). In the following algorithm we let \(r=r_0\) and we put a hat on quantities related to rank 2r.
-
1.
Basis Update Compute augmented basis matrices \(\widehat{{{\textbf {U}}}}\in {{\mathbb {R}}}^{m\times {\hat{r}}}\) and \(\widehat{{{\textbf {V}}}}\in {{\mathbb {R}}}^{n\times {\hat{r}}}\) (typically \(\widehat{r}=2r\)):
K-step: Integrate from \(t=t_0\) to \(t_1\) the \(m \times r\) matrix differential equation
$$\begin{aligned} \dot{{\textbf {K}}}(t) = {{{\textbf {F}}}}(t, {\textbf {K}}(t) {{{\textbf {V}}}}_0^\top ) {{{\textbf {V}}}}_0, \qquad {\textbf {K}}(t_0) = {{{\textbf {U}}}}_0 {{{\textbf {S}}}}_0. \end{aligned}$$(5)Determine the columns of \(\widehat{{{\textbf {U}}}}\in {{\mathbb {R}}}^{m\times {{\hat{r}}}}\) as an orthonormal basis of the range of the \(m\times 2r\) matrix \(({{{\textbf {U}}}}_0,{\textbf {K}}(t_1))\) (e.g. by QR decomposition), in short
$$\begin{aligned} \widehat{{{\textbf {U}}}}= \text {orth}({{{\textbf {U}}}}_0,{\textbf {K}}(t_1)), \end{aligned}$$and compute the \(\widehat{r}\times r\) matrix \(\widehat{{{\textbf {M}}}}= \widehat{{{\textbf {U}}}}^\top {{{\textbf {U}}}}_0\).
L-step: Integrate from \(t=t_0\) to \(t_1\) the \(n \times r\) matrix differential equation
$$\begin{aligned} \dot{{\textbf {L}}}(t) ={{{\textbf {F}}}}(t, {{{\textbf {U}}}}_0 {\textbf {L}}(t)^\top )^\top {{{\textbf {U}}}}_0, \qquad {\textbf {L}}(t_0) = {{{\textbf {V}}}}_0 {{{{\textbf {S}}}}}_0^\top . \end{aligned}$$(6)Compute \(\widehat{{{\textbf {V}}}}= \text {orth}({{{\textbf {V}}}}_0,{\textbf {L}}(t_1))\) and the \(\widehat{r}\times r\) matrix \(\widehat{{{\textbf {N}}}}= \widehat{{{\textbf {V}}}}^\top {{{\textbf {V}}}}_0\).
-
2.
Galerkin method with augmented bases Augment and update \({{{{\textbf {S}}}}}_0 \rightarrow {\widehat{{{\textbf {S}}}}}(t_1)\):
S-step: Integrate from \(t=t_0\) to \(t_1\) the \(\widehat{r} \times \widehat{r}\) matrix differential equation
$$\begin{aligned} \dot{\widehat{{{\textbf {S}}}}}(t) = \widehat{{{\textbf {U}}}}^\top {{{\textbf {F}}}}(t, \widehat{{{\textbf {U}}}}\widehat{{{\textbf {S}}}}(t) \widehat{{{\textbf {V}}}}^\top ) \widehat{{{\textbf {V}}}}, \qquad \widehat{{{\textbf {S}}}}(t_0) = \widehat{{{\textbf {M}}}}{{{\textbf {S}}}}_0 \widehat{{{\textbf {N}}}}^\top . \end{aligned}$$(7)
This augmented BUG method yields the rank-\(\widehat{r}\) approximation
The \(m\times r\), \(n\times r\) and \(\widehat{r}\times \widehat{r}\) matrix differential equations in the substeps are solved approximately using a standard integrator, e.g., a Runge–Kutta method or an exponential integrator when \({{{\textbf {F}}}}\) is predominantly linear. The S-step is a Galerkin method for the differential Eq. (1) in the space of matrices \(\widehat{{{\textbf {U}}}}\widehat{{{\textbf {S}}}}\widehat{{{\textbf {V}}}}^\top \) generated by the augmented basis matrices \(\widehat{{{\textbf {U}}}}\) and \(\widehat{{{\textbf {V}}}}\). Note that for \({{{\textbf {Y}}}}_0={{{\textbf {U}}}}_0{{{\textbf {S}}}}_0{{{\textbf {V}}}}_0^\top \), we have the same starting value \(\widehat{{{\textbf {U}}}}\widehat{{{\textbf {S}}}}(t_0) \widehat{{{\textbf {V}}}}^\top =\widehat{{{\textbf {U}}}}\widehat{{{\textbf {U}}}}^\top {{{\textbf {Y}}}}_0 \widehat{{{\textbf {V}}}}\widehat{{{\textbf {V}}}}^\top ={{{\textbf {Y}}}}_0\), since the columns of \({{{\textbf {U}}}}_0\) are in the range of \(\widehat{{{\textbf {U}}}}\) and those of \({{{\textbf {V}}}}_0\) are in the range of \(\widehat{{{\textbf {V}}}}\).
Truncation Using an SVD of \(\widehat{{{\textbf {S}}}}(t_1)\), the result is then truncated to a lower rank, either to the original rank r or by prescribing a truncation tolerance for the singular values, which yields a rank-adaptive algorithm; see [9] for the details. The resulting approximation after one time step is then given in factorized form,
The step rejection criterion of [10, Section 3.3] can further be added. With this criterion, an arbitrary rank increase becomes possible (e.g., when starting from rank 1) and the normal component of the vector field is estimated.
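For illustration, one step of the augmented BUG integrator can be sketched in a few lines of NumPy (the paper's experiments use MATLAB and C++; the names here are ours). For brevity, the K-, L-, and S-substeps are advanced by a single explicit Euler step, where in practice a standard higher-order integrator would be substituted.

```python
import numpy as np

def augmented_bug_step(U0, S0, V0, F, t0, h):
    """One step of the augmented BUG integrator (Sect. 2); the K-, L-, and
    S-substeps use a single explicit Euler step for brevity."""
    Y0 = U0 @ S0 @ V0.T
    # K-step (5): K' = F(t, K V0^T) V0, K(t0) = U0 S0.
    K1 = U0 @ S0 + h * F(t0, Y0) @ V0
    Uhat, _ = np.linalg.qr(np.hstack([U0, K1]))   # orth(U0, K(t1))
    M = Uhat.T @ U0
    # L-step (6): L' = F(t, U0 L^T)^T U0, L(t0) = V0 S0^T.
    L1 = V0 @ S0.T + h * F(t0, Y0).T @ U0
    Vhat, _ = np.linalg.qr(np.hstack([V0, L1]))   # orth(V0, L(t1))
    N = Vhat.T @ V0
    # S-step (7): Galerkin update in the augmented bases; the starting
    # value M S0 N^T reproduces Y0 exactly.
    Shat0 = M @ S0 @ N.T
    Shat1 = Shat0 + h * Uhat.T @ F(t0, Uhat @ Shat0 @ Vhat.T) @ Vhat
    return Uhat, Shat1, Vhat

def truncate(Uhat, Shat, Vhat, r):
    """Truncate back to rank r via an SVD of the small matrix Shat."""
    P, sigma, Qt = np.linalg.svd(Shat)
    return Uhat @ P[:, :r], np.diag(sigma[:r]), Vhat @ Qt[:r].T

# Demo: for F(t, Y) = -Y the Euler-based step gives exactly (1 - h) Y0.
rng = np.random.default_rng(1)
m, n, r = 10, 8, 2
U0, _ = np.linalg.qr(rng.standard_normal((m, r)))
V0, _ = np.linalg.qr(rng.standard_normal((n, r)))
S0 = np.diag([1.0, 0.5])
F = lambda t, Y: -Y
Uh, Sh, Vh = augmented_bug_step(U0, S0, V0, F, 0.0, 0.1)
Y1 = Uh @ Sh @ Vh.T                   # rank-2 result of the augmented step
Ut, St, Vt = truncate(Uh, Sh, Vh, r)  # here lossless, since rank(Y1) = 2
```

Note that the \(m\times n\) matrix `Y0` is formed here only for readability; in a large-scale implementation `F` is applied directly to the factors.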
3 A midpoint BUG integrator
We propose the following low-rank integrator for the matrix differential Eq. (1). Given \({{{\textbf {Y}}}}_0={{{\textbf {U}}}}_0{{{\textbf {S}}}}_0{{{\textbf {V}}}}_0^\top \approx {{{\textbf {A}}}}(t_0)\) in factored form, the algorithm computes a rank-augmented approximation \(\overline{{{{\textbf {Y}}}}}_1 = \overline{{{{\textbf {U}}}}}\, \overline{{{{\textbf {S}}}}}_1 \overline{{{{\textbf {V}}}}}^\top \approx {{{\textbf {A}}}}(t_1)\) of rank \(\overline{r} \le 4r\) (which is then truncated to a lower rank) for \(t_1=t_0+h\). We denote the midpoint as \(t_{1/2}=t_0+h/2\).
The following nested method can be viewed as a BUG version of Runge’s second-order method (see [29, II.1]), which is based on the midpoint quadrature rule.
-
1.
Midpoint approximation Make a step with step size h/2 with the augmented BUG integrator of Sect. 2 to compute the approximation of rank \(\widehat{r} \le 2r\),
$$\begin{aligned} \widehat{{{\textbf {Y}}}}_{1/2}=\widehat{{{\textbf {U}}}}_{1/2}\widehat{{{\textbf {S}}}}_{1/2}\widehat{{{\textbf {V}}}}_{1/2}^\top \approx {{{\textbf {A}}}}(t_{1/2}). \end{aligned}$$ -
2.
Galerkin step Compute augmented orthonormal bases of rank \(\overline{r} \le 4r\),
$$\begin{aligned} \begin{aligned} \overline{{{{\textbf {U}}}}}&= \textrm{orth}(\widehat{{{\textbf {U}}}}_{1/2}, h{{{\textbf {F}}}}(t_{1/2},\widehat{{{\textbf {Y}}}}_{1/2}) \widehat{{{\textbf {V}}}}_{1/2}) \ \hbox { and }\\ \overline{{{{\textbf {V}}}}}&= \textrm{orth}(\widehat{{{\textbf {V}}}}_{1/2}, h{{{\textbf {F}}}}(t_{1/2},\widehat{{{\textbf {Y}}}}_{1/2})^\top \widehat{{{\textbf {U}}}}_{1/2}), \end{aligned} \end{aligned}$$(10)and the \(\overline{r} \times r\) matrices \(\overline{{{{\textbf {M}}}}}=\overline{{{{\textbf {U}}}}}^\top {{{\textbf {U}}}}_0\) and \(\overline{{{{\textbf {N}}}}}=\overline{{{{\textbf {V}}}}}^\top {{{\textbf {V}}}}_0\). Integrate, from \(t=t_0\) to \(t_1\), the \(\overline{r} \times \overline{r}\) matrix differential equation
$$\begin{aligned} \dot{\overline{{{{\textbf {S}}}}}}(t) = \overline{{{{\textbf {U}}}}}^\top {{{\textbf {F}}}}(t, \overline{{{{\textbf {U}}}}}\, \overline{{{{\textbf {S}}}}}(t) \overline{{{{\textbf {V}}}}}^\top ) \overline{{{{\textbf {V}}}}}, \qquad \overline{{{{\textbf {S}}}}}(t_0) = \overline{{{{\textbf {M}}}}}{{{\textbf {S}}}}_0 \overline{{{{\textbf {N}}}}}^\top . \end{aligned}$$(11)
This gives
as the rank-augmented approximation to \({{{\textbf {A}}}}(t_1)\). Then, the result is truncated via an SVD of \(\overline{{{{\textbf {S}}}}}(t_1)\) to the original rank r or according to a given truncation error tolerance, as in [9]. The resulting approximation after one time step is then given in factorized form,
Then, \({{{\textbf {Y}}}}_1\) is taken as the starting value for the next step, which computes \({{{\textbf {Y}}}}_2\) in factorized form, etc.
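The two-stage step above admits a similarly compact sketch (Python/NumPy; our naming, with explicit Euler and explicit midpoint substeps standing in for the accurate substep solvers used in the experiments):

```python
import numpy as np

def bug_step(U0, S0, V0, F, t0, h):
    """Augmented BUG step of size h (Sect. 2), Euler substeps for brevity."""
    Y0 = U0 @ S0 @ V0.T
    K1 = U0 @ S0 + h * F(t0, Y0) @ V0
    L1 = V0 @ S0.T + h * F(t0, Y0).T @ U0
    Uh, _ = np.linalg.qr(np.hstack([U0, K1]))
    Vh, _ = np.linalg.qr(np.hstack([V0, L1]))
    Sh0 = (Uh.T @ U0) @ S0 @ (Vh.T @ V0).T
    Sh1 = Sh0 + h * Uh.T @ F(t0, Uh @ Sh0 @ Vh.T) @ Vh
    return Uh, Sh1, Vh

def midpoint_bug_step(U0, S0, V0, F, t0, h):
    """One Midpoint BUG (4r) step (Sect. 3); the Galerkin Eq. (11) is
    advanced by one classical explicit midpoint step as a stand-in."""
    t_half = t0 + h / 2
    # 1. Midpoint approximation: augmented BUG step of size h/2.
    Uh, Sh, Vh = bug_step(U0, S0, V0, F, t0, h / 2)
    Fh = F(t_half, Uh @ Sh @ Vh.T)
    # 2. Basis augmentation (10) with F evaluated at the midpoint.
    Ubar, _ = np.linalg.qr(np.hstack([Uh, h * Fh @ Vh]))
    Vbar, _ = np.linalg.qr(np.hstack([Vh, h * Fh.T @ Uh]))
    # Galerkin step (11) from t0 to t1 in the augmented bases.
    Sbar0 = (Ubar.T @ U0) @ S0 @ (Vbar.T @ V0).T
    Smid = Sbar0 + (h / 2) * Ubar.T @ F(t0, Ubar @ Sbar0 @ Vbar.T) @ Vbar
    Sbar1 = Sbar0 + h * Ubar.T @ F(t_half, Ubar @ Smid @ Vbar.T) @ Vbar
    return Ubar, Sbar1, Vbar

# Demo: for F(t, Y) = -Y this reproduces the classical midpoint-rule
# amplification factor 1 - h + h^2/2 applied to Y0.
rng = np.random.default_rng(3)
m, n, r = 10, 8, 2
U0, _ = np.linalg.qr(rng.standard_normal((m, r)))
V0, _ = np.linalg.qr(rng.standard_normal((n, r)))
S0 = np.diag([1.0, 0.5])
F = lambda t, Y: -Y
Ubar, Sbar, Vbar = midpoint_bug_step(U0, S0, V0, F, 0.0, 0.1)
Y1 = Ubar @ Sbar @ Vbar.T
```

The subsequent truncation back to rank r (or to a tolerance) proceeds via an SVD of \(\overline{{{{\textbf {S}}}}}(t_1)\), exactly as in Sect. 2.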
Remark 1
(Variants) In step 1 above (midpoint approximation), we could use any robust integrator with a first-order error bound. The algorithm described uses the augmented BUG integrator [9] and requires an intermediate rank of at most 4r; we henceforth call it the Midpoint BUG (4r) scheme. Alternatively, we could also use the fixed-rank BUG integrator of [11]. The Galerkin step is then made with the augmented orthonormal bases of rank \(\overline{r} \le 3r\) given as
This is computationally cheaper because the intermediate rank increases only up to 3r, but is also less accurate. A similar convergence analysis can be done for this variant, which we will call the Midpoint BUG (3r) scheme. We further note that a rank truncation with a given error tolerance \(\vartheta \) can already be done in (10) or (14), so that the ranks are reduced early on.
Remark 2
(A BUG method based on the trapezoidal quadrature rule) The core idea of this work can be used to derive further high-order versions of the BUG integrator. For example, a second-order BUG integrator can be derived based on the trapezoidal quadrature rule. It can be viewed as a BUG version of Heun’s second-order method, which is also based on the trapezoidal quadrature rule. In this case, instead of computing the BUG solution \(\widehat{{{\textbf {Y}}}}_{1/2}\) at the half point, one computes the augmented BUG solution at time \(t_1\), denoted by \(\widehat{{{\textbf {Y}}}}_1 = \widehat{{{\textbf {U}}}}_1\widehat{{{\textbf {S}}}}_1\widehat{{{\textbf {V}}}}_1^{\top }\), with a forward Euler step in the differential equations for \({{{\textbf {K}}}}\) and \({{{\textbf {L}}}}\) and augments the basis matrices according to
Similar to the derivation in Sect. 4, a robust second-order error bound can be shown for this integrator. However, this error bound does not share the favourable dependence on normal components of the vector field.
Remark 3
(Structure preservation) The augmented midpoint-BUG step preserves norm, energy, and dissipation up to the truncation tolerance as does the augmented BUG integrator in [9], in the same situations and by the same proofs. In particular, [9, Lemma 1] directly holds for the augmented midpoint-BUG integrator due to the fact that the augmented basis includes the basis of the previous time step. Then, preservation of norm, energy, and dissipation up to the truncation tolerance follow from the Galerkin properties of the S-step.
4 Robust second-order error bound
We make the same assumptions on the function \({{{\textbf {F}}}}\) in (1) as in [9, 11, 35, 36]. Assume that the following conditions hold in the Frobenius norm \(\Vert \cdot \Vert =\Vert \cdot \Vert _F\):
-
\({{{\textbf {F}}}}\) is Lipschitz-continuous and bounded: for all \({{{\textbf {Y}}}}, {{{\textbf {Z}}}}\in \mathbb {R}^{m \times n}\) and \(0\le t \le T\),
$$\begin{aligned} \Vert {{{\textbf {F}}}}(t, {{{\textbf {Y}}}}) - {{{\textbf {F}}}}(t, {{{\textbf {Z}}}}) \Vert \le L \Vert {{{\textbf {Y}}}}- {{{\textbf {Z}}}}\Vert , \qquad \Vert {{{\textbf {F}}}}(t, {{{\textbf {Y}}}}) \Vert \le B. \end{aligned}$$(15) -
The normal component of \({{{\textbf {F}}}}(t, {{{\textbf {Y}}}})\) is small: with \(P_r^\perp ({{{\textbf {Y}}}})= I - P_r({{{\textbf {Y}}}})\), see (4),
$$\begin{aligned} \Vert P_r^\perp ({{{\textbf {Y}}}}) {{{\textbf {F}}}}(t, {{{\textbf {Y}}}}) \Vert \le \varepsilon _r \quad \text {and}\quad \Vert P_{{\hat{r}}}^\perp (\widehat{{{\textbf {Y}}}}) {{{\textbf {F}}}}(t, \widehat{{{\textbf {Y}}}}) \Vert \le \varepsilon _{{\hat{r}}} \end{aligned}$$(16)for all \({{{\textbf {Y}}}}\in \mathcal {M}_r\) and \(\widehat{{{\textbf {Y}}}}\in \mathcal {M}_{{\hat{r}}}\) in a neighbourhood of \({{{\textbf {A}}}}(t)\) and for \(0\le t \le T\).
Note that possibly \(\varepsilon _{{\hat{r}}} \ll \varepsilon _r\). Under these conditions we have the following local error bound for the midpoint BUG integrator of Sect. 3.
Theorem 1
(Local error bound) Assume \({{{\textbf {A}}}}(t_0)={{{\textbf {Y}}}}_0={{{\textbf {U}}}}_0{{{\textbf {S}}}}_0{{{\textbf {V}}}}_0^\top \) is of rank r. Then, the local error is bounded by
where C depends only on L and B in (15), on the bound of third derivatives of the exact solution \({{{\textbf {A}}}}(t)\) of (1), and on an upper bound of the stepsize h.
Remark 4
(Rank truncation) If \(\overline{{{{\textbf {Y}}}}}_1\) is truncated back to rank r to yield \({{{\textbf {Y}}}}_1\), then the error bound becomes
as is shown by the argument in the proof of [36, Lemma 3] (see inequality (4.2) there). On the other hand, rank truncation with a prescribed error tolerance \(\vartheta \) just adds an extra term \(\vartheta \) to the error bound in Theorem 1.
Remark 5
(Variants) For the Midpoint-BUG (3r) scheme (see Remark 1), there is the larger error bound \( \Vert \overline{{{{\textbf {Y}}}}}_1 - {{{\textbf {A}}}}(t_1) \Vert \le Ch(h^2 + \varepsilon _r). \) This is shown with essentially the same proof.
Proof
The proof of Theorem 1 is subdivided into three parts (a)–(c).
(a) We start from the fundamental theorem of calculus and the midpoint quadrature rule:
Using the known \(O(h(h+\varepsilon _r))\) local error bound of the augmented BUG integrator of [9] and the Lipschitz continuity of \({{{\textbf {F}}}}\), and further the \(\varepsilon _{{\hat{r}}}\)-bound for the normal component of \({{{\textbf {F}}}}\) at \(\widehat{{{\textbf {Y}}}}_{1/2}\), we find that
with the tangential component, see (4),
For the chosen augmented bases \(\overline{{{{\textbf {U}}}}}\) and \(\overline{{{{\textbf {V}}}}}\), in which both \(\widehat{{{\textbf {U}}}}_{1/2}\), \({{{\textbf {F}}}}(t_{1/2},\widehat{{{\textbf {Y}}}}_{1/2})\widehat{{{\textbf {V}}}}_{1/2}\) and \(\widehat{{{\textbf {V}}}}_{1/2}\), \({{{\textbf {F}}}}(t_{1/2},\widehat{{{\textbf {Y}}}}_{1/2})^\top \widehat{{{\textbf {U}}}}_{1/2}\) are included in \(\overline{{{{\textbf {U}}}}}\) and \(\overline{{{{\textbf {V}}}}}\), respectively, we obtain the key relations
With the shorthand notation
this yields the bounds
which further imply
(b) For
we thus have
Since \({{{\textbf {A}}}}(t_0)={{{\textbf {Y}}}}_0= {{{\textbf {U}}}}_0 {{{\textbf {S}}}}_0 {{{\textbf {V}}}}_0^\top \) and since the ranges of \({{{\textbf {U}}}}_0\) and \({{{\textbf {V}}}}_0\) are included in the ranges of \(\overline{{{{\textbf {U}}}}}\) and \(\overline{{{{\textbf {V}}}}}\), respectively, we have
Since \({{{\textbf {R}}}}\) has bounded third derivatives, we obtain
(c) We write
and we will show that
Since we already know that \({{{\textbf {R}}}}(t_1) = O(\mu )\), we will then have \(\overline{{{{\textbf {Y}}}}}_1 - {{{\textbf {A}}}}(t_1) = O(\mu )\). The proof of (17) adapts the proof of Lemma 4 of [11] to the present situation. We include the full self-contained proof for the convenience of the reader. For \(t_0\le t \le t_1\), let
We write
and
with the defect \( {{{\textbf {D}}}}(t):= {{{\textbf {F}}}}(t, \overline{{{{\textbf {U}}}}}{{\widetilde{{{{\textbf {S}}}}}}}(t) \overline{{{{\textbf {V}}}}}^\top + {\textbf {R}}(t)) - {{{\textbf {F}}}}(t, \overline{{{{\textbf {U}}}}}{{\widetilde{{{{\textbf {S}}}}}}}(t) \overline{{{{\textbf {V}}}}}^\top ). \) With the Lipschitz constant L of \({{{\textbf {F}}}}\) and the bound of \({{{\textbf {R}}}}(t)\) from part (b), the defect is bounded by
We compare the two differential equations with the same initial values,
With the Gronwall inequality we obtain
This yields (17) and hence the stated result. \(\square \)
The following result on the global error is obtained from Theorem 1 with the standard argument of Lady Windermere’s fan [29, II.3] with error propagation by the exact flow; cf. [9, 11, 35, 36].
Theorem 2
(Robust second-order global error bound) Let \({{{\textbf {A}}}}(t)\) denote the solution of the matrix differential Eq. (1). Assume that \({{{\textbf {F}}}}\) satisfies the bound and Lipschitz bound (15) and has small normal components as specified in (16) in a neighbourhood of \(t_n=nh\) for the ranks \(r=r_n\) and \(\widehat{r}=\widehat{r}_n\) chosen by the algorithm in the nth step with a truncation tolerance \(\vartheta \), for each n with \(0\le t_n\le T\). Assume further that the error in the initial value is \(\delta \)-small, i.e. \(\Vert {{{\textbf {Y}}}}_0 - {{{\textbf {A}}}}_0 \Vert \le \delta \).
Let \({{{\textbf {Y}}}}_n\) be the low-rank approximation to \({{{\textbf {A}}}}(t_n)\) at \(t_n=nh\) obtained after n steps of the midpoint BUG integrator with stepsize \(h>0\), with rank truncation after each step with tolerance \(\vartheta \). Then, the error satisfies for all n with \(t_n = nh \le T\)
where the constants \(c_i\) depend only on B, L, and T and on a bound of the third derivative of exact solutions \({{{\textbf {A}}}}(t)\) of the matrix differential Eq. (1) with initial values in a neighbourhood of \({{{\textbf {A}}}}_0\).
In particular, the constants are independent of singular values of the exact or approximate solution and are also independent of derivatives of \({{{\textbf {U}}}}(t)\) and \({{{\textbf {V}}}}(t)\) in (2)–(3), which can be large in the presence of small singular values.
The term \(n\vartheta \) in the error bound indicates that it is appropriate to choose the truncation tolerance \(\vartheta \) proportional to the stepsize h, i.e., \(\vartheta =h \theta \) with a fixed \(\theta \).
5 Numerical experiments
In this section, we present the results of various numerical experiments conducted using MATLAB R2023a and C++.
5.1 Heat equation
In the first example, we numerically approximate the solution \(u=u(t,x,y)\) of the heat equation with homogeneous Dirichlet boundary conditions
where \(t \in [0, T]\), and \((x,y) \in [-\pi , \pi ] \times [-\pi , \pi ] \). The initial value \(u_0(x,y)\) and the time-independent source term g(x, y) are provided as follows
We discretize in space using finite differences with \(N = 128\) uniform grid points in each direction. The final time is set to \(T=1\). The resulting discretized equation is thus given by
Denoting by \(\varDelta x\) and \(\varDelta y\) the mesh sizes of the discretization, we have
The factors of the source term and of the initial value are determined element-wise as follows
Here k ranges from 1 to 20, while i and j range from 1 to N; \(x_i\) and \(y_j\) denote the i-th and j-th elements of the space grids, respectively. The off-diagonal elements of \({{{\textbf {S}}}}_0 \in {{\mathbb {R}}}^{20 \times 20}\) are set to zero, and the factor \({{{\textbf {U}}}}_0 \in {{\mathbb {R}}}^{N \times 20}\) is orthonormalized using the scaling factor \(\sqrt{\varDelta x/ \pi }\). The solution of the Lyapunov differential Eq. (18) is obtained via the closed formula
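Since Eqs. (18) and (19) are not displayed above, the following sketch (Python/SciPy; an assumption on the form of (18), not the paper's exact display) illustrates the type of closed formula meant: for a Lyapunov-type equation \(\dot{{{{\textbf {A}}}}} = {{{\textbf {D}}}}{{{\textbf {A}}}}+ {{{\textbf {A}}}}{{{\textbf {D}}}}^\top + {{{\textbf {G}}}}\) with constant \({{{\textbf {D}}}}\) and \({{{\textbf {G}}}}\), variation of constants yields \({{{\textbf {A}}}}(t) = e^{t{{{\textbf {D}}}}}({{{\textbf {A}}}}_0 - {{{\textbf {X}}}})\,e^{t{{{\textbf {D}}}}^\top } + {{{\textbf {X}}}}\), where \({{{\textbf {X}}}}\) solves the Sylvester equation \({{{\textbf {D}}}}{{{\textbf {X}}}}+ {{{\textbf {X}}}}{{{\textbf {D}}}}^\top = -{{{\textbf {G}}}}\).

```python
import numpy as np
from scipy.linalg import expm, solve_sylvester
from scipy.integrate import solve_ivp

# Small Lyapunov-type model problem A' = D A + A D^T + G (assumed form of
# the elided Eq. (18)); D is a 1D Dirichlet finite-difference Laplacian.
N, T = 8, 1.0
dx = 2 * np.pi / (N + 1)
D = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
     + np.diag(np.ones(N - 1), -1)) / dx**2
rng = np.random.default_rng(2)
A0 = rng.standard_normal((N, N))
G = rng.standard_normal((N, N))

# Closed formula: A(t) = e^{tD} (A0 - X) e^{tD^T} + X, where X solves the
# Sylvester equation D X + X D^T = -G (scipy solves A X + X B = Q).
X = solve_sylvester(D, D.T, -G)
E = expm(T * D)
A_closed = E @ (A0 - X) @ E.T + X

# Cross-check against a direct ODE solve of the vectorized system.
rhs = lambda t, a: (D @ a.reshape(N, N) + a.reshape(N, N) @ D.T + G).ravel()
sol = solve_ivp(rhs, (0.0, T), A0.ravel(), rtol=1e-10, atol=1e-12)
A_ref = sol.y[:, -1].reshape(N, N)
```

The analogous MATLAB computation uses expm and sylvester, as noted below for the substeps of the low-rank integrators.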
Fig. 2 Comparison, for the non-stiff test case, of the relative approximation errors measured in the Frobenius norm among the projected low-rank midpoint scheme following [36] and the different BUG integrators for various ranks and time-step sizes, with final time \(T=1\)
Fig. 3 Comparison, for the non-stiff test case, of the relative approximation errors measured in the Frobenius norm among the projected low-rank midpoint scheme following [36] and the different BUG integrators for various ranks and time-step sizes, with final time \(T=10\)
Fig. 4 Comparison of the absolute error in norm and energy conservation up to \(T=10\), using a time step of \(h=0.05\), for the rank-10 (top) and rank-20 (bottom) numerical approximations obtained with various numerical integrators. The projected midpoint low-rank (MPLR) method is indicated by a dashed line, while the augmented BUG and midpoint BUG schemes are depicted with solid lines
Fig. 5 Time evolution of the 10th and 20th singular values of the numerical reference solution of the discrete Schrödinger Eq. (20)
The exponential map and the solution of the Sylvester equation above are computed using the dedicated MATLAB routines expm and sylvester. Since each discretized differential equation appearing in both low-rank integrators can be rewritten in a similar form, the solutions of the K-, L-, and S-steps are obtained in the same way as (19), with the factors replaced appropriately. In Fig. 1, we show how the augmented BUG integrator compares with the two variants of the Midpoint BUG integrator outlined in Remark 1 for the ranks \(r=2, 4, 6, 8, 10\). After each time step, both algorithms are truncated to rank r, and we compute the absolute error in the Frobenius norm. Figure 1 shows that, at a moderate increase in computational cost relative to the augmented BUG integrator, the Midpoint BUG integrator and its variant provide second-order convergence in time until the approximability saturation level is reached, whereas the augmented BUG integrator retains only first-order accuracy in time for this stiff problem.
The projector-splitting integrator [42] (PSI) is not suitable in this context because of instabilities introduced by the backward S-step, in which the exponential map \(e^{-h {{\widetilde{{{{\textbf {D}}}}}}}_{xx}}\), or its action, must be computed for the projected stencil \({{\widetilde{{{{\textbf {D}}}}}}}_{xx}\).
Explicit projected Runge–Kutta (PRK) schemes [36] face severe time-step size restrictions due to the stiffness of the problem. A comparison with these methods is therefore deferred to the following numerical examples, where stiffness is either absent or leads only to mild stepsize restrictions.
5.2 Non-stiff numerical test case: a discrete Schrödinger equation
We consider a discrete Schrödinger equation; see e.g. [6] and (mainly for nonlinear discrete Schrödinger equations) also [1, 34]. The differential equation considered here is equipped with periodic boundary conditions and reads
where (with the first unit vector \({\textbf {e}}_1\) and the Nth unit vector \({\textbf {e}}_N\))
The right-hand side \({{{\textbf {H}}}}[\cdot ]\) is linear with a moderate operator norm. As an initial condition, we choose a discretized Gaussian \(u_0(x,y) = \exp \left( -\frac{1}{2}x^2 -\frac{1}{2} (y-1)^2\right) \), with N uniform grid points in each direction. After discretization, the initial value is normalized using the Frobenius norm.
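The displayed definition of \({{{\textbf {H}}}}[\cdot ]\) is not reproduced here. As an illustrative assumption, the following sketch (Python/SciPy; the paper's computations use MATLAB's ode45) takes \({{{\textbf {H}}}}[{{{\textbf {Y}}}}]= {{{\textbf {D}}}}{{{\textbf {Y}}}}+ {{{\textbf {Y}}}}{{{\textbf {D}}}}\) with \({{{\textbf {D}}}}\) a periodic second-difference stencil whose corner entries are built from \({\textbf {e}}_1\) and \({\textbf {e}}_N\), and checks that the resulting flow conserves the Frobenius norm of the solution.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Periodic second-difference stencil; the corner entries are written with
# the unit vectors e_1 and e_N as in the text. The exact H[.] of (20) is
# not reproduced above, so this particular form is an assumption.
N = 16
e1 = np.eye(N)[:, [0]]
eN = np.eye(N)[:, [-1]]
D = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
     + np.diag(np.ones(N - 1), -1) + e1 @ eN.T + eN @ e1.T)

# Discretized Gaussian initial value, normalized in the Frobenius norm.
x = np.linspace(-np.pi, np.pi, N, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
Y0 = np.exp(-0.5 * X**2 - 0.5 * (Y - 1.0) ** 2)
Y0 = Y0 / np.linalg.norm(Y0)

# i Y' = H[Y] = D Y + Y D, i.e. Y' = -i (D Y + Y D): with symmetric real D
# the right-hand side is Hermitian, so the Frobenius norm is conserved.
rhs = lambda t, y: (-1j * (D @ y.reshape(N, N) + y.reshape(N, N) @ D)).ravel()
sol = solve_ivp(rhs, (0.0, 1.0), Y0.ravel().astype(complex),
                rtol=1e-10, atol=1e-10)
Y1 = sol.y[:, -1].reshape(N, N)
```

The moderate norm of \({{{\textbf {D}}}}\) (here bounded by 4, independent of N) is what makes this test case non-stiff.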
The reference solution is computed with the MATLAB solver ode45 and strict tolerance parameters {’RelTol’, 1e-10, ’AbsTol’, 1e-10}. The time integration for the intermediate K-, L-, and S-steps of the BUG integrators is also carried out with the ode45 solver and the same tolerance parameters. After each step of the BUG integrators, the numerical approximation is truncated back to its original rank via a singular value decomposition. A comparison of the global relative error, measured in the Frobenius norm for ranks \(r \in \{2,4,6,8,10\}\) and \(N=128\), is presented in Fig. 2 up to the final time \(T=1\). The same numerical experiment is also performed up to the final time \(T=10\), as shown in Fig. 3. In addition to the augmented and Midpoint BUG integrators, we also compare the reference solution to a Midpoint Projected Low-Rank (MPLR) integrator applied directly to the system (3), following the projection approach of [36] based on Runge’s second-order midpoint method. In Fig. 4, the conservation of energy and norm by the different integrators is illustrated for ranks \(r\in \{10, 20\}\) up to the final time \(T=10\), using a moderately large time-step size of \(h=0.05\).
The Midpoint BUG integrator, along with its variant, achieves the expected order of convergence. It is interesting to observe that the augmented BUG integrator also numerically exhibits second-order accuracy, a behaviour that is currently not fully understood. The projected low-rank midpoint method, while second-order accurate, exhibits a larger error than the BUG methods. This behaviour becomes more pronounced over longer time intervals, as seen in Fig. 3. Furthermore, as discussed in Remark 3, both the augmented BUG and the midpoint BUG integrators preserve energy and norm up to an error proportional to the error introduced by truncating back to the original rank; see Fig. 5 for the time evolution of the 10th and 20th singular values of the numerical reference solution, which corresponds well with the error plots in Fig. 4.
5.3 The Vlasov-Poisson equation
In the last example, we consider the 1x1v Vlasov-Poisson equation. Let \(f=f(t,x,v)\) be the solution of
For the electric field, we assume the existence of a potential \(\varphi \) such that \(E(f) = -\nabla _x \varphi \). Consequently, the curl free condition is naturally satisfied, implying that
For specific information regarding the spatial and velocity discretizations, as well as the use of robust numerical integrators for dynamical low-rank integration, we refer to [24]. In this framework, the domain is defined as \((x, v) \in [0, 4\pi ] \times [-6, 6]\), equipped with periodic boundary conditions. The time evolution is performed until \(T = 10\). We discretize both in space and velocity using a uniform grid with \(N = 128\) points in each direction. The Poisson Eq. (22) is solved in Fourier space using the Fast Fourier Transform (FFT). Each K-, L-, and S-substep of the augmented and Midpoint BUG integrators is solved accurately using the time-integration method DOPRI5.
Since no analytical solution is available for (21), the reference solution has been obtained using the full-order model solver proposed in [18, 19]. Specifically, Strang splitting with a time step size of \(h=10^{-4}\) is employed, the Poisson problem is solved using FFT, and 512 degrees of freedom are utilized in both spatial and velocity dimensions. A fourth-order semi-Lagrangian discontinuous Galerkin method is applied in both the spatial and velocity domain.
Figure 6 shows convergence plots for the augmented BUG integrator and both variants of the Midpoint BUG integrator for the ranks \(r=3, 5, 10\). Additionally, we include a comparison with the standard fixed-rank projector-splitting integrator in both its Lie and Strang formulations. Both variants of the Midpoint BUG integrator and the Strang projector-splitting integrator show second-order convergence. The accuracy of the Midpoint BUG (4r) integrator is roughly equal to that of the Strang projector-splitting integrator but (consistent with the analysis) better than that of the Midpoint BUG (3r) variant. For the augmented BUG integrator and the Lie projector-splitting integrator we observe first-order convergence.
References
Ablowitz, M.J., Prinari, B., Trubatch, A.D.: Discrete and continuous nonlinear Schrödinger systems. Cambridge University Press, Cambridge (2004)
Ali, W.H., Lermusiaux, P.F.: Dynamically orthogonal narrow-angle parabolic equations for stochastic underwater sound propagation. Part II: Applications. J. Acoust. Soc. Amer. 155(1), 656–672 (2024)
Babaee, H., Choi, M., Sapsis, T.P., Karniadakis, G.E.: A robust bi-orthogonal/dynamically-orthogonal method using the covariance pseudo-inverse with application to stochastic flow problems. J. Comput. Phys. 344, 303–319 (2017)
Baumann, L., Einkemmer, L., Klingenberg, C., Kusch, J.: Energy stable and conservative dynamical low-rank approximation for the Su-Olson problem. SIAM J. Sci. Comput. 46(2), B137–B158 (2024)
Billaud-Friess, M., Falcó, A., Nouy, A.: A new splitting algorithm for dynamical low-rank approximation motivated by the fibre bundle structure of matrix manifolds. BIT Numer. Math. 62, 387–408 (2022)
Boykin, T.B., Klimeck, G.: The discretized Schrödinger equation and simple models for semiconductor quantum wells. European J. Phys. 25(4), 503 (2004)
Carrel, B., Vandereycken, B.: Projected exponential methods for stiff dynamical low-rank approximation problems. arXiv preprint arXiv:2312.00172 (2023)
Cassini, F., Einkemmer, L.: Efficient 6d Vlasov simulation using the dynamical low-rank framework Ensign. Comput. Phys. Commun. 280, 108489 (2022)
Ceruti, G., Kusch, J., Lubich, C.: A rank-adaptive robust integrator for dynamical low-rank approximation. BIT Numer. Math. 62(4), 1149–1174 (2022)
Ceruti, G., Kusch, J., Lubich, C.: A parallel rank-adaptive integrator for dynamical low-rank approximation. SIAM J. Sci. Comput. 46(3), B205–B228 (2024)
Ceruti, G., Lubich, C.: An unconventional robust integrator for dynamical low-rank approximation. BIT Numer. Math. 62(1), 23–44 (2022)
Ceruti, G., Lubich, C., Sulz, D.: Rank-adaptive time integration of tree tensor networks. SIAM J. Numer. Anal. 61(1), 194–222 (2023)
Charous, A., Lermusiaux, P.F.: Dynamically orthogonal Runge-Kutta schemes with perturbative retractions for the dynamical low-rank approximation. SIAM J. Sci. Comput. 45(2), A872–A897 (2023)
Coughlin, J., Hu, J.: Efficient dynamical low-rank approximation for the Vlasov-Ampère-Fokker-Planck system. J. Comput. Phys. 470, 111590 (2022)
Coughlin, J., Hu, J., Shumlak, U.: Robust and conservative dynamical low-rank methods for the Vlasov equation via a novel macro-micro decomposition. J. Comput. Phys. 509, 113055 (2024)
Ding, Z., Einkemmer, L., Li, Q.: Dynamical low-rank integrator for the linear Boltzmann equation: error analysis in the diffusion limit. SIAM J. Numer. Anal. 59(4), 2254–2285 (2021)
Donello, M., Palkar, G., Naderi, M.H., Del Rey Fernández, D.C., Babaee, H.: Oblique projection for scalable rank-adaptive reduced-order modelling of nonlinear stochastic partial differential equations with time-dependent bases. Proc. A. 479(2278), 20230320 (2023)
Einkemmer, L.: High performance computing aspects of a dimension independent semi-Lagrangian discontinuous Galerkin code. Comput. Phys. Commun. 202, 326–336 (2016)
Einkemmer, L.: A performance comparison of semi-Lagrangian discontinuous Galerkin and spline based Vlasov solvers in four dimensions. J. Comput. Phys. 376, 937–951 (2019)
Einkemmer, L.: Accelerating the simulation of kinetic shear Alfvén waves with a dynamical low-rank approximation. J. Comput. Phys. 501, 112757 (2024)
Einkemmer, L., Hu, J., Kusch, J.: Asymptotic-preserving and energy stable dynamical low-rank approximation. SIAM J. Numer. Anal. 62(1), 73–92 (2024)
Einkemmer, L., Joseph, I.: A mass, momentum, and energy conservative dynamical low-rank scheme for the Vlasov equation. J. Comput. Phys. 443, 110495 (2021)
Einkemmer, L., Kusch, J., Schotthöfer, S.: Conservation properties of the augmented basis update & Galerkin integrator for kinetic problems. (2023)
Einkemmer, L., Lubich, C.: A low-rank projector-splitting integrator for the Vlasov-Poisson equation. SIAM J. Sci. Comput. 40(5), B1330–B1360 (2018)
Einkemmer, L., Mangott, J., Prugger, M.: A low-rank complexity reduction algorithm for the high-dimensional kinetic chemical master equation. J. Comput. Phys. 503, 112827 (2024)
Einkemmer, L., Ostermann, A., Piazzola, C.: A low-rank projector-splitting integrator for the Vlasov-Maxwell equations with divergence correction. J. Comput. Phys. 403, 109063 (2020)
Einkemmer, L., Ostermann, A., Scalone, C.: A robust and conservative dynamical low-rank algorithm. J. Comput. Phys. 484, 112060 (2023)
Feppon, F., Lermusiaux, P.F.: Dynamically orthogonal numerical schemes for efficient stochastic advection and Lagrangian transport. SIAM Rev. 60(3), 595–625 (2018)
Hairer, E., Nørsett, S. P., Wanner, G.: Solving ordinary differential equations. I. Nonstiff problems, volume 8 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, second edition, (1993)
Hesthaven, J.S., Pagliantini, C., Ripamonti, N.: Rank-adaptive structure-preserving model order reduction of Hamiltonian systems. ESAIM Math. Model. Numer. Anal. 56(2), 617–650 (2022)
Hochbruck, M., Neher, M., Schrammer, S.: Rank-adaptive dynamical low-rank integrators for first-order and second-order matrix differential equations. BIT Numer. Math. 63(1), 9 (2023)
Jahnke, T., Huisinga, W.: A dynamical low-rank approach to the chemical master equation. Bull. Math. Biol. 70(8), 2283–2302 (2008)
Kazashi, Y., Nobile, F.: Existence of dynamical low rank approximations for random semi-linear evolutionary equations on the maximal interval. Stoch. Partial Diff. Equ.: Anal. Comput. 9(3), 603–629 (2021)
Kevrekidis, P. G.: The discrete nonlinear Schrödinger equation: mathematical analysis, numerical computations and physical perspectives, volume 232. Springer Science & Business Media, (2009)
Kieri, E., Lubich, C., Walach, H.: Discretized dynamical low-rank approximation in the presence of small singular values. SIAM J. Numer. Anal. 54(2), 1020–1038 (2016)
Kieri, E., Vandereycken, B.: Projection methods for dynamical low-rank approximation of high-dimensional problems. Comput. Meth. Appl. Math. 19(1), 73–92 (2019)
Koch, O., Lubich, C.: Dynamical low-rank approximation. SIAM J. Matrix Anal. Appl. 29(2), 434–454 (2007)
Koellermeier, J., Krah, P., Kusch, J.: Macro-micro decomposition for consistent and conservative model order reduction of hyperbolic shallow water moment equations: A study using POD-Galerkin and dynamical low rank approximation. arXiv preprint arXiv:2302.01391 (2023)
Kusch, J., Ceruti, G., Einkemmer, L., Frank, M.: Dynamical low-rank approximation for Burgers’ equation with uncertainty. Int. J. Uncertain. Quantif. (2021)
Kusch, J., Stammer, P.: A robust collision source method for rank adaptive dynamical low-rank approximation in radiation therapy. ESAIM: Math. Model. Numer. Anal. 57(2), 865–891 (2023)
Kusch, J., Whewell, B., McClarren, R., Frank, M.: A low-rank power iteration scheme for neutron transport criticality problems. J. Comput. Phys. 470, 111587 (2022)
Lubich, C., Oseledets, I.V.: A projector-splitting integrator for dynamical low-rank approximation. BIT Numer. Math. 54(1), 171–188 (2014)
Musharbash, E., Nobile, F.: Dual dynamically orthogonal approximation of incompressible Navier-Stokes equations with random boundary conditions. J. Comput. Phys. 354, 135–162 (2018)
Musharbash, E., Nobile, F., Vidličková, E.: Symplectic dynamical low rank approximation of wave equations with random parameters. BIT Numer. Math. 60, 1153–1201 (2020)
Nakao, J., Qiu, J.-M., Einkemmer, L.: Reduced Augmentation Implicit Low-rank (RAIL) integrators for advection-diffusion and Fokker–Planck models. arXiv preprint arXiv:2311.15143 (2023)
Patil, P., Babaee, H.: Real-time reduced-order modeling of stochastic partial differential equations via time-dependent subspaces. J. Comput. Phys. 415, 109511 (2020)
Peng, Z., McClarren, R.G.: A high-order/low-order (holo) algorithm for preserving conservation in time-dependent low-rank transport calculations. J. Comput. Phys. 447, 110672 (2021)
Peng, Z., McClarren, R.G.: A sweep-based low-rank method for the discrete ordinate transport equation. J. Comput. Phys. 473, 111748 (2023)
Peng, Z., McClarren, R.G., Frank, M.: A low-rank method for two-dimensional time-dependent radiation transport calculations. J. Comput. Phys. 421, 109735 (2020)
Prugger, M., Einkemmer, L., Lopez, C.: A dynamical low-rank approach to solve the chemical master equation for biological reaction networks. J. Comput. Phys. 489, 112250 (2023)
Sapsis, T.P., Lermusiaux, P.F.: Dynamically orthogonal field equations for continuous stochastic dynamical systems. Physica D 238(23–24), 2347–2360 (2009)
Savostianova, D., Zangrando, E., Ceruti, G., Tudisco, F.: Robust low-rank training via approximate orthonormal constraints. arXiv preprint arXiv:2306.01485 (2023)
Schmidt, J., Hennig, P., Nick, J., Tronarp, F.: The rank-reduced Kalman filter: Approximate dynamical-low-rank filtering in high dimensions. arXiv preprint arXiv:2306.07774 (2023)
Schotthöfer, S., Zangrando, E., Kusch, J., Ceruti, G., Tudisco, F.: Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations. Adv. Neural Inf. Process. Syst. 35, 20051–20063 (2022)
Seguin, A., Ceruti, G., Kressner, D.: From low-rank retractions to dynamical low-rank approximation and back. arXiv preprint (2023)
[…] and mass terms. arXiv preprint arXiv:2308.08888 (2023)
Zangrando, E., Schotthöfer, S., Ceruti, G., Kusch, J., Tudisco, F.: Rank-adaptive spectral pruning of convolutional layers during training. arXiv preprint arXiv:2305.19059 (2023)
Acknowledgements
C.L. was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - TRR 352 - Project-ID 470903074.
Funding
Open access funding provided by Norwegian University of Life Sciences
Additional information
Communicated by Antonella Zanna Munthe-Kaas.
Cite this article
Ceruti, G., Einkemmer, L., Kusch, J. et al.: A robust second-order low-rank BUG integrator based on the midpoint rule. BIT Numer. Math. 64, 30 (2024). https://doi.org/10.1007/s10543-024-01032-x