2.1 Motivation and Basic Assumptions

In this chapter, we introduce the solution space for high-fidelity models based on partial differential equations and the finite element method. The manifold learning approach to model order reduction requires simulated data. Hence, learning projection-based reduced order models (ROMs) involves two steps: (i) an offline step for the computation of simulated data and for the consecutive machine learning tasks, and (ii) an online step where the reduced order model is used as a surrogate for the high-fidelity model. The offline step generates a train set and a validation set of simulated data. The accuracy and the generalisation ability of the reduced order model are evaluated in the online step by using a test set of data forecast by the high-fidelity model. The test set also aims to check the computational speedup of the reduced-order model compared to the high-fidelity model.

Learning projection-based reduced order models makes sense only if a significant computational speedup is obtained at the price of an acceptable loss of accuracy in the predictions. The longer the computational time of the high-fidelity model, the smaller the acceptable speedup: the approach is already worthwhile if it saves hours or days of numerical simulations. Regarding the acceptable accuracy of reduced predictions, engineers working in industry and scientists working in laboratories do not have the same expectations. In our own experience, learning projection-based reduced order models has the capability to adapt high-fidelity models elaborated in laboratories to engineering tasks, in terms of accuracy and computational time. This approach contributes to data continuity between physics-based knowledge developed in laboratories and practical applications for engineering tasks.

From the mathematical point of view, Céa’s lemma gives an overview of manifold learning for model order reduction applied to elliptic equations. Let \(\mathcal {V}\) be a real Hilbert space. The weak form of the elliptic equation reads: find \(\widetilde{u}\in \mathcal {V}\) such that:

$$\begin{aligned} a(\widetilde{u},\widetilde{v}) = L(\widetilde{v}), \quad \forall \, \widetilde{v}\in \mathcal {V}, \end{aligned}$$
(2.1)

where \(a(\cdot ,\cdot )\) is a bilinear form, with coercivity constant \(\beta > 0\) and continuity constant \(C^a > 0\), and \(L(\cdot )\) is a linear form. Section 2.2 gives more details on weak forms. Let \(\mathcal {V}_h\) be a finite-dimensional subspace of \(\mathcal {V}\). Here, \(\mathcal {V}_h\) is an approximate solution space for the elliptic equation. The approximate solution of the elliptic equation is denoted by \(u\in \mathcal {V}_h\subset \mathcal {V}\). The Galerkin projection of the elliptic equation onto this solution space reads: find \(u\in \mathcal {V}_h\subset \mathcal {V}\) such that:

$$\begin{aligned} a(u,v) = L(v), \quad \forall \, v\in \mathcal {V}_h, \end{aligned}$$
(2.2)

where \(\mathcal {V}_h\) has been substituted for \(\mathcal {V}\). Céa’s lemma states that:

$$\begin{aligned} \Vert \widetilde{u}- u\Vert \le \frac{C^a}{\beta } \, \Vert \widetilde{u}- v\Vert , \quad \forall \, v\in \mathcal {V}_h, \end{aligned}$$
(2.3)

where \(\Vert \cdot \Vert \) is a norm in \(\mathcal {V}\). The related scalar product is denoted by \(\langle \cdot , \cdot \rangle \). In finite element models, \(\mathcal {V}_h\) is the span of the finite element shape functions, but Céa’s lemma holds for any finite-dimensional subspace of \(\mathcal {V}\). The closer \(\widetilde{u}\) is to the solution space \(\mathcal {V}_h\), the smaller the right-hand side of Céa’s lemma (2.3), and therefore the better the prediction \(u\) in Eq. (2.2) one can expect, although \(\widetilde{u}\) is not known.

In a few words, a reduced-order model is obtained by introducing a solution space \(\mathcal {V}_n\subset \mathcal {V}_h\) of smaller dimension \(n < N\), where N is the dimension of \(\mathcal {V}_h\). A projection-based reduced order model can be achieved by using the Galerkin projection (2.2), where \(\mathcal {V}_n\) is substituted for \(\mathcal {V}_h\). The conclusion of Céa’s lemma holds again: the closer \(\widetilde{u}\) is to the solution space \(\mathcal {V}_n\), the better the prediction \(\widehat{u}\in \mathcal {V}_n\). Manifold learning comes into play when we are given a set of predictions \((u^{(i)})_{i=1,\ldots , m}\) in a common ambient space \(\mathcal {V}_h\), related to a given finite element mesh. The basic assumptions in manifold learning for model order reduction are:

  • a latent space of reduced dimension is hidden in the data \((u^{(i)})_{i=1,\ldots , m}\), its dimension is denoted by n,

  • a machine learning algorithm is available to learn this latent space by using a train set of simulated data extracted from \((u^{(i)})_{i=1,\ldots , m}\),

  • the distance between \(\widetilde{u}\) and this latent space is small enough, although we do not know \(\widetilde{u}\),

  • a numerical scheme enables the projection of the elliptic equation onto the latent space, in order to set up the reduced order model,

  • the computational complexity of the solution of the reduced order model is smaller than the computational complexity of the finite element prediction,

  • the computational complexity of the reduced order model is an increasing function of n.

As explained above, when the latent space is a vector subspace \(\mathcal {V}_n\), both the Galerkin projection and Céa’s lemma hold, but a simulation speedup may not be achieved. The study of more complex situations is the purpose of this chapter. An estimation of the computational complexity of projection-based reduced order models is proposed in Sect. 2.3.6.

Remarks:

  • In the Rayleigh-Ritz method, a small set of trial functions that satisfy the boundary conditions for \(\widetilde{u}\) is introduced to span a solution space. This solution space is not related to any finite element model. The inclusion of the latent space in the ambient space \(\mathcal {V}_h\) is essential for model order reduction.

  • The finite element ambient space \(\mathcal {V}_h\) incorporates homogeneous Dirichlet boundary conditions (\(\widetilde{u}= 0\)) on a part of the boundary of the domain where the partial differential equations are set up. When such boundary conditions change over the instances \((\widetilde{u}_N^{(i)})_{i=1,\ldots , m}\), these conditions must be taken into account as linear constraints that supplement the partial differential equation. Such an issue appears when considering contact problems [38], for instance.

  • Important limitations of projection-based model reduction methods include situations where the geometry has to be handled in the exploitation phase of the reduced-order models, for instance when the problem features contact boundary conditions or crack propagation, or when the geometry itself is one of the variabilities of the problem to learn. Geometrical variabilities are handled in the authors’ works [1, 2, 22, 60, 61, 92].

2.2 High-Fidelity Model (HFM)

Consider an abstract partial differential equation in a domain \(\Omega \), with a \(\mu \)-variability:

$$\begin{aligned} \mathcal {D}(\widetilde{u};\boldsymbol{\xi }, {\boldsymbol{\mu }})=0, \quad \boldsymbol{\xi }\in \Omega , \, \widetilde{u}\in \mathcal {V}. \end{aligned}$$
(2.4)

The weak form of this partial differential equation reads: find \(\widetilde{u}\in \mathcal {V}\) such that

$$\begin{aligned} \int _\Omega \widetilde{v}\, \mathcal {D}(\widetilde{u};\boldsymbol{\xi }, {\boldsymbol{\mu }}) \, d\boldsymbol{\xi }=0, \quad \forall \, \widetilde{v}\in \mathcal {V}. \end{aligned}$$
(2.5)

The concepts of this chapter are illustrated on a nonlinear structural mechanics problem, for which details on the high-fidelity model are provided in this section. For an example involving another physics, we refer to the authors’ work [21], where a nonlinear transient thermal problem is considered.

The mechanical structure occupies the domain \(\Omega \), whose boundary \(\partial \Omega \) is partitioned as \(\partial \Omega =\partial \Omega _D\cup \partial \Omega _N\) such that \(\partial \Omega _D^{\textrm{o}}\cap \partial \Omega _N^{\textrm{o}}=\emptyset \), see Fig. 2.1.

Fig. 2.1
Schematic representation of the structure of interest [18]

The structure is subjected to a quasi-static time-dependent loading, composed of homogeneous Dirichlet boundary conditions on \(\partial \Omega _D\) and Neumann boundary conditions on \(\partial \Omega _N\) in the form of a prescribed traction \(T_N\), as well as a body force f. The setting depends on some variability \(\mu \), which can be a parameter vector or represent some nonparametrized variability. The evolution of the displacement \(\widetilde{u}^{\boldsymbol{\mu }}(\boldsymbol{\xi },t)\) over \((\boldsymbol{\xi },t)\in \Omega \times [0,T]\) is the solution of the following equations:

$$\begin{aligned} \begin{aligned} &{\boldsymbol{\epsilon }}(\widetilde{u}^{\boldsymbol{\mu }})=\frac{1}{2}\left( \nabla \widetilde{u}^{\boldsymbol{\mu }}+ (\nabla \widetilde{u}^{\boldsymbol{\mu }})^T\right) ,&\mathrm {~in~}\Omega \times [0,T],&\quad \mathrm {(compatibility~equation)}\\ &\textrm{div}\left( {\boldsymbol{\sigma }}_c^{\boldsymbol{\mu }}\right) +f^{\boldsymbol{\mu }}=0,&\mathrm {~in~}\Omega \times [0,T],&\quad \mathrm {(equilibrium~equation)}\\ &{\boldsymbol{\sigma }}_c^{\boldsymbol{\mu }}={\boldsymbol{\sigma }}_c({\boldsymbol{\epsilon }}(\widetilde{u}^{\boldsymbol{\mu }}),y^\mu ),&\mathrm {~in~}\Omega \times [0,T],&\quad \mathrm {(constitutive~law)}\\ &\widetilde{u}^{\boldsymbol{\mu }}=0,&\mathrm {~in~}\partial \Omega _D\times [0,T],&\quad \mathrm {(prescribed~zero~displacement)}\\ &{\boldsymbol{\sigma }}_c^{\boldsymbol{\mu }}\cdot n_{\partial \Omega }=T_N^{{\boldsymbol{\mu }}},&\mathrm {~in~}\partial \Omega _N\times [0,T],&\quad \mathrm {(prescribed~traction)}\\ &\widetilde{u}^{\boldsymbol{\mu }}=0,y^{\boldsymbol{\mu }}=0,&\mathrm {~in~}\Omega ~\mathrm{at~t=0},&\quad \mathrm {(initial~condition)} \end{aligned} \end{aligned}$$
(2.6)

where \({\boldsymbol{\epsilon }}\) is the linear strain tensor, \({\boldsymbol{\sigma }}_c^{\boldsymbol{\mu }}\) is the Cauchy stress tensor, \(y^{\boldsymbol{\mu }}\) denotes the internal variables of the constitutive law and \(n_{\partial \Omega }\) is the outward normal vector on \(\partial \Omega \). We point out that the internal variables \(y^{\boldsymbol{\mu }}\) are updated when the constitutive law is solved.

Define \(H^1_0(\Omega )=\{v\in L^2(\Omega )|~\frac{\partial v}{\partial \xi _i}\in L^2(\Omega ),~1\le i\le 3\mathrm{~and~}v|_{\partial \Omega _D}=0\}\). Denote \(\{\varphi _i\}_{1\le i\le N}\in \mathbb {R}^{N\times N}\), a finite element basis whose span, denoted \(\mathcal {V}_h\), constitutes an approximation of \(H^1_0(\Omega )^3\); N is the number of finite element basis functions, hence the number of degrees of freedom of the discretized prediction. A discretized weak formulation reads: find \(u^{\boldsymbol{\mu }}\in \mathcal {V}_h\) such that for all \(v\in \mathcal {V}_h\),

$$\begin{aligned} \int _{\Omega }{\boldsymbol{\sigma }}_c({\boldsymbol{\epsilon }}(u^{\boldsymbol{\mu }}),y):{\boldsymbol{\epsilon }}(v)=\int _{\Omega }f^{\boldsymbol{\mu }}\cdot v+\int _{\partial \Omega _N}T^{\boldsymbol{\mu }}_{N}\cdot v, \end{aligned}$$
(2.7)

that we denote for concision \(\textbf{F}^{\boldsymbol{\mu }}\left( \textbf{u}^{\boldsymbol{\mu }}\right) =0\), where \(\textbf{u}^{\boldsymbol{\mu }}\) is the vector of N coordinates for \(u^{\boldsymbol{\mu }}\in \mathcal {V}_h\). A Newton algorithm can be used to solve this nonlinear global equilibrium problem at each time step:

$$\begin{aligned} \displaystyle \frac{D \textbf{F}^{\boldsymbol{\mu }}}{Du}\left( \textbf{u}^{\mu ,k}\right) \left( \textbf{u}^{{\boldsymbol{\mu }},k+1}-\boldsymbol{\textbf{u}}^{{\boldsymbol{\mu }},k}\right) =-\textbf{F}^{\boldsymbol{\mu }}\left( \textbf{u}^{{\boldsymbol{\mu }},k}\right) , \end{aligned}$$
(2.8)

where

$$\begin{aligned} \displaystyle {\frac{D \textbf{F}^{\boldsymbol{\mu }}}{Du}\left( \textbf{u}^k\right) }_{ij}=\int _{\Omega }{\boldsymbol{\epsilon }}\left( \varphi _j\right) :\mathcal {K}\left( {\boldsymbol{\epsilon }}(u^{{\boldsymbol{\mu }},k}),y^{\boldsymbol{\mu }}\right) :{\boldsymbol{\epsilon }}\left( \varphi _i\right) , \end{aligned}$$
(2.9)

and

$$\begin{aligned} \displaystyle {\textbf{F}^{\boldsymbol{\mu }}\left( \textbf{u}^{{\boldsymbol{\mu }},k}\right) }_i=\int _\Omega {\boldsymbol{\sigma }}_c\left( {\boldsymbol{\epsilon }}(u^{{\boldsymbol{\mu }},k}),y^{\boldsymbol{\mu }}\right) :{\boldsymbol{\epsilon }}\left( \varphi _i\right) -\int _\Omega f^{\boldsymbol{\mu }}\cdot \varphi _i-\int _{\partial \Omega _N}T^{\boldsymbol{\mu }}_N\cdot \varphi _i. \end{aligned}$$
(2.10)

In the two relations above, \(\mathcal {K}\left( {\boldsymbol{\epsilon }}(u^{{\boldsymbol{\mu }},k}),y^{\boldsymbol{\mu }}\right) \) is the local tangent operator, \(u^{{\boldsymbol{\mu }},k}\in \mathcal {V}\) is the k-th iteration of the discretized displacement field at the current time-step, and \(\textbf{u}^{{\boldsymbol{\mu }},k}=\left( u_i^{{\boldsymbol{\mu }},k}\right) _{1\le i\le N}\in \mathbb {R}^N\) is such that \(\displaystyle u^{{\boldsymbol{\mu }},k}=\sum _{i=1}^N u_i^{{\boldsymbol{\mu }},k} \varphi _i\). In particular, \(f^{\boldsymbol{\mu }}\), \(T_N^{\mu }\), \(u^{{\boldsymbol{\mu }},k}\) and \(y^{\boldsymbol{\mu }}\) are known and enforce the time-dependence of the solution. Depending on the constitutive law, the computation of the functions \(\displaystyle \left( u^{{\boldsymbol{\mu }},k},y\right) \mapsto {\boldsymbol{\sigma }}_c\left( {\boldsymbol{\epsilon }}(u^{{\boldsymbol{\mu }},k}),y^{\boldsymbol{\mu }}\right) \) and \(\displaystyle \left( u^{{\boldsymbol{\mu }},k},y^{\boldsymbol{\mu }}\right) \mapsto \mathcal {K}\left( {\boldsymbol{\epsilon }}(u^{{\boldsymbol{\mu }},k}),y^{\boldsymbol{\mu }}\right) \) can require the resolution of a complex differential-algebraic system of equations.

2.3 Linear Manifold Learning for Projection-Based Reduced-Order Modeling

We start by explaining the online phase. Since we want to construct and solve the reduced-order model (ROM) in the most efficient way, the offline phase is dedicated to precomputing as many steps as possible, under the considered variability.

Linear manifold learning means that the solution manifold is approximated by a vector subspace of the ambient solution space, as illustrated in Fig. 2.2.

Fig. 2.2
Linear manifold learning (a variability \(\mu \) with a small influence on u yields a small solution manifold \(u(\mu )\) in the solution space)

2.3.1 Approaches Preceding the Use of Machine Learning

In structural mechanics, normal modes have been introduced for the analysis of vibrations in structures. When considering free vibrations, without external forces, the solution of the linear hyperbolic equation is sought by using the separation of the space and time variables: \(\textbf{u}_N(\textbf{x},t) = {\boldsymbol{\psi }}(\textbf{x}) \, \widehat{u}(t)\), where \(\textbf{x}\in \Omega \) is the space variable and \(t \in \mathbb {R}\) the time variable. The hyperbolic equation of free vibrations reads: find \({\boldsymbol{\psi }}(\textbf{x})\in \mathcal {V}_N\) and \(\widehat{u}(t) \in \mathbb {R}\) such that \(\textbf{u}_N(\textbf{x},t) = {\boldsymbol{\psi }}(\textbf{x}) \, \widehat{u}(t)\) and

$$\begin{aligned} \langle \rho \, \ddot{\textbf{u}}_N,\textbf{v}_N \rangle + a(\textbf{u}_N,\textbf{v}_N) = 0, \quad \forall \, \textbf{v}_N \in \mathcal {V}_N. \end{aligned}$$
(2.11)

It follows that \(\widehat{u}(t)\) is a harmonic function of frequency denoted by f, and that \({\boldsymbol{\psi }}\) is the eigenvector related to the eigenvalue \(\lambda = (2 \, \pi \, f)^2\) of the following generalized eigenproblem: find \({\boldsymbol{\psi }}\in \mathcal {V}_N\) and \(\lambda \) such that

$$\begin{aligned} a({\boldsymbol{\psi }},\textbf{v}_N) - \lambda \, \langle \rho \, {\boldsymbol{\psi }},\textbf{v}_N \rangle = 0, \quad \forall \, \textbf{v}_N \in \mathcal {V}_N, \end{aligned}$$
(2.12)

where the rank of this system of equations is supposed to be \(N-1\) in order to find nonzero eigenmodes. This eigenproblem admits N orthogonal normal modes that span \(\mathcal {V}_N\). Therefore, a selection of n normal modes spans a reduced subspace of dimension n. At the beginning of the 21st century, model reduction using variable separation schemes in partial differential equations was extended to an arbitrary number of variables by using low-rank approximations such as the Proper Generalized Decomposition [5]. Eigenmodes are global functions, in contrast to finite element shape functions that have a local support. For dynamical problems involving nonlinear contributions to the PDE, adaptive computations of reduced subspaces have been proposed by Almroth et al. [4] and Noor et al. [79], by using Rayleigh-Ritz global functions. The set of these global functions is a reduced basis of the finite-element approximation space.

The idea of using statistics to generate a solution space for differential equations was proposed in the seminal work of Lorenz [69] (on page 31), by using empirical orthogonal functions. The Galerkin projection of PDEs on empirical modes was first developed in [13], where a reduced basis is computed via the proper orthogonal decomposition [70] of observational data. This was the first step towards manifold learning for projection-based model order reduction, which we now present. For other presentations of these techniques, the reader can refer to [81, 87].

2.3.2 Online Phase: Galerkin Projection

The reduced-order model is constructed in the form of a Galerkin method written on a Reduced-Order Basis (ROB). In the present case, it consists in assembling the physical problem in the same fashion as the HFM in Sect. 2.2, with the difference that the finite element basis \((\varphi _i)_{1\le i\le N}\in \mathbb {R}^{N\times N}\) is replaced by a ROB \((\psi _i)_{1\le i\le n}\in \mathbb {R}^{n\times N}\), with \(n\ll N\). Hence, the reduced Newton algorithm is constructed as

$$\begin{aligned} \displaystyle \frac{D\mathcal {F}_\mu }{D\hat{u}}\left( \hat{u}_\mu ^k\right) \left( {\boldsymbol{\gamma }}^{{\boldsymbol{\mu }},k+1}-{\boldsymbol{\gamma }}^{{\boldsymbol{\mu }},k}\right) =-\mathcal {F}_\mu \left( \hat{u}_\mu ^k\right) , \end{aligned}$$
(2.13)

where

$$\begin{aligned} \displaystyle {\frac{D\mathcal {F}_\mu }{D\hat{u}}\left( \hat{u}_\mu ^k\right) }_{ij}=\int _{\Omega }\epsilon \left( \psi _j\right) :\mathcal {K}\left( \epsilon (\hat{u}_\mu ^k),y_\mu \right) :\epsilon \left( \psi _i\right) \end{aligned}$$
(2.14)

and

$$\begin{aligned} \displaystyle {\mathcal {F}_\mu \left( \hat{u}_\mu ^k\right) }_i=\int _\Omega \sigma \left( \epsilon (\hat{u}_\mu ^k),y\right) :\epsilon \left( \psi _i\right) -\int _\Omega f_\mu \cdot \psi _i-\int _{\partial \Omega _N}T_{\mu ,N}\cdot \psi _i. \end{aligned}$$
(2.15)

In the two relations above, \(\hat{u}_\mu ^k\in \hat{\mathcal {V}}:=\textrm{Span}\left( \psi _i\right) _{1\le i\le n}\) is the k-th iteration of the reduced displacement field for the current time-step and \(\boldsymbol{\gamma }^{{\boldsymbol{\mu }},k}=\left( \gamma ^{{\boldsymbol{\mu }},k}_i\right) _{1\le i\le n}\in \mathbb {R}^n\) is such that

$$\begin{aligned} \hat{u}_\mu ^k=\sum _{i=1}^n\gamma ^{{\boldsymbol{\mu }},k}_i\psi _i. \end{aligned}$$
(2.16)

Notice that the use of the Galerkin method is made possible by the linearity of the tangent problem (2.13) and the choice of a linear dimensionality reduction technique in (2.16).

The online stage is called efficient if the reduced problem can be constructed and solved with a computational complexity independent of N. When the variability \(\mu \) is parametrized, efficiency is possible by precomputing various terms. With nonparametrized variability, depending on its nature, some assembling tasks with a linear complexity in N may be required at the beginning of the online stage (for instance for a boundary condition). All these scenarios are handled by genericROM, the ROM library developed at Safran and presented in Sect. 4.2.

The offline phase contains three steps, for which we present below the methodological choices made in genericROM.

2.3.3 Offline Phase

2.3.3.1 Data Generation

This step corresponds to the generation of the snapshots by solving the high-fidelity model. In parametric contexts, the simplest workflows consist in choosing parameter values a priori, following Design of Experiments (DoE, see [50] for a recent technique), and computing the corresponding snapshots by solving the high-fidelity model.

2.3.3.2 Data Compression: Dimensionality Reduction

This step corresponds to the generation of the ROB \((\psi _i)_{1\le i\le n}\). One of the most classical and simplest methods is the snapshot Proper Orthogonal Decomposition (POD) [25, 86], detailed below:

  1. Choose a tolerance \(\epsilon _\textrm{POD}\).

  2. Compute the correlation matrix \(C_{i,j}=\int _{\Omega }u_i\cdot u_j\), \(1\le i,j\le N_s\), where \(N_s\) is the total number of HFM snapshots.

  3. Compute the \(\epsilon _\textrm{POD}\)-truncated eigendecomposition of C: \(\xi _i\in \mathbb {R}^{N_s}\) and \(\lambda _i>0\), \(1\le i\le n\), are the first n eigenvectors and eigenvalues.

  4. Compute the reduced order basis \(\displaystyle \psi _i(x)=\frac{1}{\sqrt{\lambda _i}}\sum _{j=1}^{N_s}u_j(x){\xi _i}_j\), \(1\le i\le n\).

The advantages of the snapshot POD are a reasonable computational complexity when the number of degrees of freedom of the high-fidelity model is much larger than the number of snapshots, and the fact that the algorithm can easily be parallelized.
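The listing below is a minimal NumPy sketch of the snapshot POD described above; it is not the genericROM implementation, and the names `U` (snapshots stored column-wise), `M` (matrix of the \(L^2(\Omega )\) scalar product) and `tol_pod` are illustrative. The truncation uses a standard energy criterion on the eigenvalues.

```python
import numpy as np

def snapshot_pod(U, M, tol_pod):
    """Snapshot POD: U is (N, Ns) with one snapshot per column,
    M is the (N, N) matrix of the L2(Omega) scalar product."""
    # Correlation matrix C_ij = <u_i, u_j>, of size (Ns, Ns)
    C = U.T @ (M @ U)
    lam, xi = np.linalg.eigh(C)               # eigenvalues in ascending order
    lam, xi = lam[::-1], xi[:, ::-1]          # reorder in decreasing order
    # Keep the n first modes according to the energy criterion
    energy = np.cumsum(lam) / np.sum(lam)
    n = int(np.searchsorted(energy, 1.0 - tol_pod)) + 1
    # psi_i = (1 / sqrt(lambda_i)) * sum_j u_j (xi_i)_j, stored column-wise
    V = (U @ xi[:, :n]) / np.sqrt(lam[:n])
    return V, lam[:n]
```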

Variants can be used, for instance in the Reduced Basis [84] and the POD-greedy [41] methods, with respectively an orthonormalization of the computed high-fidelity snapshots and the incremental Singular Value Decomposition (SVD) [15].

2.3.3.3 Operator Compression

A ROM is called online-efficient if, in the online stage, the reduced problems can be constructed and solved with a computational complexity independent of N. The operator compression step consists in the additional treatments required for the efficiency of the online stage, obtained by pre-processing some computationally demanding integration tasks over the high-fidelity domain \(\Omega \) and the boundary part \(\partial \Omega _N\). Without any additional treatment, the numerical integration involved in the assembling of Eq. (2.13) strongly limits the efficiency of the ROM: in practice, no speedup with respect to the high-fidelity model can be obtained. The complexity of such an additional treatment depends on the type of parameter-dependence of the problem. This step is actually needed for all classes of problems reduced by projection-based methods.

Consider the simplest case: a linear problem with an affine dependence on the parameter \(\mu \), for instance \(A_\mu \textbf{u} =\textbf{c}\), where \(A_\mu = A_0+\mu A_1\). Denote by V the matrix whose columns are the vectors of the ROB evaluated at the high-fidelity degrees of freedom. The obtained ROM reads \(\textbf{V}^T A_\mu \textbf{V}\hat{\textbf{u}} =\textbf{V}^T \textbf{c}\): it is not assembled in the online phase; rather, the matrices \(\textbf{V}^T A_0\textbf{V}\) and \(\textbf{V}^T A_1 \textbf{V}\) and the vector \(\textbf{V}^T \textbf{c}\) are precomputed in the offline stage, so that the reduced problem is constructed without approximation and efficiently by summing two small matrices. The operator compression step consists, in this case, in the construction of \(\textbf{V}^T A_0\textbf{V}\) and \(\textbf{V}^T A_1 \textbf{V}\) in the offline stage, as illustrated in the sketch below.
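This minimal NumPy sketch assumes that the high-fidelity matrices `A0`, `A1`, the right-hand side `c` and the ROB matrix `V` are given; all names are illustrative.

```python
import numpy as np

def offline_compress(A0, A1, c, V):
    """Offline stage: project the affine terms once for all (operator compression)."""
    return V.T @ A0 @ V, V.T @ A1 @ V, V.T @ c

def online_solve(A0r, A1r, cr, V, mu):
    """Online stage: assembly and solution with a complexity independent of N."""
    gamma = np.linalg.solve(A0r + mu * A1r, cr)   # reduced coordinates
    return V @ gamma                              # lift back to the finite element space
```

The only online operation whose cost depends on N is the optional final lift `V @ gamma`, which is a post-processing step.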

Actually, there exist linear problems for which the operator compression step requires an additional approximation, and nonlinear problems for which it can be carried out exactly. In the first case, consider \(A_\mu \textbf{u}=\textbf{c}\) with \(\displaystyle A_{ij}=\int _{\Omega } \nabla \left( g(x,\mu )\varphi _j(x)\right) \cdot \nabla \varphi _i(x)\) and \(c_i=\int _{\Omega }f(x)\varphi _i(x)\), where u is the unknown, f a known loading and \(g(x,\mu )\) a known function whose variables x and \(\mu \) cannot be separated: the previous precomputation of reduced matrices cannot be applied, and a treatment is required to, for example, approximately separate the dependencies of g on x and \(\mu \) as \(g(x,\mu )\approx \sum _{k=1}^d g^a_k(x)g^b_k(\mu )\). Then, \(\textbf{V}^T A_\mu \textbf{V}\approx \sum _{k=1}^d g^b_k(\mu ) A_k\), where \(\displaystyle \left( A_k\right) _{ij}=\int _{\Omega } \nabla \left( g^a_k(x)\varphi _j(x)\right) \cdot \nabla \varphi _i(x)\), so that the efficiency of the online stage is recovered; the Empirical Interpolation Method has been proposed in [14, 71] for this purpose. A linear problem in harmonic aeroacoustics with nonaffine dependence with respect to the frequency is presented in [20]. Conversely, nonlinearities can be handled without approximation in some cases: for instance, the advection term in fluid dynamics can be precomputed in the form of an order-3 tensor \(\displaystyle \int _{\Omega }\psi _i\cdot \left( \psi _j\cdot \nabla \right) \psi _k\), \(1\le i,j,k\le n\); see [3] for the reduction of the nonlinear Navier-Stokes equations with an exact operator compression step. Other examples are found in structural dynamics with geometric nonlinearities, where order-2 and order-3 tensors can also be precomputed, see [58, Sect. 3.2] and [73].
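One practical way to build such a separated form, sketched below with NumPy, is a truncated SVD of samples of g on grids of x and \(\mu \) values; this is only a surrogate for the Empirical Interpolation Method of [14, 71], and all names are illustrative.

```python
import numpy as np

def separate_variables(g, xs, mus, d):
    """Low-rank separation g(x, mu) ~ sum_{k<d} g_a[:, k](x) * g_b[:, k](mu),
    built from pointwise samples of g."""
    G = np.array([[g(x, mu) for mu in mus] for x in xs])   # (n_x, n_mu) samples
    Ux, s, Vt = np.linalg.svd(G, full_matrices=False)
    g_a = Ux[:, :d] * s[:d]   # modes in x (scaled by the singular values)
    g_b = Vt[:d, :].T         # modes in mu
    return g_a, g_b           # G is approximated by g_a @ g_b.T
```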

When additional approximations are required, the methods proposed for the operator compression step are called “hyper-reduction” in the literature. This term was coined by the seminal method proposed in [88] in 2005, but has since been extended to refer to all the methods proposing such a second reduction stage. Hyper-reduction methods include the Empirical Interpolation Method (EIM, [14]), the Missing Point Estimation (MPE, [12]), the Best Point Interpolation Method (BPIM, [78]), the Discrete Empirical Interpolation Method (DEIM, [23]), the Gauss-Newton with Approximated Tensors (GNAT, [17]), the Energy-Conserving Sampling and Weighting (ECSW, [36]), the Empirical Cubature Method (ECM, [45]), and the Linear Program Empirical Quadrature Procedure (LPEQP, [100]). The reader can find an algorithmic comparison of the Hyper-Reduction and the Discrete Empirical Interpolation Method for a nonlinear thermal problem in [39]. A particular focus is given in the following sections to hyper-reduction techniques via oblique projection and empirical cubature.

2.3.4 Hyper-Reduction via a Reduced Integration Domain

Hyper-reduction via a reduced integration domain has been proposed in [88]. It requires a train set of displacement predictions, so that a reduced approximation vector space can be trained. The finite element simulations that generate the train set of displacement fields also predict stress fields \({\boldsymbol{\sigma }}\) and internal variables \(\textbf{y}\). Hence, additional reduced bases can be trained for these variables by using these simulation results [89]. Heuristically, we found it more accurate to include a reduced basis for the stresses \({\boldsymbol{\sigma }}\) as an additional reduced basis in this hyper-reduction scheme. Such a reduced basis is also very convenient for error estimation [91]. The finite element shape functions for displacement fields are denoted by \(({\boldsymbol{\varphi }}_i)_{i=1,\ldots , N}\). For stress fields, we also need to introduce a related finite element representation; we denote by \(({\boldsymbol{\varphi }}_i^\sigma )_{i=1,\ldots , N^\sigma }\) the dedicated shape functions. In the linear framework of manifold learning, we assume that the same finite element mesh is used for the target simulation, in the online step or for the test set of data, and for all the simulations used to generate the train set of data.

In practice, the implementation of the hyper-reduction follows the manifold learning step that trains reduced bases for displacements and stresses. We recall that they are denoted respectively by \(\textbf{V}\in \mathbb {R}^{N \times n}\) and \(\textbf{V}^\sigma \in \mathbb {R}^{N^\sigma \times n^\sigma }\) in their matrix form, and by \(({\boldsymbol{\psi }}_k)_{k=1,\ldots , n}\) and \(({\boldsymbol{\psi }}^\sigma _k)_{k=1,\ldots , n^\sigma }\) in their continuous form:

$$\begin{aligned} {\boldsymbol{\psi }}_k(\textbf{x}) = & {} \sum _{i=1}^N {\boldsymbol{\varphi }}_i(\textbf{x}) V_{ik}, \quad \forall \, \textbf{x}\in \Omega , \, k=1,\ldots , n, \end{aligned}$$
(2.17)
$$\begin{aligned} {\boldsymbol{\psi }}^\sigma _k(\textbf{x}) = & {} \sum _{i=1}^{N^\sigma } {\boldsymbol{\varphi }}^\sigma _i(\textbf{x}) V^\sigma _{ik}, \quad \forall \, \textbf{x}\in \Omega , \, k=1,\ldots , n^\sigma . \end{aligned}$$
(2.18)

The reduced displacement reads:

$$\begin{aligned} \widehat{\textbf{u}}(\textbf{x}) = \sum _{k=1}^n {\boldsymbol{\psi }}_k(\textbf{x}) \, \gamma _k,\, \forall \textbf{x}\in \Omega . \end{aligned}$$
(2.19)

The hyper-reduction method proposed in [88] aims at computing the reduced coordinates \((\gamma _k)_{k=1,\ldots , n}\) introduced in Eq. (2.19), by projecting the equilibrium equation on \(\textbf{V}\) via a restriction of the domain \(\Omega \) to a Reduced Integration Domain (RID) denoted by \(\Omega _R\). By following the empirical interpolation method [14], interpolation points are computed for the column vectors of \(\textbf{V}\) and \(\textbf{V}^\sigma \) separately [48]. The sets of respective interpolation points are denoted by \(\mathcal {P}^u\) and \(\mathcal {P}^\sigma \). We follow Algorithm 2.1, proposed for the Discrete Empirical Interpolation Method [23].

Algorithm 2.1
Interpolation points of the Discrete Empirical Interpolation Method (DEIM) [23]: the inputs are the reduced basis vectors and the outputs are the interpolation point index sets
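A standard NumPy sketch of the greedy selection of the DEIM interpolation points [23] for a basis matrix `V`, given here only for illustration (it is not the genericROM implementation), reads:

```python
import numpy as np

def deim_points(V):
    """Greedy selection of one interpolation point per column of V (DEIM)."""
    P = [int(np.argmax(np.abs(V[:, 0])))]            # largest entry of the first mode
    for k in range(1, V.shape[1]):
        # Interpolate the k-th mode with the k first modes at the current points
        coeffs = np.linalg.solve(V[P, :k], V[P, k])
        residual = V[:, k] - V[:, :k] @ coeffs
        P.append(int(np.argmax(np.abs(residual))))   # point of largest interpolation error
    return P
```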

The RID \(\Omega _R\) is such that it contains the interpolation points related to \(\textbf{V}\) and \(\textbf{V}^\sigma \). For engineering applications, the RID can also include a user-defined zone of interest in \(\Omega \), defined by a subset of finite elements. This zone of interest is denoted by \(\Omega _{ZI} \subset \Omega \). By construction, for contact-free problems, the RID is the following:

$$\begin{aligned} \Omega _R = \Omega _{ZI} \cup _{i \in \mathcal {P}^u} \hbox {supp}({\boldsymbol{\varphi }}_i) \cup _{i \in \mathcal {P}^\sigma } \hbox {supp}({\boldsymbol{\varphi }}_i^\sigma ), \end{aligned}$$
(2.20)

where \(\hbox {supp}(f) \subset \Omega \) is the support of the function f. In practice, \(\Omega _R\) has its own finite element mesh. It is a reduced mesh involving far fewer elements than the original finite element mesh of \(\Omega \). It can be enlarged by adding a layer of connected elements from the original mesh. The integration of the constitutive equations is performed on this reduced mesh, without any intrusive operation in the original finite element solver. A similar hyper-reduction scheme has been developed in [38] for contact problems. A schematic construction of the RID is sketched below.
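The sketch assumes that the mesh is described by element-to-dof connectivity lists for the displacement and stress representations; the names are hypothetical and the actual implementation may differ.

```python
def build_rid(elem_dofs_u, elem_dofs_sigma, P_u, P_sigma, elements_zi=()):
    """Reduced Integration Domain (Eq. 2.20): union of the zone of interest with the
    supports of the shape functions attached to the interpolation points.
    elem_dofs_u[e] / elem_dofs_sigma[e] list the dofs carried by element e."""
    P_u, P_sigma = set(P_u), set(P_sigma)
    rid = set(elements_zi)
    for e in range(len(elem_dofs_u)):
        if P_u & set(elem_dofs_u[e]) or P_sigma & set(elem_dofs_sigma[e]):
            rid.add(e)
    return sorted(rid)
```

Enlarging the RID by one layer of elements then amounts to adding every element sharing a node with the selected ones.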

Once the RID is obtained, a set of test functions, denoted by \({\boldsymbol{\psi }}_{R \, j}\), is set up in order to restrict the balance equations to \(\Omega _R\):

$$\begin{aligned} \mathcal {P} = & {} \left\{ i \in \{1, \ldots , N \}, \, \int _{\Omega \backslash \Omega _R} ({\boldsymbol{\varphi }}_i)^2 \, d\Omega =0 \right\} , \end{aligned}$$
(2.21)
$$\begin{aligned} {\boldsymbol{\psi }}_{R \, j}(\textbf{x}) = & {} \sum _{i\in \mathcal {P}} {\boldsymbol{\varphi }}_i(\textbf{x}) \, V_{ij}, \quad \forall \textbf{x}\in \Omega , \, j=1,\ldots , n, \end{aligned}$$
(2.22)

where \(\mathcal {P}\) is the set of all degrees of freedom in \(\Omega _R\) except those belonging to the interface between \(\Omega _R\) and its counterpart. This interface is denoted by \(\mathcal {I}_R\). As explained in [94], the test functions vanish on the interface \(\mathcal {I}_R\), as if Dirichlet boundary conditions were imposed on the RID. On this interface, the displacements follow the shape of the modes \({\boldsymbol{\psi }}_k\) according to Eq. (2.19). The hyper-reduction method gives access to reduced coordinates \((\gamma _k)_{k=1,\ldots , n}\) that fulfill the following balance equations, for contactless problems:

$$\begin{aligned} {} & {} \widehat{\textbf{u}}(\textbf{x}) = \sum _{k=1}^n {\boldsymbol{\psi }}_k(\textbf{x}) \, \gamma _k, \, \forall \textbf{x}\in \Omega _R \end{aligned}$$
(2.23)
$$\begin{aligned} {} & {} \int _{\Omega _R} {\boldsymbol{\varepsilon }}( {\boldsymbol{\psi }}_{R \, j} ) \, : \, {\boldsymbol{\sigma }}( {\boldsymbol{\varepsilon }}( \widehat{\textbf{u}} ) ) \, d\Omega \end{aligned}$$
(2.24)
$$\begin{aligned} {} & {} - \int _{\Omega _R} {\boldsymbol{\psi }}_{R \, j} \, f_\mu \, d\Omega - \int _{\partial \Omega _R \cap \partial \Omega _N} {\boldsymbol{\psi }}_{R \, j} \, T_{\mu ,N} \, dS = 0.\end{aligned}$$
(2.25)
$$\begin{aligned} {} & {} \forall j=1,\ldots , n \end{aligned}$$
(2.26)

The matrix form of the hyper-reduced balance equations reads: find \({\boldsymbol{\gamma }}\in \mathbb {R}^n\) such that

$$\begin{aligned} \widehat{\textbf{u}}(\textbf{x}) = & {} \sum _{i=1}^{N} {\boldsymbol{\varphi }}_i(\textbf{x}) \, \widehat{q}_i , \, \forall \textbf{x}\in \Omega _R, \end{aligned}$$
(2.27)
$$\begin{aligned} \widehat{\textbf{q}} = & {} \textbf{V}\, {\boldsymbol{\gamma }}, \end{aligned}$$
(2.28)
$$\begin{aligned} \mathcal {F}^{HR}( {\boldsymbol{\gamma }}) := & {} \textbf{V}[\mathcal {P},:]^T \, \mathcal {F}(\textbf{V}\, {\boldsymbol{\gamma }})[\mathcal {P}], \end{aligned}$$
(2.29)
$$\begin{aligned} \mathcal {F}^{HR}( {\boldsymbol{\gamma }}) = & {} 0, \end{aligned}$$
(2.30)

where \(\textbf{V}[\mathcal {P},:]\) denotes a row restriction of matrix \(\textbf{V}\) to indices in \(\mathcal {P}\). The reduced Newton-Raphson step reads:

$$\begin{aligned} \widehat{\textbf{u}}^k(\textbf{x}) = & {} \sum _{i=1}^{N} {\boldsymbol{\varphi }}_i(\textbf{x}) \, \widehat{q}^k_i , \, \forall \textbf{x}\in \Omega _R, \end{aligned}$$
(2.31)
$$\begin{aligned} \widehat{\textbf{q}}^k = & {} \textbf{V}\, {\boldsymbol{\gamma }}^k, \end{aligned}$$
(2.32)
$$\begin{aligned} \textbf{K}^{HR} := & {} \textbf{V}[\mathcal {P},:]^T \, \frac{D \mathcal {F}_\mu }{D \widehat{\textbf{u}}}(\widehat{\textbf{u}}^{k-1})[\mathcal {P},:] \, \textbf{V}, \end{aligned}$$
(2.33)
$$\begin{aligned} \textbf{K}^{HR} \, ({\boldsymbol{\gamma }}^k - {\boldsymbol{\gamma }}^{k-1}) = & {} - \textbf{V}[\mathcal {P},:]^T \, \mathcal {F}(\widehat{\textbf{u}}^{k-1})[\mathcal {P}], \end{aligned}$$
(2.34)

where the reduced stiffness matrix \(\textbf{K}^{HR}\) is computed by using solely the elements of the RID \(\Omega _R\). We assume that the matrix \(\textbf{K}^{HR}\) is full rank. This assumption is always checked in the numerical solution of the hyper-reduced equations. Rank deficiency may appear when the RID construction does not account for the contribution of a reduced basis dedicated to stresses. One hyper-reduced Newton iteration is sketched below.
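The sketch assumes routines `assemble_residual` and `assemble_tangent` that integrate over the RID elements only and return \(\mathcal {F}[\mathcal {P}]\) and \((D\mathcal {F}/D\textbf{u})[\mathcal {P},:]\); these names are illustrative, not part of an existing library.

```python
import numpy as np

def hyper_reduced_newton_step(gamma, V, P, assemble_residual, assemble_tangent):
    """One Newton-Raphson update of the reduced coordinates gamma (Eq. 2.34)."""
    q = V @ gamma                          # displacement dofs, Eq. (2.32)
    F_P = assemble_residual(q)             # residual restricted to the rows P
    K_P = assemble_tangent(q)              # tangent matrix restricted to the rows P
    K_hr = V[P, :].T @ (K_P @ V)           # reduced stiffness, Eq. (2.33)
    delta = np.linalg.solve(K_hr, -V[P, :].T @ F_P)
    return gamma + delta
```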

Once the RID is represented as a finite element mesh, this hyper-reduction scheme is intrusive solely for the linear solver involved in the Newton-Raphson step and its related convergence criterion. Nevertheless, the mesh of the RID has to include labels for the set \(\mathcal {P}\) or its counterpart \(\mathcal {I}_R\). This counterpart is the set of degrees of freedom connected to elements of the original mesh that are not in the reduced mesh.

Remarks:

  • Here the most complex operations are indeed the computation of \(\textbf{K}^{HR}\) and the solution of the reduced linear system of equations. They respectively scale linearly with \(\hbox {card}(\mathcal {P}) \, n^2\) and \(n^3\). Hence \(n^3\) has to be small enough compared to N if we consider the computational complexity for the solution of sparse linear systems in the finite element method.

  • Because of the spreading nature of interpolation points, most of the time, the RID is not a compact subdomain.

  • The hyper-reduced order model is a kind of submodel where the displacements at the interface \(\mathcal {I}_R\) follow the shape of the modes \({\boldsymbol{\psi }}_k\) according to Eq. (2.19).

  • Finite element corrections for displacements and stresses can easily be computed over the RID once the reduced prediction has been achieved. This scheme is termed Hybrid Hyper-Reduction in [46].

  • A parallel programming of the hyper-reduction method has been proposed in [95].

  • Reduced order models not only save computational time, they also save computational resources, including energy consumption, as explained in [90], and memory footprint [46].

Property 1: In linear elasticity, if \(\textbf{K}^{HR}\) is full rank, the hyper-reduced balance equations are equivalent to an oblique projection of the finite element prediction \(\textbf{q}\in \mathbb {R}^N\):

$$\begin{aligned} {\boldsymbol{\Pi }}^T:= & {} \textbf{V}[\mathcal {P},:]^T \, \textbf{K}[\mathcal {P},:] , \end{aligned}$$
(2.35)
$$\begin{aligned} \widehat{\textbf{q}} = & {} \textbf{V}\, ({\boldsymbol{\Pi }}^T \, \textbf{V})^{-1} {\boldsymbol{\Pi }}^T \, \textbf{q},\end{aligned}$$
(2.36)
$$\begin{aligned} \hbox {and} \quad {\boldsymbol{\Pi }}^T \, \widehat{\textbf{q}} = & {} {\boldsymbol{\Pi }}^T \, \textbf{q}, \end{aligned}$$
(2.37)

with \(\textbf{K}\, \textbf{q}= \textbf{F}\). Hence the hyper-reduced prediction of the reduced vector \({\boldsymbol{\gamma }}\) is a minimizer of the function f defined by:

$$\begin{aligned} {\boldsymbol{\gamma }}^\star \in \mathbb {R}^n, \, f({\boldsymbol{\gamma }}^\star ) = \Vert {\boldsymbol{\Pi }}^T \, \left( \textbf{V}\, {\boldsymbol{\gamma }}^\star - \textbf{q}\right) \Vert _2^2. \end{aligned}$$
(2.38)

Here \({\boldsymbol{\Pi }}\) is a projector for elastic stresses in \(\Omega _R\) according to the reduced test functions:

$$\begin{aligned} \sum _{i=1}^{N} \Pi _{ik} \, \left( \textbf{V}\, {\boldsymbol{\gamma }}- \textbf{q}\right) _i = \int _{\Omega _R} {\boldsymbol{\varepsilon }}( {\boldsymbol{\psi }}_{R \, k} ) \, : \, \left( {\boldsymbol{\sigma }}(\widehat{\textbf{q}}) - {\boldsymbol{\sigma }}(\textbf{q}) \right) \, d\Omega . \end{aligned}$$
(2.39)

The proof is straightforward. Here, \(\textbf{K}^{HR}={\boldsymbol{\Pi }}^T \, \textbf{V}\). The Jacobian matrix for f reads \(\textbf{J}= \textbf{V}^T \, {\boldsymbol{\Pi }}\, {\boldsymbol{\Pi }}^T \, \textbf{V}= (\textbf{K}^{HR})^T \, \textbf{K}^{HR}\). If \(\textbf{K}^{HR}\) is full rank, then \(\textbf{J}\) is symmetric positive definite and \(\textbf{J}^{-1} = (\textbf{K}^{HR})^{-1} \, (\textbf{K}^{HR})^{ -T}\). Then, both the minimization problem and the hyper-reduced equation have a unique solution. The solution of the minimization problem is:

$$\begin{aligned} \textbf{q}^f = & {} \textbf{V}\, \textbf{J}^{-1} \textbf{V}^T \, {\boldsymbol{\Pi }}\, {\boldsymbol{\Pi }}^T \, \textbf{q},\end{aligned}$$
(2.40)
$$\begin{aligned} = & {} \textbf{V}\, (\textbf{K}^{HR})^{ -1} \, {\boldsymbol{\Pi }}^T \, \textbf{q}, \end{aligned}$$
(2.41)
$$\begin{aligned} = & {} \textbf{V}\, (\textbf{K}^{HR})^{ -1} \, \textbf{V}[\mathcal {P},:]^T \, \textbf{K}[\mathcal {P},:] \, \textbf{q},\end{aligned}$$
(2.42)
$$\begin{aligned} = & {} \textbf{V}\, (\textbf{K}^{HR})^{ -1} \, \textbf{V}[\mathcal {P},:]^T \, \textbf{F}[\mathcal {P}], \end{aligned}$$
(2.43)
$$\begin{aligned} = & {} \widehat{\textbf{q}}. \end{aligned}$$
(2.44)

As an intermediate result, Eq. (2.42) is the oblique projection.

In linear elasticity, Céa’s lemma holds. Let us denote by \(\textbf{q}^\circ \in \mathbb {R}^N\) the minimizer of the upper bound in Eq. (2.3) related to this lemma:

$$\begin{aligned} \textbf{q}^\circ = & {} \arg \min _{\textbf{q}^\star \in \mathbb {R}^N} \Vert \widetilde{u}- \sum _{i=1}^{N} {\boldsymbol{\varphi }}_i q_i^\star \Vert , \end{aligned}$$
(2.45)
$$\begin{aligned} \widetilde{v}^\circ = & {} \sum _{i=1}^{N} {\boldsymbol{\varphi }}_i q_i^\circ . \end{aligned}$$
(2.46)

The best projection of the minimizer \(\textbf{q}^\circ \) onto the approximation space is denoted by \({\boldsymbol{\gamma }}^P\):

$$\begin{aligned} {\boldsymbol{\gamma }}^P = & {} \hbox {argmin}_{\textbf{g}\in \mathbb {R}^n} ( \textbf{q}^\circ - \textbf{V}\, \textbf{g})^T \textbf{M}( \textbf{q}^\circ - \textbf{V}\, \textbf{g}), \end{aligned}$$
(2.47)
$$\begin{aligned} \widetilde{v}^P = & {} \sum _{i=1}^{N} {\boldsymbol{\varphi }}_i (\textbf{V}\, \gamma ^P)_i. \end{aligned}$$
(2.48)

Let us introduce an ideal reduced basis \(\textbf{V}^\circ \in \mathbb {R}^{N \times n}\) (this assumes that n is an ideal reduced dimension) such that \(\textbf{q}^\circ = \textbf{V}^\circ \, {\boldsymbol{\gamma }}^\circ \) and \(\textbf{V}^{\circ T} \textbf{M}\textbf{V}^\circ = \textbf{I}\), where \(M_{ij} = \langle {\boldsymbol{\varphi }}_i, {\boldsymbol{\varphi }}_j \rangle \). Hence \({\boldsymbol{\gamma }}^P = \textbf{V}^T \textbf{M}\textbf{V}^\circ \, {\boldsymbol{\gamma }}^\circ \).

Property 2: In linear elasticity, the upper bound of approximation error is increased by a Chordal distance [101] between \(\textbf{V}\) and the ideal reduced basis \(\textbf{V}^\circ \):

$$\begin{aligned} \Vert \widetilde{u}- \widetilde{v}^\circ \Vert \le \, \Vert \widetilde{u}- \widetilde{v}^P\Vert + \Vert {\boldsymbol{\gamma }}^\circ \Vert _2 \, d^{Ch}(\textbf{V}^\circ , \textbf{V}), \end{aligned}$$
(2.49)

where \(d^{Ch}(\textbf{V}^\circ , \textbf{V})\) is the Chordal distance between \(\textbf{V}^\circ \) and \(\textbf{V}\).

Hence, the smaller the Chordal distance between the subspaces spanned by \(\textbf{V}\) and \(\textbf{V}^\circ \), the better the reduced prediction obtained by using a Galerkin projection (when the RID covers the full domain). A certification of the reduced prediction can be achieved, when all errors admit an upper bound, by following the constitutive relation error approach proposed in [47, 55].

The Chordal distance uses the principal angles \({\boldsymbol{\theta }}\in \mathbb {R}^n\), \(\theta _k \in [0,\pi /2[\) for \(k=1,\ldots , n\), computed via a full singular value decomposition:

$$\begin{aligned} \textbf{V}^T \, \textbf{M}\, \textbf{V}^\circ = & {} \textbf{U}\, \cos ({\boldsymbol{\theta }}) \textbf{U}^{\circ T}, \quad \, \textbf{U}^T\textbf{U}= \textbf{U}^{\circ T} \textbf{U}^\circ = \textbf{I},\end{aligned}$$
(2.50)
$$\begin{aligned} d^{Ch}(\textbf{V}^\circ ,\textbf{V}) = & {} \Vert \hbox {sin}({\boldsymbol{\theta }}) \Vert _F, \end{aligned}$$
(2.51)
$$\begin{aligned} \Vert \textbf{U}^\circ \Vert _F^2 = & {} n, \end{aligned}$$
(2.52)

where \(\Vert \cdot \Vert _F\) is the Frobenius norm. Here, \(\cos ({\boldsymbol{\theta }})\) and \(\sin ({\boldsymbol{\theta }})\) are cosine and sine diagonal matrices. In addition the following property holds when a full SVD is computed:

$$\begin{aligned} \textbf{U}^\circ \, \textbf{U}^{\circ T} = \textbf{I}, \quad \textbf{U}\, \textbf{U}^{T} = \textbf{I}. \end{aligned}$$
(2.53)

The proof of the previous property is straightforward by using the triangle inequality. We just need to prove that:

$$\begin{aligned}\Vert \widetilde{v}^\circ - \widetilde{v}^P \Vert \le \Vert {\boldsymbol{\gamma }}^\circ \Vert _2 \, d^{Ch}(\textbf{V}^\circ , \textbf{V}).\end{aligned}$$

Hence, the proof is the following:

$$\begin{aligned} \Vert \widetilde{v}^\circ - \widetilde{v}^P \Vert ^2 = & {} {\boldsymbol{\gamma }}^{\circ T} \, (\textbf{V}^\circ - \textbf{V}\, \textbf{V}^T \textbf{M}\textbf{V}^\circ )^T \, \textbf{M}\, (\textbf{V}^\circ - \textbf{V}\, \textbf{V}^T \textbf{M}\textbf{V}^\circ ) \, {\boldsymbol{\gamma }}^\circ \nonumber \\ = & {} {\boldsymbol{\gamma }}^{\circ T} \, (\textbf{I}- \textbf{V}^{\circ T} \textbf{M}\textbf{V}\, \textbf{V}^T \textbf{M}\textbf{V}^\circ ) \, {\boldsymbol{\gamma }}^\circ . \end{aligned}$$
(2.54)

Therefore

$$\begin{aligned} \Vert \widetilde{v}^\circ - \widetilde{v}^P \Vert ^2 = & {} {\boldsymbol{\gamma }}^{\circ T} \, (\textbf{I}- \textbf{U}^{\circ } \, \hbox {cos}({\boldsymbol{\theta }})^2 \, \textbf{U}^{\circ T}) \, {\boldsymbol{\gamma }}^\circ \end{aligned}$$
(2.55)
$$\begin{aligned} = & {} {\boldsymbol{\gamma }}^{\circ T} \, \textbf{U}^\circ (\textbf{I}- \hbox {cos}({\boldsymbol{\theta }})^2 ) \, \textbf{U}^{\circ T} \, {\boldsymbol{\gamma }}^\circ \end{aligned}$$
(2.56)
$$\begin{aligned} = & {} {\boldsymbol{\gamma }}^{\circ T} \, \textbf{U}^\circ \hbox {sin}({\boldsymbol{\theta }})^2 \, \textbf{U}^{\circ T} \, {\boldsymbol{\gamma }}^\circ \end{aligned}$$
(2.57)
$$\begin{aligned} = & {} \Vert \hbox {sin}({\boldsymbol{\theta }}) \, \textbf{U}^{\circ T} \, {\boldsymbol{\gamma }}^\circ \Vert _2^2. \end{aligned}$$
(2.58)

For all matrices \(\textbf{A}\in \mathbb {R}^{n \times m}\) and \(\textbf{B}\in \mathbb {R}^{m \times n}\) the following property holds:

$$\begin{aligned}\Vert \textbf{A}\textbf{B}\Vert _F \le \Vert \textbf{A}\Vert _F \, \Vert \textbf{B}\Vert _F,\end{aligned}$$

and for \(\textbf{a}\in \mathbb {R}^{n}\): \(\Vert \textbf{a}\Vert _F = \Vert \textbf{a}\Vert _2\).

Thus:

$$\begin{aligned} \Vert \widetilde{v}^\circ - \widetilde{v}^P \Vert ^2 \le \Vert \hbox {sin}({\boldsymbol{\theta }}) \Vert _F^2 \, \Vert \textbf{U}^{\circ T} \, {\boldsymbol{\gamma }}^\circ \Vert _2^2 \le \Vert \hbox {sin}({\boldsymbol{\theta }}) \Vert _F^2 \, \Vert {\boldsymbol{\gamma }}^\circ \Vert _2^2. \end{aligned}$$
(2.59)
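For completeness, a small NumPy sketch of Eqs. (2.50)–(2.51), computing the principal angles and the Chordal distance between the two bases, both assumed M-orthonormal; the names are illustrative.

```python
import numpy as np

def chordal_distance(V, V_circ, M):
    """Chordal distance between span(V) and span(V_circ), both M-orthonormal."""
    cos_theta = np.linalg.svd(V.T @ (M @ V_circ), compute_uv=False)
    cos_theta = np.clip(cos_theta, 0.0, 1.0)              # guard against roundoff
    return np.linalg.norm(np.sqrt(1.0 - cos_theta**2))    # ||sin(theta)||_F
```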

Property 3: Substituting the identity matrix for \(\textbf{K}\) in Eqs. (2.35) and (2.36) yields what is known as the Gappy POD reconstruction [35] of the truncated variables \(\textbf{q}[\mathcal {P}]\). The reconstructed vector in \(\mathbb {R}^N\) is:

$$\begin{aligned} \widetilde{\textbf{q}} = \textbf{V}\, (\textbf{V}[\mathcal {P},:]^T \, \textbf{V}[\mathcal {P},:])^{-1} \textbf{V}[\mathcal {P},:]^T \, \textbf{q}[\mathcal {P}]. \end{aligned}$$
(2.60)

This Gappy POD reconstruction is useless for displacement variables, because the oblique projection in Eq. (2.36) is a direct outcome of the hyper-reduced prediction. But such a reconstruction is very convenient for the stress variables, which the hyper-reduced scheme forecasts only on \(\Omega _R\). The reconstructed stress variables read:

$$\begin{aligned} \widetilde{\textbf{q}}^\sigma = \textbf{V}^\sigma \, (\textbf{V}^\sigma [\overline{\mathcal {P}}^\sigma ,:]^T \, \textbf{V}^\sigma [\overline{\mathcal {P}}^\sigma ,:])^{-1} \textbf{V}^\sigma [\overline{\mathcal {P}}^\sigma ,:]^T \, \textbf{q}^\sigma [\overline{\mathcal {P}}^\sigma ], \end{aligned}$$
(2.61)

where \(\overline{\mathcal {P}}^\sigma \) is the set of all stress indices available in \(\Omega _R\). Since the RID contains the interpolation points for \(\textbf{V}^\sigma \), these points are included in \(\overline{\mathcal {P}}^\sigma \); therefore the truncated matrix \(\textbf{V}^\sigma [\overline{\mathcal {P}}^\sigma ,:]\) has full column rank and the reconstruction is a well-posed problem.
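Eq. (2.61) amounts to a least-squares fit of the reduced stress coordinates on the entries available in the RID; a NumPy sketch with illustrative names:

```python
import numpy as np

def gappy_pod_reconstruction(V_sigma, rid_rows, q_sigma_rid):
    """Reconstruct the full stress vector from its values on the RID (Eq. 2.61).
    rid_rows: indices of the stress dofs available in Omega_R;
    q_sigma_rid: hyper-reduced stress prediction at those dofs."""
    coeffs, *_ = np.linalg.lstsq(V_sigma[rid_rows, :], q_sigma_rid, rcond=None)
    return V_sigma @ coeffs
```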

Remark about the RID construction and the DEIM: If the RID contains solely the elements connected to interpolation points related to the reduced basis \(\textbf{V}\), such that \(\mathcal {P} = \mathcal {P}^u\), then the Gappy POD gives the interpolation scheme of the DEIM:

$$\begin{aligned} \widetilde{\textbf{q}}^{DEIM} = \textbf{V}\, (\textbf{V}[\mathcal {P}^u,:])^{-1} \, \textbf{q}[\mathcal {P}^u]. \end{aligned}$$
(2.62)

However, when considering the hyper-reduction scheme, one can observe overfitting, in the sense that the train set of displacements is very well approximated by the DEIM reconstruction while the hyper-reduced predictions are not accurate. For this reason, we recommend the use of the additional reduced basis \(\textbf{V}^\sigma \) and the related interpolation points.

Various applications of the hyper-reduction method using a RID have been developed for:

  • thermal problems in structures or solids, in [88],

  • boundary element models [93],

  • reduced simulations of sintering processes [99],

  • ductile damage predictions, including unstable localisation of strains [97],

  • reduction of multidimensional domains, when the space variables belong to a Euclidean space of arbitrary dimension \(D > 3\) [98],

  • simulation of viscoelastic-viscoplastic composite materials [74],

  • model calibration in plasticity of materials [37, 46, 96],

  • contact problems using Lagrange multipliers [38, 62],

  • arc length algorithm for buckling problems or strain localisation [59],

  • micromorphic continua including higher order stress fields [48].

2.3.5 Hyper-Reduction via Empirical Cubature

To assemble the linearized equations of the reduced Newton algorithm (2.13) when using the ROM in the online phase, hyper-reduction techniques via empirical cubature aim to compute the costly integrals over the high-fidelity domain by replacing the high-dimensional quadrature formula by a low-dimensional reduced quadrature with positive weights. The ECSW [36], ECM [45] and LPEQP [100] methods implement such reduced quadratures. In this section, we present the ECM; more details are available in [19].

We consider the high-fidelity model described in Sect. 2.2. The integrals involved in the assembling of the linearized Eq. (2.7) make use of high-fidelity quadrature formulas. Applying such a quadrature to the reduced internal forces vector gives:

$$\begin{aligned} \begin{aligned} \hat{F}^\textrm{int}_i(t)&:=\int _\Omega \sigma \left( \epsilon (\hat{u}),y\right) (x,t):\epsilon \left( \psi _i\right) (x)\\ &=\sum _{e\in E}\sum _{k=1}^{n_e}\omega _k\sigma \left( \epsilon (\hat{u}),y\right) (x_k,t):\epsilon \left( \psi _i\right) (x_{k}), 1\le i\le n, \end{aligned} \end{aligned}$$
(2.63)

where E denotes the set of elements of the mesh, \(n_e\) the number of quadrature points of element e, and \(\omega _k\) and \(x_k\) the quadrature weights and points associated with e. The total number of quadrature points is denoted \(N_G\).

The ECM aims to approximate the high-fidelity quadrature by a reduced quadrature with positive weights, which, when applied to the reduced internal forces vector, writes

$$\begin{aligned} \hat{F}^\textrm{int}_i(t)\approx \sum _{k'=1}^{n_G}\hat{\omega }_{k'}\sigma \left( \epsilon (\hat{u}),y\right) (\hat{x}_{k'},t):\epsilon \left( \psi _i\right) (\hat{x}_{k'}), 1\le i\le n, \end{aligned}$$
(2.64)

where \(\hat{\omega }_{k'}>0\) and \(\hat{x}_{k'}\) are respectively the reduced quadrature weights and points, and \(n_G\ll N_G\) is the length of the reduced quadrature.

Denote \(f_q:=\sigma \left( \epsilon (u_{(q//n)+1}),y\right) :\epsilon \left( \psi _{(q\%n)+1}\right) \), \(1\le q\le nN_c\), where // and \(\%\) are the quotient and the remainder of the Euclidean division, and \(N_c\) denotes the number of snapshots. Denote as well by \(\mathcal {Z}^{n_G}\) a subset of \([1;N_G]\) of size \(n_G\), and \(J_{\mathcal {Z}^{n_G}}\in \mathbb {R}^{nN_c\times n_G}\) and \(\boldsymbol{b}\in \mathbb {R}^{nN_c}\) such that for all \(1\le q\le nN_c\) and all \(1\le q'\le n_G\),

$$\begin{aligned} J_{\mathcal {Z}^{n_G}} = \Bigg (f_q(x_{\mathcal {Z}^{n_G}_{q'}})\Bigg )_{1\le q\le nN_c,~q'\in \mathcal {Z}^{n_G}},\qquad \boldsymbol{b} = \left( \int _{\Omega }f_q\right) _{1\le q\le nN_c}, \end{aligned}$$
(2.65)

where \(\mathcal {Z}^{n_G}_{q'}\) denotes the \(q'\)-th element of \(\mathcal {Z}^{n_G}\). We recall that n is the number of POD modes, see Sect. 2.3.3.2. For \(\hat{\boldsymbol{\omega }}\in ({\mathbb {R}^{+}})^{n_G}\), \(\displaystyle \left( J_{\mathcal {Z}^{n_G}}\hat{\boldsymbol{\omega }}\right) _q=\sum _{q'=1}^{n_G}\hat{\omega }_{q'}\sigma \left( \epsilon (u_{{(q//n)+1}}),y\right) (x_{\mathcal {Z}^{n_G}_{q'}}):\epsilon \left( \psi _{(q\%n)+1}\right) (x_{\mathcal {Z}^{n_G}_{q'}})\), \(1\le q\le nN_c\), is a candidate approximation of \(\displaystyle \int _{\Omega }\sigma \left( \epsilon (u_{{(q//n)+1}}),y\right) :\epsilon \left( \psi _{(q\%n)+1}\right) = b_q\), \(1\le q\le nN_c\). The problem of finding the most accurate reduced quadrature formula of length \(n_G\) for the reduced internal forces vector is:

$$\begin{aligned} \left( \hat{\omega },\mathcal {Z}^{n_G}\right) =\arg \underset{\hat{\omega }'\in {\mathbb {R}^{+}}^{n_G},\mathcal {Z}'^{n_G}\subset [1;N_G]}{\min }\left\| J_{\mathcal {Z}'^{n_G}}\hat{\omega }'-b\right\| , \end{aligned}$$
(2.66)

where \(\left\| \cdot \right\| \) denotes the Euclidean norm. Minimizing the length of the reduced quadrature formula as well leads to an NP-hard problem, whose solution can be approximated using a Nonnegative Orthogonal Matching Pursuit algorithm, see Algorithm 2.2.

Algorithm 2.2
Nonnegative Orthogonal Matching Pursuit: the inputs are J, b and a tolerance; the outputs are the reduced weights and the selected quadrature points

In Algorithm 2.2, \(J_{[1;N_G]}\) satisfies the definition (2.65) with \(\mathcal {Z}^{n_G}=[1;N_G]\). The positivity of the weights of the reduced quadrature preserves the spectral properties of the operator associated with the high-fidelity problem, see [19, Remark 1].
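A compact sketch of a Nonnegative Orthogonal Matching Pursuit in the spirit of Algorithm 2.2 is given below; it relies on SciPy's nonnegative least-squares solver for the inner problems and only illustrates the greedy strategy, without claiming to reproduce the exact listing of Algorithm 2.2.

```python
import numpy as np
from scipy.optimize import nnls

def nn_omp(J, b, tol):
    """Select quadrature points (columns of J) and positive weights w
    such that ||J[:, Z] @ w - b|| <= tol * ||b|| (greedy strategy)."""
    residual, Z, w = b.copy(), [], np.zeros(0)
    while np.linalg.norm(residual) > tol * np.linalg.norm(b):
        # Candidate column best correlated with the current residual
        candidates = np.setdiff1d(np.arange(J.shape[1]), Z)
        Z.append(int(candidates[np.argmax(J[:, candidates].T @ residual)]))
        # Nonnegative least squares on the selected columns
        w, _ = nnls(J[:, Z], b)
        residual = b - J[:, Z] @ w
    return np.array(Z), w
```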

2.3.6 Computational Complexity

In this section, we restrict our attention to elliptic problems or to linearized problems. The bilinear part of the weak form, for finite-dimensional solution spaces, is a matrix. When using a finite element solution space, this matrix is sparse; when using a reduced solution space, this matrix is usually full. Therefore, the computational complexity of the finite element prediction is the complexity of the solution of a sparse linear system: it scales linearly with \(N \, \omega ^2\), where \(\omega \) is the bandwidth of the sparse matrix. For the reduced prediction, the solution of a full linear system scales linearly with \(n^3\). We recommend restricting linear model reduction schemes, with or without hyper-reduction, to reduced dimensions n lower than \(N^{1/3}\), otherwise the solution of the reduced equations has a computational complexity similar to that of the finite element model; for instance, \(N = 10^6\) degrees of freedom suggests \(n \lesssim 100\). This recommendation does not concern explicit solvers.

2.4 Nonlinear Manifold Learning for Projection-Based Reduced-Order Modeling

Consider a parametrized variability and a set of snapshots generated using the high-fidelity model over a sampling of the parameter domain. The parametrized problem is said to be nonreducible when applying a linear data compression to this set of snapshots leads to a ROB containing too many vectors for the online problem to feature an interesting speedup. Formally, this happens when the Kolmogorov n-width \(d_{n}(\mathcal {M})\) decreases too slowly with respect to n, where we recall that n is the cardinality of the ROB,

$$\begin{aligned} d_{n}(\mathcal {M}) := \underset{\mathcal {H}_{n} \in \text {Gr}(n,\mathcal {H})}{\inf } \ \underset{u \in \mathcal {M}}{\sup } \ \underset{v \in \mathcal {H}_n}{\inf } || u - v ||_{\mathcal {H}}, \end{aligned}$$
(2.67)

with the Grassmannian \(\text {Gr}(n,\mathcal {H})\) being the set of all n-dimensional subspaces of \(\mathcal {H}\) and \(\mathcal {H}_{n} \in \text {Gr}(n,\mathcal {H})\) the subspace spanned by the considered ROB. Qualitatively, the solution manifold \(\mathcal {M}\) covers too many independent directions to be embedded in a low-dimensional subspace. To address this issue, several techniques have been developed:

  • Problem-specific methods tackle the difficulties of some specific physics problems that are known to be nonreducible, such as advection-dominated problems which have been largely investigated, for instance in [16, 49, 85].

  • Online-adaptive model reduction methods update the ROM in the exploitation phase by collecting new information online as explained in [102], in order to limit extrapolation errors when solving the parametrized governing equations in a region of the parameter space that was not explored in the training phase. The ROM can be updated for example by querying the high-fidelity model when necessary for basis enrichment [18, 44, 56, 80, 88].

  • ROM interpolation methods [6,7,8,9, 24, 64,65,66,67,68, 75, 76] use interpolation techniques on Grassmann manifolds or matrix manifolds to adapt the ROM to the parameters considered in the exploitation phase by interpolating between two precomputed ROMs.

  • Dictionaries of basis vector candidates enable building a parameter-adapted ROM in the exploitation phase by selecting a few basis vectors. This technique is presented in [54, 72] for the Reduced Basis method.

  • Nonlinear manifold ROM methods [57, 63] learn a nonlinear embedding and project the governing equations onto the corresponding approximation manifold, by means of a nonlinear function mapping a low-dimensional latent space to the solution space.

  • Dictionaries of ROMs rely on the construction of several local ROMs adapted to different regions of the solution manifold. These local ROMs can be obtained by partitioning the time interval [32, 33], the parameter space [33, 34, 42, 44, 51, 52, 82], or the solution space [10, 11, 27, 40, 77, 82, 92].

In the following Sects. 2.4.1 and 2.4.2, we provide more details on the last two entries of the previous list.
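In practice, the Kolmogorov n-width is not directly computable, but the decay of the singular values of a snapshot matrix provides a commonly used proxy for it. The sketch below counts how many POD modes are needed to reach a given relative accuracy; the snapshot matrix Q is an assumed placeholder, and a large mode count is the practical symptom of a nonreducible problem.

```python
import numpy as np

def pod_energy_dimension(Q, tol=1e-4):
    """Number of POD modes needed so that the truncated SVD of the snapshot
    matrix Q (size N x m, one snapshot per column) reaches a relative
    projection error below tol. Slow decay of the singular values is a
    practical symptom of a large Kolmogorov n-width."""
    sigma = np.linalg.svd(Q, compute_uv=False)
    energy = np.cumsum(sigma**2) / np.sum(sigma**2)
    return int(np.searchsorted(energy, 1.0 - tol**2) + 1)

# Q would be filled with high-fidelity snapshots u(mu_1), ..., u(mu_m).
Q = np.random.rand(10_000, 200)           # placeholder snapshot matrix
print(pod_energy_dimension(Q, tol=1e-3))  # large value -> nonreducible problem
```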

2.4.1 Nonlinear Dimensionality Reduction via Auto-Encoder

Nonlinear manifold learning means that the solution manifold is approximated by a domain in the ambient solution space that is not included in a low-dimensional vector subspace, as illustrated in Fig. 2.3.

Fig. 2.3
Nonlinear manifold learning: the variability of \(\mu \) in the parameter space has a large influence on \(u\), yielding a large solution manifold \(u(\mu )\) in the solution space

Let us consider a formal representation of parabolic and nonlinear Partial Differential Equations (PDEs) that are parameterized with respect to some physical parameters of interest. We mean by physical parameters the parameters that appear directly in the equations, such as the boundary conditions, the viscosity in fluid mechanics (and hence the Reynolds number), or the time step for dynamical systems of fluid flows or infectious diseases. These parameters are denoted \(\mu \), without loss of generality, as introduced in the preceding section. The formal representation of the equations is given as follows:

$$\begin{aligned} \frac{\partial \widetilde{u}}{\partial t} = f(\widetilde{u}, \mu ). \end{aligned}$$
(2.68)

Reduced order modeling based on nonlinear data compression techniques might be a solution, for example, when the physical fields described by the model equations require a large number of vectors in the ROB, as specified above. Nevertheless, there are cases where, even if the physical solution fields are completely reducible, the Galerkin projection may not be appropriate for the model equations describing this physics.

Convection-diffusion PDEs exhibit this stability issue even more generally when considering a Galerkin projection using finite element basis functions. In the literature, it has been shown that the coherent structures of a turbulent, unsteady and incompressible fluid flow can be reproduced by a small number of POD basis functions. However, if these functions are used to solve the Newton-Raphson problem of the associated reduced-order Galerkin dynamical system, an instability appears over time. Many solutions have been proposed in the literature to tackle this difficulty while keeping the reduced-order approximation in the linear space spanned by the POD basis functions: the Petrov-Galerkin technique, the least-squares minimization of the equation residuals, the variational finite element method, etc. Recently, nonlinear approximations of the solution fields in a manifold of reduced dimension have started to gain importance in the literature. In this case, the reduced-order model is said to be a nonlinear projection-based reduced model. Some authors have introduced nonlinear approximations using Deep Learning approaches for projection-based reduced models. More classical nonlinear approximations based on the kernel POD technique can also be found in the literature.

In what follows, we focus on Deep Learning projection-based reduced models from the literature and their different possible formulations. More precisely, we consider Deep AutoEncoders (DAEs) from the field of Deep Learning. DAEs are artificial neural networks formed of layers of spatial convolutions, nonlinear activation functions and affine maps called fully connected layers. These architectures are used to perform nonlinear dimensionality reduction through unsupervised data compression. Hence, the DAE determines latent features within a given set of inputs.
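As a minimal illustration (a sketch only, not the architectures used in the cited works), the following PyTorch model encodes a discretized field of dimension N into n latent features and decodes it back; the encoder and decoder correspond to the mappings h and g discussed next, and all sizes are assumed for the example.

```python
import torch
import torch.nn as nn

class DenseAutoEncoder(nn.Module):
    """Minimal fully connected auto-encoder: u in R^N -> latent alpha in R^n -> u_hat in R^N."""
    def __init__(self, N, n, width=128):
        super().__init__()
        self.encoder = nn.Sequential(         # h : R^N -> R^n
            nn.Linear(N, width), nn.ReLU(),
            nn.Linear(width, n))
        self.decoder = nn.Sequential(         # g : R^n -> R^N
            nn.Linear(n, width), nn.ReLU(),
            nn.Linear(width, N))

    def forward(self, u):
        return self.decoder(self.encoder(u))  # g o h (u) ~ u

# Training minimizes the reconstruction error on a batch of snapshots (placeholder data).
model = DenseAutoEncoder(N=10_000, n=4)
snapshots = torch.rand(64, 10_000)            # placeholder snapshot batch
loss = nn.functional.mse_loss(model(snapshots), snapshots)
loss.backward()
```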

We denote by h and g respectively the encoder mapping and the decoder mapping of a DAE. In general, g is the transpose mapping of h. We denote by \(\widehat{\alpha }\) the reduced latent features inferred by h. The dimension of the latent features is equal to the intrinsic dimensionality of the manifold, as stated in Remark 2.1 in [63]. This intrinsic dimensionality is, in the present case, the dimension \(N_{\mu }\) of the vector of parameters \(\mu \), which may also include the time variable.

We note that the reduced latent features of a DAE are not parameterized variables in general. In other words, they can be seen as non-parametric features associated with a given set of inputs. This formulation is interesting in the framework of projection-based model reduction, where the associated parameters are given straightforwardly by the physical equations. Hence, knowing the variable parameters within the input data only helps with the determination of the intrinsic dimension of the manifold.

In the literature, we find two different formulations for DAE projection-based reduced models.

The first formulation was proposed by Kashima [53] and Hartman et al. [43]. It is very analogous to the Galerkin formulation of projection-based reduced models: given h and g such that \(g\circ h: \widetilde{u} \xrightarrow {} \widehat{u}\) and \(\widehat{u}\approx \widetilde{u}\), the reduced model is formulated as follows:

$$\begin{aligned} \frac{\partial }{\partial t}\widehat{\alpha }(\mu ) &= h \circ f \circ g \,(\widehat{\alpha }(\mu )), \qquad \widehat{\alpha }(t=0) = h\left( \widetilde{u}(t=0)\right) \end{aligned}$$
(2.69)
$$\begin{aligned} &= h\left( f(g(\widehat{\alpha }(\mu )),\mu )\right) . \end{aligned}$$
(2.70)

In the above formulation, the authors relied on the following three points in order to set the time derivative of the latent features equal to the right-hand side of Eq. (2.69).

  • \(\frac{\partial }{\partial t}g(\widehat{\alpha }(\mu ))\) belongs to the manifold described by \(\mu \xrightarrow {} \widetilde{u}(\mu )\),

  • \(h\left( \frac{\partial }{\partial t}g(\widehat{\alpha }(\mu ))\right) =\frac{\partial }{\partial t}h\left( g(\widehat{\alpha }(\mu ))\right) \),

  • \(h\circ g = I_{N_{\mu }}\).

Remark 2.1

The first two items above are hypotheses that are fulfilled in the case where h and g are linear or affine functions. The last item is fulfilled theoretically by the input data compression obtained through the optimisation of the parameters of the DAE architecture.
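A minimal sketch of this first formulation, assuming callables `h`, `g` and `f` (the encoder, the decoder and the right-hand side of Eq. (2.68)) are available, integrates the latent dynamics of Eq. (2.69) with an explicit Euler scheme:

```python
import numpy as np

def integrate_latent_dynamics(h, g, f, u0, mu, dt, n_steps):
    """Explicit Euler integration of d(alpha)/dt = h(f(g(alpha), mu)), Eq. (2.69).

    h, g : encoder/decoder mappings (numpy arrays in, numpy arrays out)
    f    : right-hand side of the full-order dynamics, Eq. (2.68)
    u0   : initial high-fidelity state, used to set alpha(t=0) = h(u0)
    """
    alpha = h(u0)                                # latent initial condition
    history = [alpha]
    for _ in range(n_steps):
        alpha = alpha + dt * h(f(g(alpha), mu))  # encode the full-order velocity
        history.append(alpha)
    return np.array(history)
```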

The second formulation of DAE projection-based reduced models is proposed in [63], where a least-squares minimisation of the residual of the parabolic PDE, induced by the decoder approximation, is performed. The reduced model is then formulated as follows in order to determine the reduced latent features:

$$\begin{aligned} \frac{\partial }{\partial t}\widehat{\alpha }(\mu ) = \underset{\widehat{v}(\mu )\in \mathbb {R}^{N_{\mu }}}{\textrm{argmin}}\left\| J(\widehat{\alpha }(\mu ))\,\widehat{v}(\mu )-f(g(\widehat{\alpha }(\mu )),\mu )\right\| ^{2}_{2} , \end{aligned}$$
(2.71)

where \(\widehat{v}(\mu )\) is a candidate time derivative of \(\widehat{\alpha }(\mu )\), \(\left\| \cdot \right\| _{2}\) denotes the Euclidean norm, and J is the Jacobian matrix of the decoder mapping, whose columns belong to the tangent space of the solution manifold at the given point. J is expressed as follows:

$$\begin{aligned}J: \widehat{\alpha } \xrightarrow {} \nabla g(\widehat{\alpha }).\end{aligned}$$

In this second formulation, the authors do not assume that the velocity of the decoder approximation lies in the solution manifold, since mathematically it belongs to the tangent space of the manifold at a given point. Hence, they claim that encoding this velocity, as in the first formulation, can produce a poor approximation by the reduced model.
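A sketch of this least-squares formulation (Eq. (2.71)), assuming a differentiable decoder `g` implemented in PyTorch and a callable right-hand side `f` (both placeholders), could read:

```python
import torch

def latent_velocity_lstsq(g, f, alpha, mu):
    """Solve Eq. (2.71): find v minimizing || J(alpha) v - f(g(alpha), mu) ||_2^2,
    where J is the Jacobian of the decoder g at the latent state alpha."""
    J = torch.autograd.functional.jacobian(g, alpha)       # shape (N, n)
    rhs = f(g(alpha), mu)                                   # full-order velocity, shape (N,)
    v = torch.linalg.lstsq(J, rhs.unsqueeze(-1)).solution   # least-squares solve
    return v.squeeze(-1)                                    # d(alpha)/dt
```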

2.4.2 Piecewise Linear Dimensionality Reduction via Dictionary-Based ROM-Nets

Parts of this section have been inspired by the authors' previous work [30].

Piecewise linear manifold learning means that the solution manifold is approximated by a dictionary of local linear subspaces, as illustrated in Fig. 2.4, where we denote \(\mathcal {M}\) the solution manifold.

Fig. 2.4
Piecewise linear manifold learning: the variability of \(\mu \) in the parameter space has a large influence on \(u\), yielding a large solution manifold \(u(\mu )\) in the solution space

The solution manifold is partitioned to get a collection of subsets \(\mathcal {M}_k \subset \mathcal {M}\) that can be covered by a dictionary of low-dimensional subspaces, enabling the use of local linear ROMs. If \(\{ \mathcal {M}_k \}_{k \in [\![ 1;K ]\!]}\) is a partition of \(\mathcal {M}\), then:

$$\begin{aligned} \forall k \in [\![ 1;K ]\!], \ \forall N \in \mathbb {N}^{*}, \quad d_{N}(\mathcal {M}_k) \le d_{N}(\mathcal {M}). \end{aligned}$$
(2.72)

The concepts of ROM-net and dictionary-based ROM-net, which we present in this section, are introduced in [27]. Suppose we have at our disposal an already computed dictionary of ROMs for the parametrized problem (2.4), where each element of the dictionary is a ROM that can approximate the problem on a subset of the solution manifold \(\mathcal {M}\). A dictionary-based ROM-net is a machine learning algorithm trained to assign the parameter \(\mu \in \mathcal {X}\) to the ROM of the dictionary leading to the most accurate reduced prediction. This assignment, called model recommendation in [77], is a classification task, see Fig. 2.5.

Fig. 2.5
Exploitation phase of a dictionary-based ROM-net: for a new parameter \(\mu \), the classifier \(\mathcal {C}_K\) recommends the local ROM \(\mathcal {C}_K(\mu )\) among the K available ROMs, which is then used in the simulation to predict the quantity of interest \(Z(\mu )\)

The dictionary of ROMs is constructed in a clustering stage, during which snapshots are regrouped depending on their respective proximity on \(\mathcal {M}\), in the sense of a particular dissimilarity measure introduced in [29] and [26]. The dissimilarity between two parameter values \(\mu , \mu '\in \mathcal {X}\), denoted by \({\delta }(\mu , \mu ')\), involves the sine of the principal angles between subspaces associated with the solutions of the HFM \(u(\mu ), u(\mu ')\in \mathcal {M}\), see [26, Definition 4.10]. Applying a k-medoids clustering algorithm on the solution manifold \(\mathcal {M}\) using the dissimilarity \(\delta \) leads to an optimal partitioning for a dictionary of local ROMs, in a sense introduced in [26, Property 4.13]. We refer to the remainder of [26] for the description of a practical efficiency criterion of the dictionary-based ROM-net, which makes it possible to decide, before the computationally costly steps of the workflow, whether a dictionary of ROMs is preferable to one global ROM, and how to calibrate the various hyperparameters of the ROM-net.
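As an illustration of the clustering stage, the sketch below runs a naive k-medoids algorithm directly on a precomputed dissimilarity matrix; it is a simplified stand-in under the assumption that the dissimilarity \(\delta (\mu , \mu ')\) of [26, 29] has already been evaluated for all snapshot pairs, and it is not the exact algorithm used in the cited works.

```python
import numpy as np

def k_medoids(D, K, n_iter=50, seed=0):
    """Naive k-medoids clustering from a precomputed dissimilarity matrix D (m x m)."""
    rng = np.random.default_rng(seed)
    m = D.shape[0]
    medoids = rng.choice(m, size=K, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)  # assign each snapshot to its closest medoid
        new_medoids = medoids.copy()
        for k in range(K):
            members = np.where(labels == k)[0]
            if members.size:                       # medoid = member minimizing total dissimilarity
                new_medoids[k] = members[np.argmin(D[np.ix_(members, members)].sum(axis=1))]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return labels, medoids
```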

Remark 2.2

Importance of the classification. One could argue that the classification step could be replaced by choosing the cluster k for which the dissimilarity measure \({\delta }(\mu , \tilde{\mu }_k)\) between the parameter \(\mu \) and the cluster medoid \(\tilde{\mu }_k\) is the smallest. However, we recall that the computation of this dissimilarity measure requires solving the HFM at the parameter value \(\mu \), which would render the complete model reduction framework useless. Hence, the classification step makes it possible to bypass this HFM solve and directly recommend the appropriate local ROM.

As briefly mentioned in the introduction of Sect. 2.4, local ROMs can be constructed by partitioning the parameter space [33, 34], in which case the classification step is not required: the cluster assignment is made by computing distances directly in the parameter space. In other cases, partitioning in the solution space can be considered without requiring a classification step [10]. Consider a time-dependent problem where the initial condition is not a parameter of the problem, and suppose the clustering distance in the solution space can be computed efficiently from the reduced solution at the previous time step. Then, local basis assignment and switching are possible without requiring classification.

The training of the classifier can be difficult when working with physical fields: simulations are costly, data are high-dimensional, and classical data augmentation techniques for images cannot be applied. Hence, we can consider replacing the HFM by an intermediate-fidelity solver for generating the data needed to train the classifier, using coarser meshes and fewer time steps. We point out that the HFM should still be used for generating the data required to train the local ROMs. We propose in [28] improvements for the training of the classifier in our context, by developing a fast variant of the mRMR [83] feature selection algorithm, and new class-conserving transformations of our data, acting like a data augmentation procedure.
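A sketch of the model recommendation step, assuming a feature representation of the parameters and the cluster labels from the clustering stage are available (both are placeholders below), can use any off-the-shelf classifier, for instance a random forest:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder training data: one feature vector per training parameter value,
# and the cluster label obtained from the k-medoids clustering stage.
X_train = np.random.rand(200, 10)             # features describing mu (assumed)
y_train = np.random.randint(0, 4, size=200)   # labels for K = 4 local ROMs (assumed)

classifier = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)

# Online phase: recommend a local ROM for a new parameter value without any HFM solve.
mu_new_features = np.random.rand(1, 10)
recommended_rom = int(classifier.predict(mu_new_features)[0])
```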

2.5 Iterative and Greedy Strategies

For the sake of the presentation, we have separated the offline and online phases, where the Reduced-Order Model is learnt and then exploited. Actually, more involved strategies exist, where the ROM is constructed in an iterative fashion. The Reduced Basis Method [84] is a greedy method, where the ROB is initialized with a single snapshot, corresponding to a randomly chosen parameter value, and is then enriched with the snapshot at the parameter value that maximizes the error made by the current ROM. For complex parameter dependencies, the hyper-reduction scheme can be constructed simultaneously as the ROB grows, see [31] for such a scheme with the EIM as hyper-reduction. This greedy construction has been extended to time-dependent problems in the POD-greedy method [41], and simultaneous hyper-reduction constructions have also been proposed [88].
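A schematic greedy loop is sketched below; it assumes callables `solve_hfm` and `error_estimate` are available and is only an outline of the idea, not the certified Reduced Basis machinery of [84].

```python
import numpy as np

def greedy_reduced_basis(solve_hfm, error_estimate, candidate_mus, n_max, tol):
    """Greedy construction of a reduced-order basis (ROB).

    solve_hfm(mu)           -> high-fidelity snapshot, shape (N,)
    error_estimate(ROB, mu) -> estimated ROM error at mu for the current ROB
    """
    rng = np.random.default_rng(0)
    mu0 = candidate_mus[rng.integers(len(candidate_mus))]  # random initial parameter
    ROB = solve_hfm(mu0)[:, None]                          # first basis vector
    for _ in range(n_max - 1):
        errors = [error_estimate(ROB, mu) for mu in candidate_mus]
        worst = int(np.argmax(errors))                     # parameter with the largest estimated error
        if errors[worst] < tol:
            break
        snapshot = solve_hfm(candidate_mus[worst])         # enrich with a new snapshot
        ROB, _ = np.linalg.qr(np.column_stack([ROB, snapshot]))  # re-orthonormalize the ROB
    return ROB
```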

Such iterative strategies rely on an efficient computation of the error made by the ROM. Error estimation is investigated in the next chapter.