1 Introduction

The aim of this paper is to study the accuracy of the piecewise linear finite element method for the two-dimensional scalar Signorini problem

$$\begin{aligned} \begin{aligned} \begin{array}{ll} - \varDelta u + u= f &{}\quad \text { in } \varOmega , \\ \partial _n u \ge 0, \quad u \ge 0,\quad u \partial _n u = 0 &{}\quad \text { on } \partial \varOmega \end{array} \end{aligned} \end{aligned}$$
(1)

on a convex polygonal domain \(\varOmega \) (in our analysis, for simplicity, assumed to be the unit square—see Sect. 2 for the precise assumptions). Using a Céa-type lemma, a supercloseness result, and a non-standard duality argument that is based on ideas of Mosco, we establish new \(W^{1,p}\)- and \(L^p\)-error estimates for the problem (1) that, in view of the \(W^{2,p}\)- and \(H^s\)-regularity properties of the exact solution u, are optimal for right-hand sides \(f \in L^\infty (\varOmega )\). In particular, we prove an \(L^4\)-error estimate of the form \(\Vert u - u_h\Vert _{L^4(\varOmega )} = \mathcal {O}(h^{2-\varepsilon })\) for all \(\varepsilon > 0\) which explains the order of convergence in the lower \(L^p\)-norms that is typically observed in numerical experiments, cf. Sect. 7 and [41, Section 7]. For the main contributions of this paper, we refer the reader to the regularity result in Theorem 2.3, the supercloseness result in Theorem 3.6, and the a priori finite element error estimates collected in Theorems 4.3 and 6.1.

Before we begin with our analysis, let us give some background: As one of the simplest examples of a problem that models contact, the Signorini problem (1) (along with its various reformulations and the closely related obstacle problem) has been subject to active research for a long period of time. In the context of finite element methods, early contributions on (1) and its approximation can be traced back at least to the nineteen-seventies, see [8, 9, 24, 33, 37], and even though these seminal works have been followed by a large number of other papers, the analysis of, for instance, FE-error estimates for (1) still receives considerable attention to this day. See, e.g., [16, 21, 23, 41] for some recent contributions. The main reason for the ongoing interest in the problem (1) and the fact that, even after more than forty years, the approximation properties of the finite element method for (1) are still not fully understood is that the weak formulation of (1) takes the form of an elliptic variational inequality. This causes the differences \(u - u_h\) between the continuous solution u of (1) and its finite element approximations \(u_h\) to lack the property of Galerkin orthogonality and renders standard tools for the error analysis of finite element methods inapplicable. We remark that a notable exception to this rule are \(L^\infty \)-estimates which can be established along roughly the same lines as for elliptic PDEs in the case of the Signorini problem (1) by employing the discrete maximum principle or regularization techniques, cf. [4, 11, 13, 15, 18, 22, 28, 34, 35].

For error estimates in the \(H^1\)-norm, which, as the energy norm of the problem, is the most natural choice for the analysis of (1), the lack of Galerkin orthogonality proved a challenge that could not be properly overcome for a considerable amount of time. Compare, for instance, with the contributions [5, 8, 9] in this context, which all require additional assumptions on the structure of the contact set \(\{x \in \partial \varOmega \mid u(x) = 0\}\) of the exact solution u to establish \(H^1\)-error estimates of optimal order for the Signorini problem. Only very recently, it has been shown in [16] that these conditions are, in fact, not needed and that the \(H^1\)-norm of \(u - u_h\) is indeed always of order \(\mathcal {O}(h)\) when (1) is discretized with standard piecewise linear finite elements (i.e., as in Sect. 2.3) and u possesses the natural regularity \(u \in H^2(\varOmega )\) (cf. Proposition 2.1). We remark that, for other discretization schemes as, e.g., primal-dual approaches or Nitsche’s method, similar results on the \(H^1\)-error have recently also been obtained in [21]. See [10] for a survey article on this topic.

While \(H^1\)- and \(L^\infty \)-error estimates for obstacle- and Signorini-type problems have been discussed quite extensively by various authors, results on the finite element error in other norms are only very rarely addressed in the literature. See, e.g., [12, 31, 33, 40, 41, 43] for some of the few contributions on this topic. The reason for this is that, for error estimates in general \(L^p\)- and \(W^{1,p}\)-norms, the missing Galerkin orthogonality of the differences \(u - u_h\) is an even more severe problem than in the \(H^1\)-case. This becomes apparent, for instance, in the study of the \(L^2\)-error: Recall that, for linear elliptic partial differential equations, \(L^2\)-error estimates of optimal order are typically established by means of the so-called Aubin–Nitsche trick. This trick is based on the idea to consider a dual partial differential equation, which contains the primal error \(u - u_h\) as a right-hand side, and requires three main ingredients: an error estimate of optimal order in the \(H^1\)-norm, the Galerkin orthogonality of the finite element approximations, and the \(H^2\)-regularity of the dual solution, see [6, Section 7], [14, Section 3.2] or other standard references. Extending such a duality argument to the Signorini problem (1), where Galerkin orthogonality is not available, is clearly far from trivial. Nevertheless, since the derivation of the first \(H^1\)-error estimates for elliptic variational inequalities with unilateral constraints in the nineteen-seventies, several authors have tried to accomplish precisely that, see [31, 33, 40, 43]. The approaches that have been proposed in this context are typically based on the idea to consider a suitably defined dual variational inequality that, due to its construction, allows to bypass the lack of Galerkin orthogonality of the primal error \(u - u_h\). 
Unfortunately, at least to the best of the authors’ knowledge, none of the contributions published so far has been able to simultaneously also satisfy the third prerequisite of the Aubin–Nitsche trick, namely, to show that the dual solution possesses enough regularity for the classical duality argument to go through. Compare, e.g., with [33] and [43, Section 5.2] in this context, where the \(H^2\)-regularity of the dual solution is used as an assumption, or with [40] where it is implicitly assumed that the dual problem possesses a sufficiently regular solution, satisfies a constraint qualification, and admits sufficiently regular multipliers. Approaches that follow different lines to establish \(L^2\)-error estimates, on the other hand, typically yield orders of convergence that are far from optimal. See, e.g., [41], where an error estimate in the \(H^{1/2}\)-norm is derived on the boundary and subsequently used to establish an \(L^2\)-estimate of order \(\mathcal {O}(h^{3/2-\varepsilon })\) for all \(\varepsilon > 0\). The results on the \(L^2\)-error (and the error in the lower \(L^p\)-norms in general) available in the literature are thus not very satisfactory and prove to be very unnatural in view of the required regularity assumptions on the involved primal and dual quantities. Even less seems to be known about \(W^{1,p}\)-error estimates for the Signorini problem with \(p \ne 2\). At least to the authors’ best knowledge, there are no contributions on this topic.
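For orientation, the classical Aubin–Nitsche argument recalled above can be sketched as follows for a linear problem without contact constraints. This is the standard textbook computation, not part of the analysis of this paper, and the symbols y, \(y_h\), e, z, and \(I_h\) are introduced here for illustration only. Let y solve \(-\varDelta y + y = f\) in \(\varOmega \) with \(\partial _n y = 0\) on \(\partial \varOmega \), let \(y_h\) be its piecewise linear Galerkin approximation, set \(e := y - y_h\), and let z be the solution of the dual problem \((w, z)_{H^1(\varOmega )} = (e, w)_{L^2(\varOmega )}\) for all \(w \in H^1(\varOmega )\), where \((\cdot , \cdot )_{H^1(\varOmega )}\) denotes the \(H^1\)-scalar product. If \(I_h\) denotes the nodal interpolation operator, then

$$\begin{aligned} \Vert e\Vert _{L^2(\varOmega )}^2 = (e, z)_{H^1(\varOmega )} = (e, z - I_h z)_{H^1(\varOmega )} \le C h \Vert e\Vert _{H^1(\varOmega )} \Vert z\Vert _{H^2(\varOmega )} \le C h \Vert e\Vert _{H^1(\varOmega )} \Vert e\Vert _{L^2(\varOmega )}. \end{aligned}$$

The second equality is the Galerkin orthogonality, the last inequality is the \(H^2\)-regularity of the dual solution, and inserting the \(H^1\)-estimate \(\Vert e\Vert _{H^1(\varOmega )} \le C h\) after division by \(\Vert e\Vert _{L^2(\varOmega )}\) yields the order \(\mathcal {O}(h^2)\). Each of the three ingredients enters exactly once, so the argument breaks down as soon as one of them is unavailable.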

The purpose of the present paper is to demonstrate that, in the situation of (1), it is indeed possible to derive finite element error estimates of optimal order in non-energy norms for the Signorini problem while working only with reasonable assumptions on primal quantities. To be more precise, in what follows, we show that, if u enjoys a composite \(W^{2,p}\)- and \(H^s\)-regularity that can be proved to hold in various situations (see Theorem 2.3), then it is possible to establish error estimates in \(W^{1,p}(\varOmega )\), \(L^\infty (\varOmega )\), \(W^{1,\infty }(\varOmega )\), and \(H^{1/2}(\partial \varOmega )\) for (1) that are optimal for problems with \(L^\infty (\varOmega )\)-right-hand sides (see Corollaries 4.1 and 4.2 and Theorem 4.3). Under the additional assumption that the contact set of the continuous solution u is sufficiently regular and that the contact sets of the finite element approximations \(u_h\) do not degenerate in the limit \(h \searrow 0\) (and thus in a setting comparable to those in [5, 8, 9]; see conditions (A) and \((A_h)\) in Sects. 2 and 5), we are moreover able to extend the classical Aubin–Nitsche trick to (1) and to establish an \(L^4\)-estimate of the form \(\Vert u - u_h\Vert _{L^4(\varOmega )} = \mathcal {O}(h^{2-\varepsilon })\) for all \(\varepsilon > 0\). In combination with our previous findings, this provides us with a complete set of optimal-order finite element error estimates for problems (1) with \(L^\infty (\varOmega )\)-right-hand sides that does not require any artificial assumptions on the regularity properties of dual quantities (see Theorems 4.3 and 6.1).

The method of proof that we use for our finite element error analysis is somewhat non-standard in that it does not rely on a multiplier reformulation of (1) but on certain one-sided approximation results that are apparently only rarely employed in the literature. For a problem on the unit square \(\varOmega = (0, 1)^2\), whose right-hand side f is in \(L^\infty (\varOmega )\) and whose solution u has a sufficiently regular contact set, our approach can essentially be summarized as follows:

Using an elementary argument, we prove that the \(H^1\)-error between the finite element approximation \(u_h\) and the Ritz projection \(R_h(u)\) of the exact solution u is smaller than the \(H^1\)-norm of every finite element function \(w_h\) that satisfies \(R_h(u) - u \le w_h \le R_h(u)\) on \(\partial \varOmega \) (see Lemma 3.4). This best approximation property yields, in tandem with results on unilateral finite element approximations (see Lemma 3.5) and the regularity properties of solutions to (1) (see Theorem 2.3), that \(\Vert u_h - R_h(u)\Vert _{H^1(\varOmega )} = \mathcal {O}(h^{3/2-\varepsilon })\) holds for all \(\varepsilon \in (0, 1/2)\) (see Theorem 3.6). By exploiting this supercloseness property, inverse estimates, and standard results for the Ritz projection, we immediately arrive at error estimates of optimal order in \(W^{1,4}(\varOmega )\), \(W^{1,\infty }(\varOmega )\), \(L^\infty (\varOmega )\), and \(H^{1/2}(\partial \varOmega )\) (see Theorem 4.3 and Remark 4.4). To study the error in the lower \(L^p\)-norms, we follow an approach of Mosco and consider two dual problems, one for each of the components \(\max (0, u - u_h)\) and \(\min (0, u - u_h)\) (see Sect. 5). As we will see, the solutions of our dual variational inequalities suffer from the same regularity problems as those in [31, 33, 43] and cannot be expected to be elements of \(H^2(\varOmega )\). However, by invoking the results of [19, 20], we can show that \(W^{2, 4/3 -\varepsilon }\)-regularity for all \(\varepsilon > 0\) is obtainable instead. This observation and the fact that \(q := 4/3\) is precisely the Hölder conjugate of \(p:= 4\) allow us to invoke our \(W^{1,4}\)-estimate to compensate the lack of regularity of the dual solutions and to arrive at an estimate of the type \(\Vert u - u_h\Vert _{L^4(\varOmega )} = \mathcal {O}(h^{2-\varepsilon })\) for all \(\varepsilon > 0\) (see Theorem 6.1).
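The step from the supercloseness property to the estimates in the stronger norms can be sketched as follows. The sketch assumes the standard inverse estimate for piecewise linear functions on quasi-uniform meshes in two dimensions; the abbreviation \(v_h\) is introduced here for illustration, and the constants are not taken from the later sections. For \(v_h := u_h - R_h(u)\) and \(2 \le p \le \infty \), one obtains

$$\begin{aligned} \Vert v_h\Vert _{W^{1,p}(\varOmega )} \le C h^{2/p - 1} \Vert v_h\Vert _{H^1(\varOmega )} \le C h^{1/2 + 2/p - \varepsilon }, \end{aligned}$$

i.e., the order \(1 - \varepsilon \) for \(p = 4\) and the order \(1/2 - \varepsilon \) for \(p = \infty \). The triangle inequality \(\Vert u - u_h\Vert \le \Vert u - R_h(u)\Vert + \Vert v_h\Vert \) then reduces the error analysis in these norms to known estimates for the Ritz projection.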

We would like to point out that the \(H^{1/2}(\partial \varOmega )\)-error estimate that we establish in Theorem 4.3 reproduces [41, Theorem 2.2] under slightly different assumptions on the regularity of the exact solution u (or the right-hand side f, respectively). Further, Theorem 4.3 shows that the order of convergence \(3/2 - \varepsilon \) that has been obtained in [41, Corollary 5.8] in \(L^2(\varOmega )\) is, in fact, even achieved in the \(L^\infty \)-norm. Surprisingly, we obtain this \(L^\infty \)-result without ever invoking the discrete maximum principle (which is normally used to prove pointwise error estimates for variational inequalities with unilateral constraints) and without the related assumptions on the underlying triangulation, cf. [4, 11, 13, 15, 18, 28, 34]. Theorem 6.1 finally improves the order of convergence in [41, Corollary 5.8] by the factor \(h^{1/2}\) and yields an \(L^4\)-error estimate that is optimal. To the best of our knowledge, the \(L^p\)- and \(W^{1,p}\)-error estimates derived in this paper are new. Further, the duality argument in Sect. 5 seems to be the first of its kind that actually works without artificial assumptions on the regularity properties of dual quantities, cf. [31, 33, 43]. Compare also with the analysis and the counterexamples in [12] in this context, which demonstrate that the assumptions on the dual solution made in [31, 33, 40, 43] are indeed unrealistic and cannot be expected to hold for the classical obstacle problem and which, in combination with our positive results in Sect. 5, show that it makes a huge difference for the behavior of the finite element error in the lower \(L^p\)-norms whether the variational inequality under consideration involves inequality constraints on the boundary \(\partial \varOmega \) or in the interior of \(\varOmega \).

To help the reader navigate this paper, we conclude this introduction with a brief overview of the structure and the content of the following sections:

Section 2 is concerned with preliminaries. Here, we clarify the notation, state our precise assumptions, and collect several regularity results for the problem (1). In Sect. 3, we prove the Céa-type lemma and the supercloseness result that are at the heart of our error analysis. Section 4 addresses the consequences that the results of Sect. 3 have for the derivation of finite element error estimates. The main results of this section, Corollaries 4.1 and 4.2 and Theorem 4.3, contain various \(W^{1,p}(\varOmega )\)-, \(L^p(\varOmega )\)-, and \(H^{1/2}(\partial \varOmega )\)-estimates that cover a large variety of different situations. Section 5 is devoted to the analysis of the \(L^4\)-error in the case \(f \in L^\infty (\varOmega )\). Here, we extend the classical Aubin–Nitsche trick to (1) and prove that the continuous and the discrete solution satisfy \(\Vert u - u_h\Vert _{L^4(\varOmega )} = \mathcal {O}(h^{2-\varepsilon })\) for all \(\varepsilon > 0\) when the involved contact sets are sufficiently well-behaved. In Sect. 6, we summarize our results and give some concluding remarks. Section 7 finally contains numerical experiments that confirm our theoretical findings.

2 Notation, assumptions, and preliminaries

2.1 Basic notation

Throughout this paper, we use the standard notations \(L^p(\varOmega )\), \(C^{k, \gamma }(\varOmega )\), \(W^{s,p}(\varOmega )\), and \(H^s(\varOmega )\) for the Lebesgue-, Hölder-, and (fractional) Sobolev spaces on a bounded domain \(\varOmega \subset \mathbb {R}^2\). See, e.g., [1, 3, 17] for details on these spaces. The scalar products on \(L^2(\varOmega )\) and \(H^1(\varOmega )\) are denoted with \((\cdot , \cdot )_{L^2(\varOmega )}\) and \((\cdot , \cdot )_{H^1(\varOmega )}\), respectively, i.e.

$$\begin{aligned} (v_1, v_2)_{L^2(\varOmega )} := \int _\varOmega v_1 v_2 \mathrm {d}\mathcal {L}^2\quad \text {and}\quad \left( v_1 , v_2\right) _{H^1(\varOmega )} := \int _\varOmega \nabla v_1 \cdot \nabla v_2 + v_1 v_2 \mathrm {d}\mathcal {L}^2. \end{aligned}$$

With \(\mathcal {L}^k\) and \(\mathcal {H}^k\), we denote the k-dimensional Lebesgue and Hausdorff measure, and with \({\text {tr}}(\cdot )\) the classical trace operator, cf. [3]. For functions v with a continuous representative, we typically drop the prefix \({\text {tr}}\) and simply write v instead of \({\text {tr}}(v)\). Further, we use the symbols \({\text {cl}}(\cdot )\) and \(\partial \) to denote the topological closure and the boundary of a set, respectively, and the abbreviation \(B_{r}(x)\) to denote the closed ball of radius \(r > 0\) around an \(x \in \mathbb {R}^2\). With \(\mathcal {O}(\cdot )\), we denote the classical Landau symbol, and with C a generic constant which may change within an estimate but never depends on crucial quantities such as the mesh width. If we want to emphasize that C depends on a quantity \(\alpha \), then we write \(C = C(\alpha )\). Lastly, we define \(a^+ := \max (0, a)\) and \(a^- := \min (0, a)\) for all \(a \in \mathbb {R}\).

2.2 The continuous setting

As already mentioned in the introduction, the purpose of this paper is to study finite element error estimates for the two-dimensional scalar Signorini problem

$$\begin{aligned} \begin{aligned} \begin{array}{ll} - \varDelta u + u= f &{}\quad \text { in } \varOmega , \\ \partial _n u \ge 0, \quad u \ge 0,\quad u \partial _n u = 0 &{}\quad \text { on } \partial \varOmega . \end{array} \end{aligned} \end{aligned}$$
(2)

Here and in what follows, \(\varDelta \) and \(\partial _n\) denote the (distributional) Laplacian and the normal derivative, respectively, and \(f \in L^2(\varOmega )\) is a given right-hand side. To avoid obscuring the basic ideas of our analysis with technicalities and distinctions of cases and to reduce the notational overhead, throughout this paper, we always assume that \(\varOmega \) is the unit square, i.e., \(\varOmega := (0,1)^2\). Our arguments can be extended straightforwardly to other convex polygonal domains with obvious modifications and, depending on the largest interior angle of \(\varOmega \), possibly additional assumptions on the exponent p in the \(L^p\)- and \(W^{1,p}\)-error estimates. The same is true for other variants of Signorini’s problem as, e.g., the version

$$\begin{aligned} \begin{aligned} \begin{array}{ll} -\varDelta u = f \text { in }\varOmega ,\quad u = 0 \quad \text { on } \varGamma _D,\quad \partial _n u = 0 &{}\quad \text { on } \varGamma _N, \\ \partial _n u \ge 0, \quad u \ge 0,\quad u \partial _n u = 0 &{}\quad \text { on } \varGamma _S \end{array} \end{aligned} \end{aligned}$$
(3)

studied in [2] whose partial differential operator does not contain a term of order zero and which involves separate Dirichlet-, Neumann-, and Signorini-boundary parts \(\varGamma _D\), \(\varGamma _N\), and \(\varGamma _S\). For such problems, however, some care has to be taken since the \(H^2\)-regularity result in Proposition 2.1 and error estimates analogous to those in Lemma 3.2 are not directly available in the literature but first have to be established (under suitable assumptions on the angles between \(\varGamma _D\), \(\varGamma _N\), and \(\varGamma _S\)) by means of the techniques of [2, 36, 39]. We omit a detailed discussion of this topic to avoid overloading this paper. Lastly, we would like to point out that a Signorini problem of the type (2) with a sufficiently regular non-zero obstacle on the boundary can always be transformed into a problem with a vanishing obstacle by an elementary translation argument. This allows us to restrict our attention to the homogeneous situation in (2) without a great loss of generality.

To begin our analysis, we recall that the weak formulation of (2) is given by the elliptic variational inequality

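$$\begin{aligned} u \in K,\quad (u, v - u)_{H^1(\varOmega )} \ge (f, v - u)_{L^2(\varOmega )}\quad \forall v \in K \end{aligned}$$
(S)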

with the admissible set

$$\begin{aligned} K := \left\{ v \in H^1(\varOmega )\, \Big | \, {\text {tr}}(v) \ge 0 \ \mathcal {H}^1\text {-a.e. on }\partial \varOmega \right\} . \end{aligned}$$

From [19, Theorem 3.2.3.1, Example (3,2,3,1)], we further obtain:

Proposition 2.1

(S) admits a unique solution \(u \in H^2(\varOmega )\) for all \(f \in L^2(\varOmega )\).

Note that Proposition 2.1 and the Sobolev embeddings imply that u possesses a representative which is continuous on the closure \({\text {cl}}(\varOmega )\) of the domain \(\varOmega \). In what follows, we always use this representative when talking, e.g., about level sets. As it turns out, solutions u to (S) enjoy additional regularity properties when the contact set \(\{x \in \partial \varOmega \mid u(x) = 0\}\) is sufficiently well-behaved. To explore this effect, we introduce:

Definition 2.2

(Condition (A)) A solution \(u \in H^2(\varOmega )\) of (S) is said to satisfy condition (A) if the relative boundary of the contact set \(\{x \in \partial \varOmega \mid u(x) = 0\}\) in \(\partial \varOmega \) has one-dimensional Hausdorff measure zero and if the relative interior of the contact set in \(\partial \varOmega \) consists of at most finitely many connected components.

Under assumption (A), the solution u of the variational inequality (S) can be identified with the solution of a mixed Dirichlet–Neumann problem with segment-wise prescribed boundary conditions on a convex polygonal domain. This, together with the \(H^2\)-regularity result in Proposition 2.1 and the analysis in [19], allows us to prove the following improved regularity result for the Signorini problem (S) which clarifies precisely which regularity assumptions on u are reasonable when it comes to the derivation of finite element error estimates:

Theorem 2.3

(Improved regularity for Signorini’s problem) Suppose that \(u \in H^2(\varOmega )\) solves (S) and satisfies (A). Then, the following holds true:

  1. If f is in \(L^p(\varOmega )\) for some \(2< p < 4\), then \(u \in W^{2,p}(\varOmega )\).

  2. If f is in \(L^p(\varOmega )\) for some \(4< p < \infty \), then there exist functions \(u_s, u_r \in H^2(\varOmega )\) such that \(u = u_s + u_r\), \(u_r \in W^{2,p}(\varOmega )\), and \(u_s \in H^{5/2 - \varepsilon }(\varOmega )\) holds for all \(\varepsilon \in (0, 1/2)\), and such that the restriction of the trace of \(u_s\) to each of the four sides of the square \(\varOmega = (0,1)^2\) has \(W^{2,2 - \varepsilon }\)-regularity for all \(\varepsilon \in (0, 1/2)\).

Proof

Since u satisfies condition (A), we may find relatively open disjoint straight line segments \(\varGamma _i \subset \partial \varOmega \), \(i=1,\ldots , N+M\), \(N, M \in \mathbb {N}_0\), and a set \(R \subset \partial \varOmega \) of one-dimensional Hausdorff measure zero such that

$$\begin{aligned} \{x \in \partial \varOmega \mid u(x) = 0\} = \bigcup _{i=1}^{N} \varGamma _i \cup R \quad \text {and}\quad \partial \varOmega = \bigcup _{i=1}^{N + M} {\text {cl}}( \varGamma _i). \end{aligned}$$

From the variational inequality (S), the \(H^2\)-regularity of the solution u, and Green’s first identity, it follows further that

$$\begin{aligned} \begin{aligned} \begin{array}{ll} u \in K,\quad \int _\varOmega (- \varDelta u + u - f) (v - u) \mathrm {d}\mathcal {L}^2 + \int _{\partial \varOmega } (\partial _n u)(v - u)\mathrm {d}\mathcal {H}^1 \ge 0&\quad \forall v \in K. \end{array} \end{aligned} \end{aligned}$$

The above yields

$$\begin{aligned} \begin{aligned} \begin{array}{ll} - \varDelta u = f - u &{}\quad \mathcal {L}^2\text {-a.e. in }\varOmega , \\ u = 0 &{}\quad \mathcal {H}^1\text {-a.e. on } \varGamma _i \text { for all } i=1,\ldots ,N, \\ \partial _n u = 0 &{}\quad \mathcal {H}^1\text {-a.e. on } \varGamma _i \text { for all } i=N+1,\ldots , N + M. \end{array} \end{aligned} \end{aligned}$$
(4)

Note that the parts of the contact set that are contained in the line segments \(\varGamma _i\), \(i=N+1,\ldots , N + M\), are negligible here due to the properties of R. Let us suppose now that f is an element of \(L^p(\varOmega )\) for some \(2< p < 4\). Then, we may use [19, Theorem 4.4.3.7] to deduce that there exist real numbers \(c_{i}\) and trigonometric functions \(\phi _i\) such that

$$\begin{aligned} u - \sum _{i=1,\ldots , N+M } c_i \eta _i r_i^{1/2} \phi _i(\theta _i) \in W^{2, p}(\varOmega ) \end{aligned}$$
(5)

holds, where \(r_i \ge 0\) and \(\theta _i \in [0, 2\pi )\) denote local polar coordinates centered at the vertices \(x_i\), \(i=1,\ldots , N+M\), of the partition \(\{{\text {cl}}(\varGamma _i)\}\) of the boundary \(\partial \varOmega \), and where \(\eta _i\) is a smooth cut-off function which is identically one in a neighborhood of \(x_i\) for each i. We already know, however, that \(u \in H^2(\varOmega )\), and it is easy to check that the factor \(r_i^{1/2}\) prevents a function of the form \(\eta _i r_i^{1/2} \phi _i(\theta _i)\) from being an element of \(H^2(\varOmega )\). This implies that all \(c_i\) in (5) have to be zero and proves the first claim, cf. also the discussion in [41, Remark 2.1] and [32] in this regard. To obtain the second claim, we can proceed along exactly the same lines: If f is an element of \(L^p(\varOmega )\) for some \(4< p < \infty \), then we may use the same arguments as above and again [19, Theorem 4.4.3.7] to deduce that there exist real numbers \({\tilde{c}}_{i}\) and trigonometric functions \({{\tilde{\phi }}}_i\) with

$$\begin{aligned} u - \sum _{i=1,\ldots , N+M } {\tilde{c}}_i \eta _i r_i^{3/2} {{\tilde{\phi }}}_i(\theta _i) \in W^{2, p}(\varOmega ), \end{aligned}$$

where \(\eta _i\), \(r_i\), and \(\theta _i\) are as in (5). The functions

$$\begin{aligned} u_s:= \sum _{i=1,\ldots , N+M } {\tilde{c}}_i \eta _i r_i^{3/2} {{\tilde{\phi }}}_i(\theta _i),\qquad u_r := u - u_s, \end{aligned}$$
(6)

have all of the desired properties (see, for instance, [20, Theorem 1.2.18] for the \(H^{5/2-\varepsilon }\)-regularity of \(u_s\)). This completes the proof. \(\square \)
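The role of the exponents 1/2 and 3/2 in the above proof can be made explicit by a short computation in polar coordinates. The following is a sketch under the assumption that \(\phi \) is a smooth angular function for which the second derivatives of \(r^{\alpha } \phi (\theta )\) are of exact order \(r^{\alpha - 2}\) near \(r = 0\); the exponent \(\alpha \) and the radius \(\rho > 0\) are notation introduced here for illustration. For \(1 \le p < \infty \), one has

$$\begin{aligned} \int _0^{\rho } r^{p(\alpha - 2)}\, r \,\mathrm {d}r< \infty \quad \Longleftrightarrow \quad p(\alpha - 2) + 1> -1 \quad \Longleftrightarrow \quad \alpha > 2 - \frac{2}{p}. \end{aligned}$$

For \(\alpha = 1/2\) and \(p = 2\), the condition \(\alpha > 1\) is violated, so that a non-trivial term \(r^{1/2}\phi (\theta )\) cannot belong to \(H^2(\varOmega )\). For \(\alpha = 3/2\), the condition reads \(p < 4\), which is consistent with the fact that the terms \(r^{3/2}{{\tilde{\phi }}}(\theta )\) do not obstruct \(W^{2,p}\)-regularity for \(p < 4\) but have to be split off for \(p > 4\).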

Some remarks are in order regarding the last result and condition (A):

Remark 2.4

  1. Due to the presence of the singular part \(u_s\), the solution u of (S) can, in general, not be expected to possess \(W^{2,4}\)- or \(H^{5/2}\)-regularity even for smooth right-hand sides f. Compare, for instance, with the examples in [41] and Sect. 7 in this context. This is an important difference to the classical obstacle problem whose solution can be shown to be in \(W^{2,p}(\varOmega )\) for all \(2 \le p < \infty \) under appropriate assumptions on the problem data, see [25, Theorem IV-2.3].

  2. Assumptions similar to (A) are often implicitly made in the literature when it is discussed that the solution u of (S) can be expected to possess \(H^{5/2 - \varepsilon }(\varOmega )\)-regularity for all \(\varepsilon > 0\). Compare, e.g., with [41, Remark 2.1] and [5, Section 2] in this context, where it is supposed (but not explicitly stated) that each point in the relative boundary of the contact set admits an open neighborhood in which the solution u changes precisely once from contact to non-contact. (Note that the latter assumption implies in particular that the relative boundary of the contact set in \(\partial \varOmega \) is finite and is thus stronger than condition (A).) Theorem 2.3 is more precise in this regard in that it explicitly states which assumptions on the contact set are needed to rigorously prove improved regularity properties for (S) and further also quantifies which regularity can be expected for the “regular” part of the solution that remains when the singular contributions coming from the transition points on the boundary are subtracted from u.

  3. For problems of the type (3) with right-hand side \(f \equiv 0\) and an additional inhomogeneity on the boundary \(\partial \varOmega \), it is possible to rigorously prove that condition (A) is satisfied. See the recent contribution [2] for details.

  4. As already mentioned in the introduction, in the context of finite element error estimates for Signorini’s problem, conditions similar to (A) have been needed for the derivation of optimal-order finite element error estimates in the \(H^1\)-norm for a considerable amount of time. See, e.g., [5, 8, 9] for examples of contributions that require analogous assumptions on the contact set \(\{x \in \partial \varOmega \mid u(x) = 0\}\) of the continuous solution u. Only in 2015 was it finally shown in [16] that these conditions can be dropped and that \(H^1\)-error estimates of optimal order for (S) can be established that rely solely on standard Sobolev regularity properties. We would like to point out that, similar to the results in [16], the \(W^{1,p}(\varOmega )\)-, \(L^\infty (\varOmega )\)-, \(W^{1,\infty }(\varOmega )\)-, and \(H^{1/2}(\partial \varOmega )\)-error estimates that we derive in Sects. 3 and 4 for the problem (S) do not require condition (A), but only the regularity properties that are implied by it. Only the derivation of the \(L^4\)-error estimates in Sect. 5 will make explicit use of condition (A) (and a comparable condition in the discrete setting). In what follows, in each theorem, lemma etc., we will clearly state whether the respective result requires condition (A) or suitable regularity properties of the solution u of the problem (S).

2.3 The discrete setting

As discrete counterparts of the variational inequality (S), we consider problems of the form

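$$\begin{aligned} u_h \in K_h,\quad (u_h, v_h - u_h)_{H^1(\varOmega )} \ge (f, v_h - u_h)_{L^2(\varOmega )}\quad \forall v_h \in K_h. \end{aligned}$$
(S\(_h\))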

Our standing assumptions on the quantities in (S\(_h\)) are as follows:

Assumption 2.5

(Standing assumptions for the FE-discretization)

  1. \(\{\mathcal {T}_h\}_{0<h < 1/2}\) is a quasi-uniform family of triangulations of \(\varOmega = (0,1)^2\),

  2. \(V_h := \{ v \in C({\text {cl}}(\varOmega )) \, | \, v \text { is affine on all cells } T \in \mathcal {T}_h\}\),

  3. \(K_h := K \cap V_h = \{v_h \in V_h \, | \, v_h \ge 0 \text { on } \partial \varOmega \}\).

See, e.g., [11, Definition 2] or [7, Definition 4.4.13] for the definition of the term “quasi-uniform family of triangulations”. For brevity’s sake, in what follows, we often ignore the upper bound on the mesh width and simply write “for all \(h>0\)” instead of “for all \(0< h < 1/2\)”. From standard results such as [25, Theorem II-2.1], we obtain:

Proposition 2.6

(S\(_h\)) is uniquely solvable for all \(f \in L^2(\varOmega )\) and all \(h>0\).
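To illustrate the structure of the discrete variational inequality, the following minimal sketch solves a one-dimensional analogue, \(-u'' + u = f\) on (0, 1) with Signorini conditions at both end points, using piecewise linear elements and a projected Gauss–Seidel iteration. The one-dimensional setting, the constant right-hand side, the solver, and all function names are assumptions made for illustration; this is not the discretization or the solver used in this paper.

```python
import math


def assemble_1d(n, f):
    """P1 system for -u'' + u = f on (0,1), n cells, constant f (tridiagonal storage)."""
    h = 1.0 / n
    diag = [0.0] * (n + 1)   # diagonal of stiffness + mass matrix
    off = [0.0] * n          # off[i] couples nodes i and i+1
    load = [0.0] * (n + 1)
    for c in range(n):
        for node in (c, c + 1):
            diag[node] += 1.0 / h + h / 3.0
            load[node] += f * h / 2.0
        off[c] += -1.0 / h + h / 6.0
    return diag, off, load


def residual(diag, off, load, u):
    """Return A u - b; at the boundary nodes this plays the role of the multiplier."""
    n = len(off)
    res = []
    for i in range(n + 1):
        val = diag[i] * u[i] - load[i]
        if i > 0:
            val += off[i - 1] * u[i - 1]
        if i < n:
            val += off[i] * u[i + 1]
        res.append(val)
    return res


def solve_signorini_1d(n, f, sweeps=50000, tol=1e-12):
    """Projected Gauss-Seidel: only the two boundary nodes carry the constraint u >= 0."""
    diag, off, load = assemble_1d(n, f)
    u = [0.0] * (n + 1)
    for _ in range(sweeps):
        change = 0.0
        for i in range(n + 1):
            rhs = load[i]
            if i > 0:
                rhs -= off[i - 1] * u[i - 1]
            if i < n:
                rhs -= off[i] * u[i + 1]
            new = rhs / diag[i]
            if i == 0 or i == n:
                new = max(0.0, new)  # projection onto the admissible set
            change = max(change, abs(new - u[i]))
            u[i] = new
        if change < tol:
            break
    return u, residual(diag, off, load, u)


if __name__ == "__main__":
    u, r = solve_signorini_1d(16, -1.0)
    # For f = -1 both end points are in contact; the continuous solution is
    # u(x) = -1 + cosh(x - 1/2) / cosh(1/2).
    print("boundary values:", u[0], u[-1])
    print("boundary residuals:", r[0], r[-1])
    print("midpoint error:", abs(u[8] - (-1.0 + 1.0 / math.cosh(0.5))))
```

For \(f = -1\), both boundary nodes are in contact and the boundary residuals are positive, which mirrors the complementarity conditions \(u \ge 0\), \(\partial _n u \ge 0\), \(u \partial _n u = 0\); for \(f = 1\), the constraint is inactive and the iteration returns the unconstrained solution \(u \equiv 1\).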

In the remainder of this paper, our aim will be to study the approximation properties of the solution \(u_h\) of (S\(_h\)) for \(h \searrow 0\). The main ingredients of our error analysis are:

3 A Céa-type lemma and a supercloseness result

To study the error \(u - u_h\), we introduce the following operator:

Definition 3.1

(Ritz projection) For every \(v \in H^1(\varOmega )\), we define the Ritz projection \(R_h(v)\) to be the unique element of \(V_h\) with

$$\begin{aligned} \begin{aligned} \begin{array}{ll} \left( R_h(v), w_h\right) _{H^1(\varOmega )} = \left( v, w_h\right) _{H^1(\varOmega )}&\quad \forall w_h \in V_h. \end{array} \end{aligned} \end{aligned}$$

Note that \(R_h : H^1(\varOmega ) \rightarrow V_h\) is precisely the solution operator of the unconstrained problem associated with (S\(_h\)). In particular, \(R_h\) is well-defined, linear, and continuous, and we may invoke classical results to obtain:

Lemma 3.2

  1. For every \(v \in W^{2,p}(\varOmega )\), \(2 \le p < \infty \), we have

    $$\begin{aligned} \begin{aligned} \Vert v - R_h(v)\Vert _{L^p(\varOmega )} + h\Vert v - R_h(v)\Vert _{W^{1,p}(\varOmega )}&+ h^{1/p}\Vert v - R_h(v)\Vert _{L^p(\partial \varOmega )} \\&\quad \le C h^2\Vert v\Vert _{W^{2,p}(\varOmega )} \end{aligned} \end{aligned}$$
    (7)

    with some constant \(C > 0\) independent of h and v.

  2. If v satisfies \(v \in H^{5/2 - \varepsilon }(\varOmega )\) for all \(\varepsilon \in (0, 1/2)\), then, for every \(\varepsilon \in (0, 1/2)\), there exists a constant \(C > 0\) independent of h with

    $$\begin{aligned} \Vert v - R_h(v)\Vert _{H^{1/2}(\partial \varOmega )} \le C h^{3/2 - \varepsilon }. \end{aligned}$$

Proof

For the domain \(\varOmega = (0,1)^2\), the estimate (7) can be derived as follows: Consider an arbitrary but fixed \(v \in W^{1, \infty }(\varOmega )\) with associated Ritz projection \(R_h(v) \in V_h\). Then, we may use reflections to extend v and \(R_h(v)\) first to the rectangle \((0,1) \times (-1,2)\) and subsequently to the square \({{\tilde{\varOmega }}} := (-1,2)^2\) to construct functions \({\tilde{v}}, {\tilde{v}}_h \in W^{1, \infty }({{\tilde{\varOmega }}})\) with \({\tilde{v}}|_\varOmega = v\), \({\tilde{v}}_h|_\varOmega = R_h(v)\), and \({\tilde{v}}_h = {\tilde{R}}_h({\tilde{v}})\). Here, \({\tilde{R}}_h\) is the Ritz projection operator on the mesh of \({{\tilde{\varOmega }}}\) that is obtained from the reflection procedure. From the interior norm estimate [38, Theorem 1.2], we may now deduce that there exists a constant \(C>0\) independent of h with

$$\begin{aligned} \begin{aligned} \Vert v - R_h(v)\Vert _{W^{1, \infty }(\varOmega )}&= \Vert {\tilde{v}} - {\tilde{R}}_h({\tilde{v}})\Vert _{W^{1, \infty }(\varOmega )} \\&\le C \left( \Vert {\tilde{v}} \Vert _{W^{1, \infty }({{\tilde{\varOmega }}})} + \Vert {\tilde{v}} - {\tilde{R}}_h({\tilde{v}})\Vert _{L^2({{\tilde{\varOmega }}})} \right) . \end{aligned} \end{aligned}$$

Since \({\tilde{v}}\) has the same \(W^{1,\infty }\)-norm as v, the above, the triangle inequality, and the properties of \({\tilde{v}}\) and \({\tilde{R}}_h({\tilde{v}})\) imply that

$$\begin{aligned} \Vert R_h(v)\Vert _{W^{1, \infty }(\varOmega )} \le C \Vert v \Vert _{W^{1, \infty }(\varOmega )} \end{aligned}$$

holds with some \(C>0\) which does not depend on h. From the Theorem of Riesz-Thorin, cf. [7, Sections 14.1, 14.2], and the estimate \(\Vert R_h(v)\Vert _{H^1(\varOmega )} \le C \Vert v \Vert _{H^1(\varOmega )}\), it now follows straightforwardly that there exists a \(C>0\) independent of h with

$$\begin{aligned} \Vert R_h(v)\Vert _{W^{1, p}(\varOmega )} \le C \Vert v \Vert _{W^{1, p}(\varOmega )} \qquad \forall v \in W^{1,p}(\varOmega ) \qquad \forall p \in [2, \infty ]. \end{aligned}$$

The above estimate, the regularity results of [19], and exactly the same arguments as in the proofs of [36, Equations (1.8), (1.9)] yield that, for every \(2 \le p < \infty \), there exists a constant \(C >0\) independent of h with

$$\begin{aligned} \Vert v - R_h(v)\Vert _{L^p(\varOmega )} + h\Vert v - R_h(v)\Vert _{W^{1,p}(\varOmega )} \le C h^2\Vert v\Vert _{W^{2,p}(\varOmega )}\quad \forall v \in W^{2,p}(\varOmega ). \end{aligned}$$

It remains to prove the \(L^p(\partial \varOmega )\)-estimate. This, however, follows immediately from the last inequality and [19, Theorem 1.5.1.10] with parameter \(\varepsilon := h^p\). Note that the above argumentation only works for rectangles and squares. For more general domains \(\varOmega \), (7) can be obtained by employing the techniques of [36, 39]. To do so, however, one has to study in detail the behavior of certain Green’s functions in the vicinity of the corners of the domain under consideration, cf. the comments in [39, Section 3]. Such a study is beyond the scope of this paper.
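The scaling behind the \(L^p(\partial \varOmega )\)-estimate in (7) can be made explicit as follows. This is only a sketch: we assume that the trace inequality of [19, Theorem 1.5.1.10] takes the form \(\Vert w\Vert _{L^p(\partial \varOmega )}^p \le K ( \varepsilon ^{1 - 1/p} \Vert \nabla w\Vert _{L^p(\varOmega )}^p + \varepsilon ^{-1/p} \Vert w\Vert _{L^p(\varOmega )}^p )\) for all \(w \in W^{1,p}(\varOmega )\) and all \(\varepsilon \in (0,1)\) with a constant K independent of w and \(\varepsilon \), we use the shorthand \(w := v - R_h(v)\), and we assume \(h < 1\). Choosing \(\varepsilon := h^p\) and inserting the two interior estimates then gives

$$\begin{aligned} \begin{aligned} \Vert w\Vert _{L^p(\partial \varOmega )}^p&\le K \left( h^{p-1} \Vert \nabla w\Vert _{L^p(\varOmega )}^p + h^{-1} \Vert w\Vert _{L^p(\varOmega )}^p \right) \\&\le C \left( h^{p-1} h^{p} + h^{-1} h^{2p} \right) \Vert v\Vert _{W^{2,p}(\varOmega )}^p = C h^{2p - 1} \Vert v\Vert _{W^{2,p}(\varOmega )}^p, \end{aligned} \end{aligned}$$

so that \(h^{1/p}\Vert w\Vert _{L^p(\partial \varOmega )} \le C h^2 \Vert v\Vert _{W^{2,p}(\varOmega )}\), which is the boundary contribution in (7).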

To prove the second assertion of the lemma, we suppose that a function v with \(v \in H^{5/2 - \varepsilon }(\varOmega )\) for all \(\varepsilon \in (0, 1/2)\) is given and that \({\tilde{v}}_h\) is the unique element of \(V_h\) with

$$\begin{aligned} \int _\varOmega {\tilde{v}}_h - R_h(v) \mathrm {d}\mathcal {L}^2 = 0,\quad \int _\varOmega \nabla {\tilde{v}}_h \cdot \nabla w_h\mathrm {d}\mathcal {L}^2 = \int _\varOmega \nabla v \cdot \nabla w_h \mathrm {d}\mathcal {L}^2\quad \forall w_h \in V_h. \end{aligned}$$

By proceeding completely analogously to the proof of [29, Lemma 5.7] (with the Besov estimate in [29, Lemma 4.1] replaced with the second estimate in [26, Lemma 2.1]), we obtain that, for every \(\varepsilon \in (0, 1/2)\), there exists a \(C>0\) with

$$\begin{aligned} | v - {\tilde{v}}_h|_{H^{1/2}(\partial \varOmega )} \le C h^{3/2 - \varepsilon }, \end{aligned}$$
(8)

where \(|\cdot |_{H^{1/2}(\partial \varOmega )}\) denotes the \(H^{1/2}\)-seminorm on the boundary \(\partial \varOmega \) (as defined in [19, Sections 1.3.3, 1.5]). Testing the variational equality for \({\tilde{v}}_h\) with \({\tilde{v}}_h - R_h(v) \) and the variational equality for \(R_h(v)\) with \(R_h(v) - {\tilde{v}}_h\) and adding the resulting identities yields

$$\begin{aligned} \begin{aligned}&\int _\varOmega \nabla ({\tilde{v}}_h - R_h(v) )\cdot \nabla ({\tilde{v}}_h - R_h(v)) \mathrm {d}\mathcal {L}^2 = \int _\varOmega (v - R_h(v)) (R_h(v) - {\tilde{v}}_h ) \mathrm {d}\mathcal {L}^2. \end{aligned} \end{aligned}$$

From the inequality of Poincaré-Wirtinger and the first part of the lemma, we may now deduce that

$$\begin{aligned} \Vert {\tilde{v}}_h - R_h(v) \Vert _{H^1(\varOmega )} \le C \Vert v - R_h(v) \Vert _{L^2(\varOmega )} \le C h^2 \end{aligned}$$

holds with some \( C > 0\). If we combine the above with (8), the trace theorem, the triangle inequality, and the \(L^p(\partial \varOmega )\)-estimate in (7), then the claim follows immediately.\(\square \)

We may now make the following observation (that has already been made in [11, Lemma 10] for the classical obstacle problem):

Lemma 3.3

Suppose that u solves (S) for some \(f \in L^2(\varOmega )\). Then, \(R_h(u)\) is the unique solution of the variational inequality

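$$\begin{aligned} {\tilde{u}}_h \in {\tilde{K}}_h,\quad \left( {\tilde{u}}_h, v_h - {\tilde{u}}_h \right) _{H^1(\varOmega )} \ge \left( f, v_h - {\tilde{u}}_h \right) _{L^2(\varOmega )}\quad \forall v_h \in {\tilde{K}}_h \end{aligned}$$
(\(\tilde{S}_h\))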

with

$$\begin{aligned} {\tilde{K}}_h := \{v_h \in V_h \mid v_h \ge R_h(u) - u \text { on } \partial \varOmega \}. \end{aligned}$$

Proof

The problem (\(\tilde{S}_h\)) admits a unique solution \({\tilde{u}}_h\) by [25, Theorem II-2.1]. To see that this solution is precisely \(R_h(u)\), we note that \(R_h(u) \in {\tilde{K}}_h\) and that for all \(v_h \in {\tilde{K}}_h\), i.e., for all \(v_h \in V_h\) with \(v_h - R_h(u) + u \ge 0\) on \(\partial \varOmega \), the definition of \(R_h(u)\) and the variational inequality (S) yield

$$\begin{aligned} \begin{aligned} \left( R_h(u) , v_h - R_h(u) \right) _{H^1(\varOmega )}&= \left( u , v_h - R_h(u) \right) _{H^1(\varOmega )} \\&= \left( u , v_h - R_h(u) + u - u \right) _{H^1(\varOmega )} \\&\ge \left( f, v_h- R_h(u) + u - u \right) _{L^2(\varOmega )} \\&=\left( f, v_h- R_h(u) \right) _{L^2(\varOmega )}. \end{aligned} \end{aligned}$$

This proves the claim.\(\square \)

The above result shows that it suffices to study the error that occurs in the solution \(u_h\) of (S\(_h\)) when the original obstacle (i.e., the zero function) in (S\(_h\)) is replaced with the obstacle \(R_h(u) - u\) to relate the approximate solution \(u_h\) and the Ritz projection \(R_h(u)\) of the exact solution u to each other. By pursuing this approach, we obtain the following Céa-type result:

Lemma 3.4

(A Céa-type property) Let \(f \in L^2(\varOmega )\) be arbitrary but fixed, and let u and \(u_h\) denote the solutions of (S) and (S\(_h\)), respectively. Then, it holds

$$\begin{aligned} \begin{aligned}&\Vert R_h(u) - u_h\Vert _{H^1(\varOmega )} \\&\quad \le \inf \Big \{ \Vert w_h\Vert _{H^1(\varOmega )} \ \Big | \ w_h \in V_h, \ R_h(u) - u \le w_h \le R_h(u) \text { on } \partial \varOmega \Big \}. \end{aligned} \end{aligned}$$
(9)

Proof

Consider an arbitrary but fixed \(w_h \in V_h\) that is contained in the set on the right-hand side of (9). (Note that this set is not empty since it contains \(R_h(u)\).) Since \(w_h \ge R_h(u) - u\) on \(\partial \varOmega \) and since \(R_h(u)\) is the solution of (\(\tilde{S}_h\)), it holds

$$\begin{aligned} \left( R_h(u), v_h - R_h(u) \right) _{H^1(\varOmega )} \ge \left( f, v_h - R_h(u) \right) _{L^2(\varOmega )} \end{aligned}$$

for all \(v_h \in V_h\) with \(v_h \ge w_h\) on \(\partial \varOmega \). In particular, the choice \(v_h := u_h + w_h\) yields

$$\begin{aligned} \left( R_h(u), R_h(u) - u_h \right) _{H^1(\varOmega )} \le \left( R_h(u), w_h \right) _{H^1(\varOmega )} + \left( f, R_h(u) - u_h - w_h \right) _{L^2(\varOmega )}. \end{aligned}$$

On the other hand, we know that \(R_h(u) - w_h \ge R_h(u) - R_h(u) = 0\) on \(\partial \varOmega \). Thus, we may choose the test function \(v_h := R_h(u) - w_h\) in (S\(_h\)) to obtain

$$\begin{aligned} \left( u_h, u_h - R_h(u) \right) _{H^1(\varOmega )} \le \left( u_h, - w_h \right) _{H^1(\varOmega )} + \left( f, u_h + w_h - R_h(u)\right) _{L^2(\varOmega )}. \end{aligned}$$

By addition, it now follows that

$$\begin{aligned} \Vert R_h(u) - u_h \Vert _{H^1(\varOmega )}^2 \le \left( R_h(u) - u_h, w_h \right) _{H^1(\varOmega )}. \end{aligned}$$

This proves the claim. \(\square \)
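For completeness, the final step of the above proof can be spelled out. Applying the Cauchy-Schwarz inequality in \(H^1(\varOmega )\) to the right-hand side of the last inequality gives

$$\begin{aligned} \Vert R_h(u) - u_h \Vert _{H^1(\varOmega )}^2 \le \left( R_h(u) - u_h, w_h \right) _{H^1(\varOmega )} \le \Vert R_h(u) - u_h \Vert _{H^1(\varOmega )} \Vert w_h \Vert _{H^1(\varOmega )}. \end{aligned}$$

Hence \(\Vert R_h(u) - u_h \Vert _{H^1(\varOmega )} \le \Vert w_h \Vert _{H^1(\varOmega )}\) (this is trivial if the left-hand side vanishes). Since \(w_h\) was an arbitrary element of the set on the right-hand side of (9), passing to the infimum yields (9).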

Note that the above arguments work for all elliptic variational inequalities with unilateral constraints (not just for the Signorini problem). To obtain a tangible a priori estimate for the norm \(\Vert R_h(u) - u_h\Vert _{H^1(\varOmega )}\), it remains to construct a function \(w_h \in V_h\) which satisfies \(R_h(u) - u \le w_h \le R_h(u)\) on \(\partial \varOmega \) and which has a small \(H^1\)-norm. This is accomplished in the following lemma by means of a unilateral approximation technique that has also been used in [11, 12, 30, 31, 42]:

Lemma 3.5

(Finite element approximation under constraints)  

  1.

    Suppose that \(v \in W^{2,p}(\varOmega )\), \(2< p < \infty \), is a function with a non-negative trace. Then, for every \(h>0\), we can find a \(w_h \in V_h\) with \(R_h(v) -v \le w_h \le R_h(v)\) on \(\partial \varOmega \) such that \(\Vert w_h\Vert _{H^1(\varOmega )} \le C h^{3/2 - 1/p}\) holds with a C independent of h.

  2.

    Suppose that \(v \in H^2(\varOmega )\) is a function with a non-negative trace that can be decomposed into two parts \(v_s\) and \(v_r\) which satisfy the conditions in point 2 of Theorem 2.3 for some \(4< p < \infty \). Then, for every \(\varepsilon \in (0, 1/2)\) and every \(h>0\), we can find a \(w_h \in V_h\) with \(R_h(v) -v \le w_h \le R_h(v)\) on \(\partial \varOmega \) such that \(\Vert w_h\Vert _{H^1(\varOmega )} \le C h^{3/2 - 2/p - \varepsilon }\) holds with a C independent of h.

Proof

We first introduce some notation: Suppose that \(h> 0\) is arbitrary but fixed. We denote the nodes of the triangulation \(\mathcal {T}_h\) which are contained in the boundary of the square \(\varOmega = (0,1)^2\) with \(x_i\), \(i=0,\ldots , N\), starting with \(x_0 := (0,0)\) and then proceeding counterclockwise. For convenience, we additionally set \(x_{-1} := x_N\), \(x_{N+1} := x_0\), and \(x_{N+2} := x_1\). Further, we define \(\sigma _i\) to be the closed line segment \([x_i, x_{i+1}]\), \(i=-1,\ldots , N+1\), and \(\tau _i\) to be the mesh cell whose boundary contains \(\sigma _i\). With C we again denote a generic constant which may change within an estimate but never depends on h. We now proceed in three steps:

Step 1 (h-Independent Morrey Inequality on the Mesh Cells) Consider the reference element \(T := {\text {conv}}\{ (0,0), (1, 0), (0, 1)\}\). Then, we know from the classical Morrey inequality, see [1, Theorem 5.4], that for every \(2< p < \infty \) there exists a constant \(C = C(p, T)\) with

$$\begin{aligned} \max _{x,y \in T} \frac{|v(x) - v(y)|}{|x - y|^{1 - 2/p}} \le C \Vert v \Vert _{W^{1,p}(T)}\quad \forall v \in W^{1,p}(T). \end{aligned}$$

From the inequality of Poincaré-Wirtinger, we obtain further that there exists another \(C = C(p, T) > 0\) such that, for all \( v \in W^{1,p}(T)\) with average value zero in T, we have

$$\begin{aligned} \Vert v\Vert _{L^p(T)} \le C \Vert \nabla v\Vert _{L^p(T)}. \end{aligned}$$

By combining the last two inequalities, we may deduce that

$$\begin{aligned} \max _{x,y \in T} \frac{|v(x) - v(y)|}{|x - y|^{1 - 2/p}} \le C \Vert \nabla v \Vert _{L^p(T)} \end{aligned}$$

holds for all \(v \in W^{1,p}(T)\) with average value zero on T. Since the seminorms appearing here do not detect constant functions, the last estimate also holds for all \(v \in W^{1,p}(T)\). Consider now an arbitrary but fixed \(\tau _i\), \(i \in \{-1,\ldots , N+1\}\), and denote with \(F_i : T \rightarrow \tau _i\), \(x \mapsto x_i + G_i x\), the affine linear function with \(\det (G_i) > 0\) which maps the reference element T to \(\tau _i\). Then, for every \(v \in W^{1,p}(\tau _i)\), we obtain

$$\begin{aligned} \begin{aligned} \max _{x,y \in T} \frac{|v(F_i(x)) - v(F_i(y))|}{ |G_i^{-1} (F_i(x) - F_i(y))|^{1 - 2/p}}&\le C(p,T) \left( \int _T \left| G_i^T (\nabla v)(F_i) \right| ^p \mathrm {d}\mathcal {L}^2 \right) ^{1/p} \\&\le C(p,T) \frac{|G_i|}{\det (G_i)^{1/p}} \left( \int _{\tau _i} \left| \nabla v \right| ^p \mathrm {d}\mathcal {L}^2 \right) ^{1/p}. \end{aligned} \end{aligned}$$

The above yields

$$\begin{aligned} \begin{aligned} \max _{x,y \in \tau _i} \frac{|v(x) - v(y)|}{ | x - y|^{1 - 2/p}}&\le C(p,T) \frac{| G_i^{-1}|^{1 - 2/p}|G_i| }{\det (G_i)^{1/p}} \left( \int _{\tau _i} \left| \nabla v \right| ^p \mathrm {d}\mathcal {L}^2 \right) ^{1/p}. \end{aligned} \end{aligned}$$

Due to the quasi-uniformity of the underlying family of meshes, we can find a constant C independent of h and i with

$$\begin{aligned} \frac{| G_i^{-1}|^{1 - 2/p}|G_i| }{\det (G_i)^{1/p}} \le C \frac{h^{-1 + 2/p} h }{h^{2/p}} = C. \end{aligned}$$

We may thus conclude that there exists a constant \(C>0\) independent of i and h with

$$\begin{aligned} \begin{aligned} \max _{x,y \in \tau _i} \frac{|v(x) - v(y)|}{ | x - y|^{1 - 2/p}}&\le C \Vert \nabla v\Vert _{L^p(\tau _i)}\quad \forall v \in W^{1,p}(\tau _i),\quad \forall i=-1,\ldots , N+1. \end{aligned} \end{aligned}$$
(10)

Step 2 (Proof in the \(W^{2,p}\)-Case) Suppose now that a function \(v \in W^{2,p}(\varOmega )\), \(2< p < \infty \), with a non-negative trace is given, let \(h>0\) be arbitrary but fixed, and consider the auxiliary problem

$$\begin{aligned} \begin{aligned}&\min \sum _{i=0}^N v_h(x_i),\quad \text {s.t. } v_h \in V_h,\quad R_h(v) - v \le v_h \le R_h(v) \text { on } \partial \varOmega , \\&v_h = 0 \text { at every interior node of the mesh } \mathcal {T}_h, \end{aligned} \end{aligned}$$
(11)

where we use the \(C({\text {cl}}(\varOmega ))\)-representatives of v and \(R_h(v)\). Since (11) is a finite-dimensional minimization problem with a non-empty compact admissible set and a continuous objective functional, it admits at least one solution \({\tilde{w}}_h \in V_h\). Consider now an arbitrary but fixed \(x_i\) with \(0 \le i \le N\). Then, the fact that \({\tilde{w}}_h\) solves (11) implies that we cannot reduce the function value \({\tilde{w}}_h(x_i)\) (while leaving the other nodal values unchanged) without violating the constraint \(R_h(v) - v \le {\tilde{w}}_h\) on \(\partial \varOmega \). This implies that one of the following three cases has to hold true (as one may easily check by contradiction, cf. Figure 1 and the analysis in [11]):

  1.

    It holds \({\tilde{w}}_h(x_i) = R_h(v)(x_i) - v(x_i)\).

  2.

    There exists an \(a \in [x_{i-1}, x_i)\) with

    $$\begin{aligned} \begin{aligned}&{\tilde{w}}_h(a) = R_h(v)(a) - v(a), \\&\quad (\nabla {\tilde{w}}_h)(a) \cdot (x_i - a) = \nabla (R_h(v) - v)(a) \cdot (x_i - a). \end{aligned} \end{aligned}$$
    (12)
  3.

    There exists an \(a \in (x_{i}, x_{i+1}]\) satisfying (12).

Here, \([x_{i-1}, x_i)\) and \((x_i, x_{i+1}]\) denote the relatively half-open straight line segments between \(x_{i-1}\) and \(x_i\), and \(x_i\) and \(x_{i+1}\), respectively, and with \(\nabla R_h(v) (a)\) and \(\nabla {\tilde{w}}_h(a)\) we mean the gradient of the respective finite element function on the mesh cell \(\tau _{i-1}\) in case 2 and on the mesh cell \(\tau _i\) in case 3. Recall in this context that \(W^{2,p}(\varOmega ) \hookrightarrow C^{1, 1 - 2/p}({\text {cl}}(\varOmega ))\) for \(2< p < \infty \).

Fig. 1 Prototypical situation on the boundary mesh. The nodes \(x_{i-1}\), \(x_{i+1}\) are covered by case 1, the nodes \(x_{i-2}\), \(x_{i+2}\), \(x_{i+3}\) are covered by case 2, and the nodes \(x_{i-3}\), \(x_{i}\), \(x_{i+1}\), \(x_{i+2}\) are covered by case 3. For \(x_{i}\), the point a is identical to the mesh node \(x_{i+1}\).

Note that, in case 1, we trivially have \({\tilde{w}}_h(x_i) - R_h(v)(x_i) + v(x_i) = 0\). In the second case, we may apply Taylor’s formula in the direction of the line segment \([x_{i-1}, x_i)\) to compute that

$$\begin{aligned} \begin{aligned} 0&\le {\tilde{w}}_h(x_i) - R_h(v)(x_i) + v(x_i) \\&= \int _0^1 \nabla ( {\tilde{w}}_h - R_h(v) + v )(a + t (x_i - a)) \cdot (x_i - a)\mathrm {d}t \\&= \int _0^1 \Big ( \nabla v(a + t (x_i - a)) - \nabla v (a) \Big ) \cdot (x_i - a)\mathrm {d}t \\&\le C h^{2 - 2/p} \int _0^1 \frac{|\nabla v(a + t (x_i - a)) - \nabla v (a)|}{t^{1 - 2/p} |x_i - a|^{1 - 2/p}} t^{1 - 2/p} \mathrm {d}t \\&\le C h^{2 - 2/p} \Vert v\Vert _{W^{2,p}(\tau _{i-1})} \end{aligned} \end{aligned}$$
(13)

with some constant C independent of i and h. Here, we have used the properties of a, the regularity of v, the affine linearity of \( {\tilde{w}}_h\) and \( R_h(v)\) on \([x_{i-1}, x_i)\), and (10). Completely analogously, we obtain in the third case that

$$\begin{aligned} \begin{aligned} 0 \le {\tilde{w}}_h(x_i) - R_h(v)(x_i) + v(x_i) \le C h^{2 - 2/p} \Vert v\Vert _{W^{2,p}(\tau _{i})}. \end{aligned} \end{aligned}$$

We have now proved that

$$\begin{aligned} 0 \le {\tilde{w}}_h(x_i) - R_h(v)(x_i) + I_h(v)(x_i) \le C h^{2 - 2/p} \Big ( \Vert v\Vert _{W^{2,p}(\tau _{i-1})}^p + \Vert v\Vert _{W^{2,p}(\tau _{i})}^p\Big )^{1/p} \end{aligned}$$

holds for all \(i=0,\ldots , N\), where \(I_h\) denotes the Lagrange interpolation operator. Consider now an arbitrary but fixed \(i \in \{0,\ldots , N\}\). Then, it follows

$$\begin{aligned} \begin{aligned}&\int _{\sigma _i} | {\tilde{w}}_h - R_h(v) + I_h(v) |^p \mathrm {d}\mathcal {H}^1 \\&\quad \le |x_i - x_{i+1}| \Vert {\tilde{w}}_h - R_h(v) + I_h(v) \Vert _{L^\infty (\sigma _{i})}^p \\&\quad \le C h^{2p - 1} \Big ( \Vert v\Vert _{W^{2,p}(\tau _{i - 1})}^p + \Vert v\Vert _{W^{2,p}(\tau _i)}^p + \Vert v\Vert _{W^{2,p}(\tau _{i + 1})}^p \Big ) \end{aligned} \end{aligned}$$

and we may deduce by summation that

$$\begin{aligned} \Vert {\tilde{w}}_h \Vert _{L^p(\partial \varOmega )}\le C h^{2 - 1/p} \Vert v\Vert _{W^{2,p}(\varOmega )} + \Vert v - R_h(v)\Vert _{L^p(\partial \varOmega )} + \Vert v - I_h(v)\Vert _{L^p(\partial \varOmega )}. \end{aligned}$$

Using the inverse estimate in [27, Equation (3.1)], again [19, Theorem 1.5.1.10] (with parameter \(h^{p}\)), Lemma 3.2, and standard error estimates for the Lagrange interpolation operator as found in [7, Theorem 4.4.20], we now obtain

$$\begin{aligned} \begin{aligned}&\Vert {\tilde{w}}_h\Vert _{H^{1/2}(\partial \varOmega )} \\&\quad \le C h^{-1/2}\Vert {\tilde{w}}_h\Vert _{L^2(\partial \varOmega )} \\&\quad \le C h^{-1/2} \Vert {\tilde{w}}_h\Vert _{L^p(\partial \varOmega )} \\&\quad \le C h^{3/2 - 1/p} \Vert v\Vert _{W^{2,p}(\varOmega )} + C h^{-1/2} \left( \Vert v - R_h(v)\Vert _{L^p(\partial \varOmega )} + \Vert v - I_h(v)\Vert _{L^p(\partial \varOmega )} \right) \\&\quad \le C h^{3/2 - 1/p} \Vert v\Vert _{W^{2,p}(\varOmega )} \\&\qquad + C h^{-1/2 - 1/p} \left( h \Vert \nabla v - \nabla I_h(v)\Vert _{L^p( \varOmega )} + \Vert v - I_h(v)\Vert _{L^p( \varOmega )} \right) \\&\quad \le C h^{3/2 - 1/p} \Vert v\Vert _{W^{2,p}(\varOmega )} \end{aligned} \end{aligned}$$

with some constant \(C>0\) which may change from step to step but is always independent of h. To construct a function \(w_h \in V_h\) with the desired properties, it remains to extend \({\text {tr}}({\tilde{w}}_h)\) suitably to a function in \(V_h\). This can be accomplished, e.g., by employing the discrete harmonic extension operator \(E_h : {\text {tr}}(V_h) \rightarrow V_h\), which, according to [27, Lemma 3.2], satisfies

$$\begin{aligned} \Vert E_h(v_h)\Vert _{H^1(\varOmega )} \le C \Vert v_h\Vert _{H^{1/2}(\partial \varOmega )}\quad \forall v_h \in {\text {tr}}(V_h) \end{aligned}$$

for some constant \(C>0\) independent of h and \(v_h\).
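The conclusion of Step 2 can be summarized in one line. The following is a sketch under the assumption that \(E_h\) reproduces its boundary data, i.e., that \({\text {tr}}(E_h(v_h)) = v_h\) for all \(v_h \in {\text {tr}}(V_h)\), as is customary for extension operators: setting

$$\begin{aligned} w_h := E_h({\text {tr}}({\tilde{w}}_h)) \in V_h, \qquad {\text {tr}}(w_h) = {\text {tr}}({\tilde{w}}_h), \qquad \Vert w_h\Vert _{H^1(\varOmega )} \le C \Vert {\tilde{w}}_h\Vert _{H^{1/2}(\partial \varOmega )} \le C h^{3/2 - 1/p} \Vert v\Vert _{W^{2,p}(\varOmega )}, \end{aligned}$$

we see that \(w_h\) inherits the constraint \(R_h(v) - v \le w_h \le R_h(v)\) on \(\partial \varOmega \) from \({\tilde{w}}_h\), since this constraint only involves boundary values, and that \(w_h\) satisfies the bound claimed in the first part of the lemma.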

Step 3 (Proof in the \(v_s\)-\(v_r\)-Case) For a function v with a non-negative trace that can be decomposed into two parts \(v_s\) and \(v_r\) which satisfy the conditions in point 2 of Theorem 2.3 for some \(4< p < \infty \), we can proceed completely analogously to Step 2 to construct a function \({\tilde{w}}_h \in V_h\) with \(R_h(v) - v \le {\tilde{w}}_h \le R_h(v)\) on \(\partial \varOmega \) which satisfies either \({\tilde{w}}_h(x_i) - R_h(v)(x_i) + v(x_i) = 0\) or one of the cases 2 and 3 at each node \(x_i\), \(i=0,\ldots ,N\). Let us again consider case 2, fix an \(\varepsilon \in (0,1/2)\), write \(q:=2/(1 + 2 \varepsilon ) \in (1,2)\), and assume w.l.o.g. that the line segment \(\sigma _{i-1}\) is contained in \(\mathbb {R}\times \{0\}\) so that \(a = ({\bar{a}}, 0)\), \(x_{i-1} = ({\bar{x}}_{i-1} , 0)\), and \(x_{i} = ({\bar{x}}_{i}, 0)\) with \({\bar{a}}, {\bar{x}}_{i-1}, {\bar{x}}_i \in \mathbb {R}\). Then, we may use the same calculation as in (13), the regularity properties of \(v_s\) and \(v_r\), and Morrey’s inequality to obtain

$$\begin{aligned} \begin{aligned} 0&\le {\tilde{w}}_h(x_i) - R_h(v)(x_i) + v(x_i) \\&= \int _0^1 \Big ( \nabla v(a + t (x_i - a)) - \nabla v (a) \Big ) \cdot (x_i - a)\mathrm {d}t \\&= \int _0^1 \Big ( \partial _1 v ({\bar{a}} + t ({\bar{x}}_i - {\bar{a}}), 0) - \partial _1 v ( {\bar{a}}, 0) \Big ) ({\bar{x}}_i - {\bar{a}}) \mathrm {d}t \\&\le \int _{{\bar{a}}}^{{\bar{x}}_{i}} \Big ( \partial _1 v_s (t, 0) - \partial _1 v_s ( {\bar{a}}, 0) \Big ) \mathrm {d}t + C h^{2 - 2/p} \Vert v_r\Vert _{C^{1,1-2/p}(\varOmega )} \\&\le \int _{{\bar{a}}}^{{\bar{x}}_{i}} \left| \partial _1^2 v_s (t, 0)(t - {\bar{x}}_{i}) \right| \mathrm {d}t + C h^{2 - 2/p} \Vert v_r\Vert _{C^{1,1-2/p}(\varOmega )} \\&\le \left( \int _{{\bar{a}}}^{{\bar{x}}_{i}} |t - {\bar{x}}_{i}|^{\frac{q}{q-1}} \mathrm {d}t\right) ^{\frac{q-1}{q}} \Vert {\text {tr}}( v_s)\Vert _{W^{2, q}(\sigma _{i-1})}+ C h^{2 - 2/p} \Vert v_r\Vert _{C^{1,1-2/p}(\varOmega )} \\&\le C h^{2 - 1/q} \Vert {\text {tr}}( v_s)\Vert _{W^{2, q}(\sigma _{i-1})}+ C h^{2 - 2/p} \Vert v_r\Vert _{C^{1,1-2/p}(\varOmega )}. \end{aligned} \end{aligned}$$
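Two steps in the above chain deserve a brief comment. As a sketch, write \(g(t) := \partial _1 v_s(t, 0)\) (a shorthand introduced only for this remark) and assume that g is absolutely continuous on \([{\bar{a}}, {\bar{x}}_i]\) with \(g' = \partial _1^2 v_s(\cdot , 0) \in L^q({\bar{a}}, {\bar{x}}_i)\), which is what the appearance of \(\Vert {\text {tr}}(v_s)\Vert _{W^{2,q}(\sigma _{i-1})}\) suggests. Integration by parts then gives

$$\begin{aligned} \int _{{\bar{a}}}^{{\bar{x}}_{i}} \big ( g(t) - g({\bar{a}}) \big ) \mathrm {d}t = \Big [ \big ( g(t) - g({\bar{a}}) \big ) (t - {\bar{x}}_i) \Big ]_{t = {\bar{a}}}^{t = {\bar{x}}_i} - \int _{{\bar{a}}}^{{\bar{x}}_{i}} g'(t) (t - {\bar{x}}_i) \mathrm {d}t = - \int _{{\bar{a}}}^{{\bar{x}}_{i}} g'(t) (t - {\bar{x}}_i) \mathrm {d}t, \end{aligned}$$

where the boundary terms vanish because the first factor is zero at \(t = {\bar{a}}\) and the second factor is zero at \(t = {\bar{x}}_i\). The factor produced by Hölder's inequality, with the integrand understood as \(|t - {\bar{x}}_i|^{q/(q-1)}\), can be computed explicitly:

$$\begin{aligned} \left( \int _{{\bar{a}}}^{{\bar{x}}_{i}} |t - {\bar{x}}_{i}|^{\frac{q}{q-1}} \mathrm {d}t\right) ^{\frac{q-1}{q}} = \left( \frac{q-1}{2q-1} \right) ^{\frac{q-1}{q}} |{\bar{x}}_i - {\bar{a}}|^{2 - 1/q} \le C h^{2 - 1/q}. \end{aligned}$$

Here, we have used that \(|{\bar{x}}_i - {\bar{a}}| \le |x_i - x_{i-1}| \le C h\).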

If we use exactly the same strategy in case 3, then it follows that \({\tilde{w}}_h\) satisfies

$$\begin{aligned} \begin{aligned} 0&\le {\tilde{w}}_h(x_i) - R_h(v)(x_i) + I_h(v)(x_i) \\&\le C h^{2 - 1/q} \left( \Vert {\text {tr}}( v_s)\Vert _{W^{2, q}(\sigma _{i-1})}^q + \Vert {\text {tr}}( v_s)\Vert _{W^{2, q}(\sigma _{i})}^q \right) ^{1/q} \\&\quad + C h^{2 - 2/p} \Vert v_r\Vert _{C^{1,1-2/p}(\varOmega )} \end{aligned} \end{aligned}$$

for all \(i=0,\ldots , N\), where \(I_h\) again denotes the Lagrange interpolation operator. By integration, we may now again deduce (using the estimate \((a + b)^q \le C( a^q + b^q)\))

$$\begin{aligned} \begin{aligned}&\int _{\sigma _i} | {\tilde{w}}_h - R_h(v) + I_h(v) |^q \mathrm {d}\mathcal {H}^1 \\&\quad \le C h^{2 q } \left( \Vert {\text {tr}}( v_s)\Vert _{W^{2, q}(\sigma _{i-1})}^q + \Vert {\text {tr}}( v_s)\Vert _{W^{2, q}(\sigma _{i})}^q + \Vert {\text {tr}}( v_s)\Vert _{W^{2, q}(\sigma _{i+1})}^q \right) \\&\qquad + C h^{2q + 1 - 2q/p} \Vert v_r\Vert _{C^{1,1-2/p}(\varOmega )}^q \end{aligned} \end{aligned}$$

and, by summation,

$$\begin{aligned} \begin{aligned}&\Vert {\tilde{w}}_h - R_h(v) + I_h(v) \Vert _{L^q(\partial \varOmega )} \\&\quad \le C h^{2 - 2/p} \left( h^{2q/p} \Vert {\text {tr}}( v_s)\Vert _{W^{2, q}(\partial \varOmega )}^q + \Vert v_r\Vert _{C^{1,1-2/p}(\varOmega )}^q \right) ^{1/q}. \end{aligned} \end{aligned}$$

Combining the above with Lemma 3.2, inverse estimates (cf. [27, Equation (3.1)]), the definition of q, and standard results for the Lagrange interpolant yields

$$\begin{aligned} \begin{aligned} \Vert {\tilde{w}}_h\Vert _{H^{1/2}(\partial \varOmega )}&\le \Vert {\tilde{w}}_h - R_h(v) + I_h(v) \Vert _{H^{1/2}(\partial \varOmega )} + \Vert I_h(v_r) - R_h(v_r)\Vert _{H^{1/2}(\partial \varOmega )} \\&\quad \ \ + \Vert I_h(v_s) - R_h(v_s)\Vert _{H^{1/2}(\partial \varOmega )} \\&\le C h^{-1/q}\Vert {\tilde{w}}_h - R_h(v) + I_h(v) \Vert _{L^q(\partial \varOmega )} + C h^{3/2 - 1/p} + C h^{3/2 - \varepsilon } \\&\le C h^{2 -1/q - 2/p} + C h^{3/2 - 1/p} + C h^{3/2 - \varepsilon } \\&\le C h^{3/2- 2/p -\varepsilon } + C h^{3/2 - 1/p} + C h^{3/2 - \varepsilon } \end{aligned} \end{aligned}$$

with a constant C which may depend on \(\varepsilon \) but is independent of h. The claim now follows completely analogously to Step 2.\(\square \)
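The exponent bookkeeping at the end of Step 3 can be checked directly. From \(q = 2/(1 + 2\varepsilon )\), we obtain \(1/q = 1/2 + \varepsilon \) and thus

$$\begin{aligned} 2 - \frac{1}{q} - \frac{2}{p} = \frac{3}{2} - \frac{2}{p} - \varepsilon , \qquad h^{3/2 - 1/p} \le h^{3/2 - 2/p - \varepsilon }, \qquad h^{3/2 - \varepsilon } \le h^{3/2 - 2/p - \varepsilon } \qquad \text {for } 0 < h \le 1. \end{aligned}$$

Under the (harmless) assumption \(h \le 1\), the first of the three terms in the last estimate therefore dominates, and we arrive at \(\Vert {\tilde{w}}_h\Vert _{H^{1/2}(\partial \varOmega )} \le C h^{3/2 - 2/p - \varepsilon }\).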

By combining Lemmas 3.4 and 3.5, we now arrive at the following main result of this section:

Theorem 3.6

(Supercloseness) Suppose that u solves (S) for some \(f \in L^2(\varOmega )\). Then, the following holds true for the Ritz projection \(R_h(u)\) and the finite element solution \(u_h\) of (S\(_h\)):

  1.

    If u satisfies \(u \in W^{2,p}(\varOmega )\) for some \(2< p < 4\), then there exists a constant \(C>0\) independent of h with

    $$\begin{aligned} \Vert R_h(u) - u_h \Vert _{H^1(\varOmega )} \le C h^{3/2 - 1/p}. \end{aligned}$$
    (14)
  2.

    If u admits a decomposition \(u = u_s + u_r\) as in point 2 of Theorem 2.3 for some \(4< p < \infty \), then, for every \(\varepsilon \in (0, 1/2)\), there exists a constant \(C>0\) independent of h with

    $$\begin{aligned} \Vert R_h(u) - u_h \Vert _{H^1(\varOmega )} \le C h^{3/2 - 2/p - \varepsilon }. \end{aligned}$$
    (15)

Note that Theorem 3.6 indeed shows that the Ritz projection \(R_h(u)\) of the exact solution u of (S) is superclose to the finite element approximation \(u_h\) characterized by (S\(_h\)) as, for the considered ranges of p (and, in the case of (15), for all sufficiently small \(\varepsilon \)), (14) and (15) yield estimates of the form \(\Vert R_h(u) - u_h \Vert _{H^1(\varOmega )} = \mathcal {O}(h^\gamma )\) with an exponent \(\gamma \) that is strictly greater than one. The \(H^1\)-error between \(R_h(u)\) and \(u_h\) thus decays faster than that between u and \(u_h\) which, in the considered situation, can be expected to be at most of order \(\mathcal {O}(h)\). We would like to point out that the behavior that we observe here agrees very well with intuition. Since the additional constraint in (S\(_h\)) is only present on the boundary, it is only natural that the finite element solution \(u_h\) is very close to the Ritz projection \(R_h(u)\) which may be interpreted as the solution of an associated unconstrained problem.
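As a purely illustrative check of the exponents in (14) and (15) (the sample values of p and \(\varepsilon \) below are our own choices and are not taken from the analysis), we note that

$$\begin{aligned} \begin{aligned} \gamma = \frac{3}{2} - \frac{1}{p}> 1&\iff p > 2,&\quad&\text {e.g., } p = 3: \ \gamma = \frac{7}{6}, \\ \gamma = \frac{3}{2} - \frac{2}{p} - \varepsilon > 1&\iff \varepsilon < \frac{1}{2} - \frac{2}{p},&\quad&\text {e.g., } p = 8,\ \varepsilon = \frac{1}{8}: \ \gamma = \frac{9}{8}. \end{aligned} \end{aligned}$$

In particular, in the situation of (15), the supercloseness property requires \(p > 4\) and a sufficiently small \(\varepsilon \).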

4 Consequences for \(W^{1,p}\)-, \(L^p\)-, and \(H^{1/2}\)-error estimates

As the behavior of the error \(R_h(u) - u\) is known by Lemma 3.2, Theorem 3.6 gives rise to estimates for the quantity \(u - u_h\) in a straightforward manner. The results that are obtained along these lines are collected in the following two corollaries:

Corollary 4.1

Suppose that u solves (S) for some \(f \in L^2(\varOmega )\). Assume further that there exists a \(2< p < 4\) with \(u \in W^{2,p}(\varOmega )\). Then, for every \(1< q < \infty \), there exists a constant \(C>0\) independent of h with

$$\begin{aligned} \begin{aligned} \begin{array}{c} \begin{aligned} \begin{array}{ll} \Vert u - u_h\Vert _{W^{1,\frac{4p}{p+2}}(\varOmega )} \le C h, &{}\quad \Vert u - u_h\Vert _{W^{1,\infty }(\varOmega )} \le C h^{1/2 - 1/p}, \\ \Vert u - u_h\Vert _{L^q(\varOmega )} \le C h^{3/2 - 1/p}, &{}\quad \Vert u - u_h\Vert _{L^\infty (\varOmega )} \le C |\ln (h)|^{1/2}h^{3/2 - 1/p}, \end{array} \end{aligned} \\ \Vert u - u_h\Vert _{H^{1/2}(\partial \varOmega )} \le C h^{3/2 - 1/p}. \end{array} \end{aligned} \end{aligned}$$

Proof

From Theorem 3.6, the inverse estimates in [7, Theorem 4.5.11], Lemma 3.2, and the triangle inequality, it follows that

$$\begin{aligned} \begin{aligned} \Vert u - u_h\Vert _{W^{1,\frac{4p}{p+2}}(\varOmega )}&\le \Vert R_h(u) - u_h\Vert _{W^{1,\frac{4p}{p+2}}(\varOmega )} + \Vert u - R_h(u)\Vert _{W^{1,\frac{4p}{p+2}}(\varOmega )} \\&\le C h^{ \frac{p+2}{2p} - 1}\Vert R_h(u) - u_h\Vert _{H^1(\varOmega )} + C \Vert u - R_h(u)\Vert _{W^{1,p}(\varOmega )} \\&\le C h^{\frac{p+2}{2p} - 1 + 3/2 - 1/p} + C h \\&\le C h. \end{aligned} \end{aligned}$$

This proves the first estimate. Similarly, we may compute (using standard error estimates for the Lagrange interpolant \(I_h(u)\), see [7, Theorem 4.4.20], Sobolev embeddings, again [7, Theorem 4.5.11], and Lemma 3.2) that

$$\begin{aligned} \begin{aligned}&\Vert u - u_h\Vert _{W^{1,\infty }(\varOmega )} \\&\quad \le \Vert R_h(u) - u_h\Vert _{W^{1,\infty }(\varOmega )} + \Vert I_h(u) - R_h(u)\Vert _{W^{1,\infty }(\varOmega )} + \Vert u - I_h(u)\Vert _{W^{1,\infty }(\varOmega )} \\&\quad \le C h^{- 1}\Vert R_h(u) - u_h\Vert _{H^1(\varOmega )} + C h^{- 2/p} \Vert I_h(u) - R_h(u)\Vert _{W^{1,p}(\varOmega )} + Ch^{1- 2/p} \\&\quad \le C h^{1/2 - 1/p} + C h^{1 - 2/p} + Ch^{1- 2/p} \le C h^{1/2 - 1/p} \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\Vert u - u_h\Vert _{L^q(\varOmega )} \\&\quad \le \Vert R_h(u) - u_h\Vert _{L^q(\varOmega )} + C\Vert I_h(u) - R_h(u)\Vert _{L^\infty (\varOmega )} + C\Vert u - I_h(u)\Vert _{L^\infty (\varOmega )} \\&\quad \le C \Vert R_h(u) - u_h\Vert _{H^1(\varOmega )} + C h^{-2/p}\Vert I_h(u) - R_h(u)\Vert _{L^p(\varOmega )} + C h^{2 - 2/p} \\&\quad \le C h^{3/2 - 1/p} + C h^{2 - 2/p} \le C h^{3/2 - 1/p} \end{aligned} \end{aligned}$$

holds for all \(1< q < \infty \). Further, we may use the discrete Sobolev inequality in [7, Lemma 4.9.2] and exactly the same arguments as above to obtain

$$\begin{aligned} \begin{aligned}&\Vert u - u_h\Vert _{L^\infty (\varOmega )} \\&\quad \le \Vert R_h(u) - u_h\Vert _{L^\infty (\varOmega )} + \Vert I_h(u) - R_h(u)\Vert _{L^\infty (\varOmega )} + \Vert u - I_h(u)\Vert _{L^\infty (\varOmega )} \\&\quad \le C |\ln (h)|^{1/2} \Vert R_h(u) - u_h\Vert _{H^1(\varOmega )} + C h^{2 - 2/p} \\&\quad \le C |\ln (h)|^{1/2} h^{3/2 - 1/p} . \end{aligned} \end{aligned}$$

It remains to prove the \(H^{1/2}(\partial \varOmega )\)-error estimate. To this end, we define \(\psi _h\) to be the unique element of \(V_h\), which is one at every boundary node and zero at every interior node, and \({\text {supp}}(\psi _h)\) to be the support of \(\psi _h\). We may now use the classical trace theorem and Hölder’s inequality to infer that

$$\begin{aligned} \begin{aligned}&\Vert u - I_h(u)\Vert _{H^{1/2}(\partial \varOmega )}^2 \\&\quad \le C \Vert \psi _h (u - I_h(u))\Vert _{H^{1}(\varOmega )}^2 \\&\quad \le C \left( \Vert \nabla \psi _h\Vert _{L^\infty (\varOmega )}^2 \Vert u - I_h(u)\Vert _{L^2({\text {supp}}(\psi _h))}^2 + \Vert \nabla (u - I_h(u))\Vert _{L^2({\text {supp}}(\psi _h))}^2\right) \\&\quad \le C h^2 |u|_{H^2({\text {supp}}(\psi _h))}^2 \\&\quad \le C h^2 |u|_{W^{2,p}({\text {supp}}(\psi _h))}^{2} \mathcal {L}^2({\text {supp}}(\psi _h))^{1 - 2/p} \\&\quad \le C h^{3 - 2/p}. \end{aligned} \end{aligned}$$
(16)

Now we may proceed as before (using Lemma 3.2, [27, Equation (3.1)], the trace theorem, and again [19, Theorem 1.5.1.10] with parameter \(\varepsilon := h^p\)) to obtain

$$\begin{aligned} \begin{aligned}&\Vert u - u_h\Vert _{H^{1/2}(\partial \varOmega )}\\&\quad \le \Vert R_h(u) - u_h\Vert _{H^{1/2}(\partial \varOmega )} + \Vert R_h(u) - I_h(u)\Vert _{H^{1/2}(\partial \varOmega )} + \Vert I_h(u) - u\Vert _{H^{1/2}(\partial \varOmega )}\\&\quad \le C\Vert R_h(u) - u_h\Vert _{H^{1}(\varOmega )} + C h^{-1/2} \Vert R_h(u) - I_h(u)\Vert _{L^p(\partial \varOmega )} + C h^{3/2 - 1/p}\\&\quad \le C h^{3/2 - 1/p}. \end{aligned} \end{aligned}$$

This proves the claim.\(\square \)
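The powers of h produced by the inverse estimates in the above proof can be verified as follows. As a sketch, we assume that, on the quasi-uniform meshes considered here and in two dimensions, the estimate in [7, Theorem 4.5.11] takes the familiar form \(\Vert v_h\Vert _{W^{1,r}(\varOmega )} \le C h^{2/r - 2/s} \Vert v_h\Vert _{W^{1,s}(\varOmega )}\) for all \(v_h \in V_h\) and all \(1 \le s \le r \le \infty \) (with \(2/\infty := 0\)). Then,

$$\begin{aligned} \begin{aligned} r = \frac{4p}{p+2},\ s = 2:&\quad \frac{2}{r} - 1 = \frac{p+2}{2p} - 1 = \frac{1}{p} - \frac{1}{2}, \qquad \left( \frac{1}{p} - \frac{1}{2} \right) + \left( \frac{3}{2} - \frac{1}{p} \right) = 1, \\ r = \infty ,\ s = 2:&\quad -1 + \left( \frac{3}{2} - \frac{1}{p} \right) = \frac{1}{2} - \frac{1}{p}, \\ r = \infty ,\ s = p:&\quad -\frac{2}{p} + 1 = 1 - \frac{2}{p} \ge \frac{1}{2} - \frac{1}{p} \quad \text {for } p \ge 2. \end{aligned} \end{aligned}$$

The first line explains the order \(\mathcal {O}(h)\) in the \(W^{1,4p/(p+2)}\)-estimate, and the last two lines show that the term \(C h^{1/2 - 1/p}\) dominates in the \(W^{1,\infty }\)-estimate for \(0 < h \le 1\).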

Corollary 4.2

Suppose that u solves (S) for some \(f \in L^2(\varOmega )\). Assume further that u admits a decomposition \(u = u_s + u_r\) as in point 2 of Theorem 2.3 for some \(4< p < \infty \). Then, for all \(\varepsilon \in (0, 1/2)\), there exists a constant \(C>0\) independent of h with

$$\begin{aligned} \begin{aligned} \begin{array}{ll} \Vert u - u_h\Vert _{W^{1,\frac{4p}{p+4}}(\varOmega )} \le C h^{1 - \varepsilon }, &{}\quad \Vert u - u_h\Vert _{W^{1,\infty }(\varOmega )} \le C h^{1/2 - 2/p - \varepsilon }, \\ \Vert u - u_h\Vert _{L^\infty (\varOmega )} \le C h^{3/2 - 2/p - \varepsilon }, &{}\quad \Vert u - u_h\Vert _{H^{1/2}(\partial \varOmega )} \le C h^{3/2 - 2/p - \varepsilon }. \end{array} \end{aligned} \end{aligned}$$

Proof

The proof is completely along the lines of that of the last corollary and only requires some minor modifications. We include it for the sake of completeness: Note that the regularity properties of \(u_s\) and \(u_r\) imply that \(u \in W^{2,q}(\varOmega )\) holds for all \(q \in (2,4)\), cf. [17, Theorem 6.5]. Consider now an arbitrary but fixed \(\varepsilon \in (0, 1/2)\). Then, we may invoke Theorem 3.6 and compute (using the same ideas as before)

$$\begin{aligned} \begin{aligned} \Vert u - u_h\Vert _{W^{1,\frac{4p}{p+4}}(\varOmega )}&\le \Vert R_h(u) - u_h\Vert _{W^{1,\frac{4p}{p+4}}(\varOmega )} + \Vert u - R_h(u)\Vert _{W^{1,\frac{4p}{p+4}}(\varOmega )} \\&\le C h^{ \frac{p+4}{2p} - 1}\Vert R_h(u) - u_h\Vert _{H^1(\varOmega )} + C h \\&\le C h^{\frac{p+4}{2p} - 1 + 3/2 - 2/p - \varepsilon } + C h \le C h^{1 - \varepsilon } \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\Vert u - u_h\Vert _{W^{1,\infty }(\varOmega )} \\&\quad \le \Vert R_h(u) - u_h\Vert _{W^{1,\infty }(\varOmega )} + \Vert I_h(u) - R_h(u)\Vert _{W^{1,\infty }(\varOmega )} + \Vert u - I_h(u)\Vert _{W^{1,\infty }(\varOmega )} \\&\quad \le C h^{- 1}\Vert R_h(u) - u_h\Vert _{H^1(\varOmega )} + C h^{- \frac{p+4}{2p}} \Vert I_h(u) - R_h(u)\Vert _{W^{1, \frac{4p}{p+4}}(\varOmega )} + Ch^{1- \frac{p+4}{2p}} \\&\quad \le C h^{1/2 - 2/p - \varepsilon } + Ch^{1- \frac{p+4}{2p}} + Ch^{1- \frac{p+4}{2p}} \le C h^{1/2 - 2/p - \varepsilon }. \end{aligned} \end{aligned}$$

Analogously (by [7, Lemma 4.9.2]),

$$\begin{aligned} \begin{aligned}&\Vert u - u_h\Vert _{L^\infty (\varOmega )} \\&\quad \le \Vert R_h(u) - u_h\Vert _{L^\infty (\varOmega )} + \Vert I_h(u) - R_h(u)\Vert _{L^\infty (\varOmega )} + \Vert u - I_h(u)\Vert _{L^\infty (\varOmega )} \\&\quad \le C |\ln (h)|^{1/2} \Vert R_h(u) {-} u_h\Vert _{H^1(\varOmega )} {+} C h^{-\frac{p+4}{2p}}\Vert I_h(u) {-} R_h(u)\Vert _{L^{\frac{4p}{p+4}}(\varOmega )} {+} C h^{2 -\frac{p+4}{2p}} \\&\quad \le C |\ln (h)|^{1/2} h^{3/2 - 2/p - \varepsilon } + C h^{2 -\frac{p+4}{2p}} \\&\quad \le C h^{3/2 - 2/p - 2\varepsilon }. \end{aligned} \end{aligned}$$

(Note that the coefficient of \(\varepsilon \) in the exponent is completely unimportant here since we may always redefine \(\varepsilon \).) Finally, we may compute (using the trace theorem, Lemma 3.2, and again (16))

$$\begin{aligned} \begin{aligned}&\Vert u - u_h\Vert _{H^{1/2}(\partial \varOmega )} \\&\quad \le \Vert R_h(u) - u_h\Vert _{H^{1/2}(\partial \varOmega )} + \Vert R_h(u) - u\Vert _{H^{1/2}(\partial \varOmega )} \\&\quad \le C \Vert R_h(u) - u_h\Vert _{H^{1 }(\varOmega )} + \Vert R_h(u_s) - u_s\Vert _{H^{1/2}(\partial \varOmega )} \\&\qquad + \Vert R_h(u_r) - I_h(u_r)\Vert _{H^{1/2}(\partial \varOmega )} + \Vert u_r - I_h(u_r)\Vert _{H^{1/2}(\partial \varOmega )} \\&\quad \le C h^{3/2 - 2/p - \varepsilon } + Ch^{3/2 - \varepsilon } + Ch^{-1/2} \Vert R_h(u_r) - I_h(u_r)\Vert _{L^p(\partial \varOmega )} + C h^{3/2 - 1/p} \\&\quad \le C h^{3/2 - 2/p - \varepsilon }. \end{aligned} \end{aligned}$$

This completes the proof.\(\square \)
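The exponents in the first three estimates of the above proof can be checked by elementary arithmetic. The following sketch assumes that the inverse estimates enter in their standard two-dimensional form \(\Vert v_h\Vert _{W^{1,r}(\varOmega )} \le C h^{2/r - 1}\Vert v_h\Vert _{H^1(\varOmega )}\) for \(v_h \in V_h\) and \(2 \le r \le \infty \), and it abbreviates \(q := 4p/(p+4)\) (a symbol that is introduced here only for illustration):

$$\begin{aligned} \begin{aligned} \frac{2}{q}&= \frac{p+4}{2p} = \frac{1}{2} + \frac{2}{p}, \\ \Big ( \frac{2}{q} - 1 \Big ) + \Big ( \frac{3}{2} - \frac{2}{p} - \varepsilon \Big )&= 1 - \varepsilon , \\ -1 + \Big ( \frac{3}{2} - \frac{2}{p} - \varepsilon \Big )&= \frac{1}{2} - \frac{2}{p} - \varepsilon , \\ 1 - \frac{2}{q}&= \frac{1}{2} - \frac{2}{p}, \qquad 2 - \frac{2}{q} = \frac{3}{2} - \frac{2}{p}. \end{aligned} \end{aligned}$$

Since \(p > 4\) implies \(2< q < 4\), the exponent q lies in the range for which \(u \in W^{2,q}(\varOmega )\) is available. In each of the three estimates, the slowest rate is the one produced by the bound \(\Vert R_h(u) - u_h\Vert _{H^1(\varOmega )} \le C h^{3/2 - 2/p - \varepsilon }\) that is used in the proof.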

The error estimates in Corollary 4.2 turn out to be of particular interest when the exponent p can be chosen arbitrarily large. Indeed, in this limit case, we obtain the following important result:

Theorem 4.3

(Optimal FE-estimates under regularity assumptions) Suppose that u solves (S) for some \(f \in L^2(\varOmega )\). Assume further that u admits a decomposition \(u = u_s + u_r\) with the properties in point 2 of Theorem 2.3 for all \(4< p < \infty \). Then, for all \(\varepsilon \in (0, 1/2)\), there exists a constant \(C>0\) independent of h with

$$\begin{aligned} \begin{aligned} \begin{array}{c} \begin{aligned} \begin{array}{ll} \Vert u - u_h\Vert _{W^{1,8/3 - \varepsilon }(\varOmega )} \le C h, &{}\qquad \Vert u - u_h\Vert _{W^{1,4 }(\varOmega )} \le C h^{1 - \varepsilon }, \\ \Vert u - u_h\Vert _{W^{1,\infty }(\varOmega )} \le C h^{1/2 - \varepsilon }, &{}\quad \Vert u - u_h\Vert _{L^\infty (\varOmega )} \le C h^{3/2 - \varepsilon }, \end{array} \end{aligned} \\ \Vert u - u_h\Vert _{H^{1/2}(\partial \varOmega )} \le C h^{3/2 - \varepsilon }. \end{array} \end{aligned} \end{aligned}$$
(17)

Proof

In the considered situation, we may apply Corollary 4.2 with an arbitrarily large \(p > 4\) and Corollary 4.1 with a p which is arbitrarily close to four. The assertions of the theorem now follow immediately by invoking these results and by noting that, for all \(\varepsilon \in (0, 1/2)\), we have (due to the \(W^{1,\infty }\)-estimate and the \(W^{1,q}\)-estimate for all exponents \(2< q < 4\))

$$\begin{aligned} \Vert u - u_h\Vert _{W^{1,4}(\varOmega )} \le C \left( \Vert u - u_h\Vert _{W^{1, 4 - \varepsilon }(\varOmega )} \right) ^{\frac{4-\varepsilon }{4}} \le C h^{(1 - \varepsilon )\frac{4-\varepsilon }{4}} = C h^{1 - 5\varepsilon /4 + \varepsilon ^2/4} \end{aligned}$$

with some constant C independent of h. This completes the proof.\(\square \)
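How the limit \(p \rightarrow \infty \) turns the estimates of Corollary 4.2 into those in (17) can be sketched as follows. The threshold for p below is an illustrative choice of ours and is not taken from the proof. Given \(\varepsilon \in (0, 1/2)\), every \(p \ge 16/\varepsilon - 4\) satisfies

$$\begin{aligned} \frac{4p}{p+4} \ge 4 - \varepsilon \quad \text {and}\quad \frac{2}{p} \le \varepsilon , \end{aligned}$$

so that, for \(h \le 1\), the rates \(h^{1/2 - 2/p - \varepsilon }\) and \(h^{3/2 - 2/p - \varepsilon }\) in Corollary 4.2 are bounded by \(h^{1/2 - 2\varepsilon }\) and \(h^{3/2 - 2\varepsilon }\), respectively, and the \(W^{1, 4 - \varepsilon }\)-error is bounded by a constant times the \(W^{1, 4p/(p+4)}\)-error (by Hölder's inequality on the bounded domain \(\varOmega \)). Redefining \(\varepsilon \) then gives the exponents in (17). The interpolation step rests on the elementary inequality

$$\begin{aligned} \int _\varOmega |g|^{4} \, \mathrm {d}\mathcal {L}^2 = \int _\varOmega |g|^{\varepsilon } |g|^{4 - \varepsilon } \, \mathrm {d}\mathcal {L}^2 \le \Vert g\Vert _{L^\infty (\varOmega )}^{\varepsilon } \Vert g\Vert _{L^{4 - \varepsilon }(\varOmega )}^{4 - \varepsilon }, \end{aligned}$$

applied to \(g = u - u_h\) and its first derivatives, on the boundedness of \(\Vert u - u_h\Vert _{W^{1,\infty }(\varOmega )}\), and on the identity \((1 - \varepsilon )(4 - \varepsilon )/4 = 1 - 5\varepsilon /4 + \varepsilon ^2/4\).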

Several things are noteworthy regarding the last three results:

Remark 4.4

  1.

    Since the overall regularity of u can, in general, not be expected to exceed

    $$\begin{aligned} u \in W^{2, 4 - \varepsilon }(\varOmega ) \quad \forall \varepsilon \in (0, 1/2) \quad \text {and}\quad u \in H^{5/2 - \varepsilon }(\varOmega ) \quad \forall \varepsilon \in (0, 1/2) \end{aligned}$$

    in the situation of Theorem 4.3 (cf. the comments in Remark 2.4) and since we consider piecewise linear ansatz functions, the \(W^{1,8/3 - \varepsilon }\)-, the \(W^{1,4}\)-, the \(W^{1, \infty }\)-, the \(L^\infty \)-, and the \(H^{1/2}\)-error estimates in (17) are optimal. Compare, e.g., with the classical results for the Lagrange interpolation operator in [7, Theorem 4.4.20] and with Lemma 3.2 in this context. The orders of convergence in Theorem 4.3 are also observed in numerical experiments, see Sect. 7.

  2.

    Note that the regularity assumptions in the last three results match precisely what we have proved in Theorem 2.3. This implies in particular that Corollaries 4.1 and 4.2 and Theorem 4.3 remain valid when we replace the assumptions on the regularity of u with the respective assumptions on f and the contact set \(\{x \in \partial \varOmega \mid u(x) = 0\}\) in Theorem 2.3. The estimates that are obtained along these lines in the situation of Theorem 4.3 are collected in Theorem 6.1.

  3.

    The error estimates in Corollaries 4.1 and 4.2 and Theorem 4.3 are similar in nature to the results for the \(H^1\)-error in [16] in that they only rely on (realistic) assumptions on the Sobolev regularity properties of the exact solution u and do not require any additional conditions on the contact set \(\{x \in \partial \varOmega \mid u(x) = 0\}\), cf. [8]. To the best of the authors’ knowledge, such results have not been available so far in the literature for \(W^{1,p}\)-error estimates with \(p>2\). The same seems to be the case for the \(L^\infty \)- and the \(W^{1, \infty }\)-error estimates in (17). Note that we have derived our \(L^\infty \)-error estimates without invoking the discrete maximum principle and without the associated assumptions on \(\mathcal {T}_h\), cf. [11, 13, 34].

  4.

    The \(H^{1/2}\)-error estimate in (17) has already been obtained in [41, Theorem 2.2] in dimensions two and three under the assumption that the solution u is in \(H^{5/2-\varepsilon }(\varOmega )\) for all \(\varepsilon \in (0, 1/2)\). Note that this regularity can only be expected to hold if the right-hand side f satisfies \(f \in H^{1/2-\varepsilon }(\varOmega )\) for all \(\varepsilon \in (0, 1/2)\). The regularity assumptions that we work with in our analysis differ from this in that they also allow for a \(W^{2,p}\)-part of the solution u and are thus realistic for general right-hand sides \(f \in L^p(\varOmega )\), see Theorem 2.3. We further obtain the \(H^{1/2}\)-estimate in (17) with arguments that seem to be more elementary than those in [41]. However, in contrast to the analysis in [41], our approach cannot be extended straightforwardly to the three-dimensional setting since unilateral approximation results analogous to those in Lemma 3.5 are only available in limited form in dimensions \(d \ge 3\), cf. the analysis in [11].

  5.

    In [41], an \(L^2\)-error estimate of order \(3/2 -\varepsilon \) is obtained as a corollary of the \(H^{1/2}\)-error estimate on the boundary. We obtain this order of convergence even in the \(L^{\infty }\)-norm.

As the reader might have noticed, Theorem 4.3 only yields the suboptimal order of convergence \(3/2 - \varepsilon \) for, e.g., the \(L^4\)-error. We thus miss a factor \(h^{1/2}\) in comparison with the approximation properties of the Lagrange interpolation operator. In what follows, we demonstrate that a better estimate can be obtained with a non-standard duality argument and that the order two (minus epsilon) that is observed in numerical experiments can also be recovered analytically, provided the continuous solution u satisfies condition (A) and the contact sets of the finite element approximations \(u_h\) behave sufficiently well in the limit \(h \searrow 0\).

5 \(L^{4}\)-error estimates of optimal order via an Aubin–Nitsche trick

To estimate the \(L^4\)-error, we use an approach that has been proposed by Mosco in [31, Section 7] for the one-dimensional obstacle problem and consider two dual variational inequalities, one for each of the components \((u - u_h)^+ = \max (0, u - u_h)\) and \((u- u_h)^- = \min (0, u - u_h)\).
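That it suffices to bound the two components separately can be seen from the following short computation (a sketch; the abbreviation \(w := u - u_h\) is ours). Since \(w = w^+ + w^-\) and \(w^+ w^- = 0\) holds \(\mathcal {L}^2\)-a.e. in \(\varOmega \), we have

$$\begin{aligned} \Vert w\Vert _{L^4(\varOmega )}^4 = \int _\varOmega (w^+)^4 \, \mathrm {d}\mathcal {L}^2 + \int _\varOmega (w^-)^4 \, \mathrm {d}\mathcal {L}^2 = \Vert w^+\Vert _{L^4(\varOmega )}^4 + \Vert w^-\Vert _{L^4(\varOmega )}^4 \end{aligned}$$

and, with the sign convention used here, \(|w^-| = \max (0, u_h - u)\). An estimate of order \(h^{2 - \varepsilon }\) for each of the quantities \(\max (0, u - u_h)\) and \(\max (0, u_h - u)\) in \(L^4(\varOmega )\) thus yields the same order for the full error \(u - u_h\).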

5.1 A duality argument for the component \((u - u_h)^+\) under condition (A)

To formulate our first dual problem, we introduce the following notation:

Definition 5.1

Given a \(u \in H^2(\varOmega )\) which solves (S) for some \(f \in L^2(\varOmega )\) and which satisfies condition (A), we define:

  1.

    \(\mathcal {A}^\circ \subset \partial \varOmega \) to be the relative interior of the contact set \(\{x \in \partial \varOmega \mid u(x) = 0\}\),

  2.

    \(\mathcal {A}_h^\circ \subset \partial \varOmega \) to be the union of all (closed) cells of the boundary mesh which intersect the set \(\mathcal {A}^\circ \).

Note that, according to condition (A), at least for all sufficiently small h, the number of connected components of \(\mathcal {A}_h^\circ \) is finite and equal to the number of connected components of \(\mathcal {A}^\circ \). Given a u which solves (S) and satisfies (A) and a solution \(u_h\) of (S\(_h\)), we now consider the following auxiliary problem:

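$$\begin{aligned} \text {Find } z \in L \text { such that}\quad (z, v - z)_{H^1(\varOmega )} \ge ( -\max (0, u - u_h)^{3}, v - z)_{L^2(\varOmega )} \quad \forall v \in L. \end{aligned}$$
(D)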

Here,

$$\begin{aligned} L := \Big \{ v \in H^1(\varOmega ) \, \Big | \, {\text {tr}}(v) \ge 0 \ \mathcal {H}^1\text {-a.e. on } \mathcal {A}_h^\circ \Big \}. \end{aligned}$$

Note that the solution z of (D) depends on h (since \(\mathcal {A}_h^\circ \) and \(u_h\) do). From standard results and the analysis in [19], we may deduce:

Lemma 5.2

Suppose that u solves (S) for some \(f \in L^2(\varOmega )\), that condition (A) holds, and that \(u_h\) is the solution of (S\(_h\)). Then, the problem (D) admits a unique solution \(z \in H^1(\varOmega )\) for all \(h>0\). This solution satisfies \(z \le 0\) \(\mathcal {L}^2\)-a.e. in \(\varOmega \) and \({\text {tr}}(z) = 0\) \(\mathcal {H}^1\)-a.e. on \(\mathcal {A}_h^\circ \), and, for every \(\varepsilon \in (0,1/2)\), there exists a constant \(C>0\) independent of h with

$$\begin{aligned} \Vert z\Vert _{H^1(\varOmega )} \le C \Vert \max (0, u - u_h)^{3} \Vert _{L^{(4 - \varepsilon )/3}(\varOmega )}. \end{aligned}$$
(18)

Moreover, for all \(\varepsilon \in (0,1/2)\) and all sufficiently small \(h > 0\), z is in \(W^{2, (4-\varepsilon )/3}(\varOmega )\) and satisfies

$$\begin{aligned} \Vert z\Vert _{W^{2, (4 - \varepsilon )/3}(\varOmega )} \le C \Vert \max (0, u - u_h)^{3} \Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} \end{aligned}$$
(19)

with some constant \(C>0\) independent of h.

Proof

The unique solvability of (D) for all \(h>0\) follows from [25, Theorem II-2.1]. Further, we may employ Stampacchia’s lemma, see [3, Theorem 5.8.2], and use the test function \(v := z^- \in L\) in (D) to deduce that

$$\begin{aligned} 0 \ge - ( \max (0, u - u_h)^{3}, z^+)_{L^2(\varOmega )} \ge ( z , z^+)_{H^1(\varOmega )} = \Vert z^+ \Vert _{H^1(\varOmega )}^2. \end{aligned}$$

This proves that we indeed have \(z \le 0\) \(\mathcal {L}^2\)-a.e. in \(\varOmega \) and, as a consequence, that \({\text {tr}}(z) = 0\) holds \(\mathcal {H}^1\)-a.e. on \(\mathcal {A}_h^\circ \). Moreover, by choosing the test functions \( v= 0\) and \(v=2z\) in (D), and by exploiting the Sobolev embeddings, we obtain

$$\begin{aligned} \begin{aligned} \Vert z\Vert _{H^1(\varOmega ) }^2&= ( -\max (0, u - u_h)^{3}, z)_{L^2(\varOmega )} \\&\le \Vert \max (0, u - u_h)^{3} \Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} \Vert z\Vert _{L^{(4 - \varepsilon )/(1 - \varepsilon )}(\varOmega )} \\&\le C \Vert \max (0, u - u_h)^{3} \Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} \Vert z\Vert _{H^1(\varOmega )} \quad \forall \varepsilon \in (0, 1/2), \end{aligned} \end{aligned}$$

where C is the embedding constant of \(H^1(\varOmega ) \hookrightarrow L^{(4 - \varepsilon )/(1 - \varepsilon )}(\varOmega )\). This yields (18). It remains to prove the \(W^{2, (4-\varepsilon )/3}\)-regularity of z and (19). To this end, we note that the non-positivity of z in \(\varOmega \), the condition \({\text {tr}}(z) \ge 0 \) on \(\mathcal {A}_h^\circ \), and the properties of the set \(\mathcal {A}_h^\circ \) imply that z is the unique weak solution of the problem

$$\begin{aligned} \begin{array}{l@{\quad }l} - \varDelta z = -\max (0, u - u_h)^{3} - z &{} \mathcal {L}^2\text {-a.e. in }\varOmega , \\ z = 0 &{} \mathcal {H}^1\text {-a.e. on } \mathcal {A}_h^\circ , \\ \partial _n z = 0 &{}\mathcal {H}^1\text {-a.e. on } \partial \varOmega {\setminus } \mathcal {A}_h^\circ . \end{array} \end{aligned}$$

Since \(\mathcal {A}_h^\circ \) and its complement can be written as the union of at most finitely many straight line segments which meet at an angle \(\pi /2\) or \(\pi \), we may again invoke [19, Theorem 4.4.3.7] to deduce that \(z \in W^{2, (4-\varepsilon )/3}(\varOmega )\) holds for all \(\varepsilon \in (0, 1/2)\). To obtain the estimate (19), let us assume that \(\mathcal {A}^\circ \ne \emptyset \) and \(\mathcal {A}^\circ \ne \partial \varOmega \) (else the proof is trivial). In this case, condition (A) implies that \(\mathcal {A}^\circ \) consists of finitely many connected components, that the relative boundary of \(\mathcal {A}^\circ \) in \(\partial \varOmega \) consists of finitely many points \(b_1,\ldots , b_N\), \(N \in \mathbb {N}\), and that we may find a \(\delta > 0\) with \({\text {dist}}(b_i, b_j) > 4 \delta \) for all \(i \ne j\) and \({\text {dist}}(b_i, \{(0,0), (0,1), (1, 0), (1,1)\}) > 4 \delta \) for all \(b_i\) which are not themselves corner-points of the square \(\varOmega \). Choose rotationally symmetric cut-off functions \(\psi _i \in C_c^\infty (\mathbb {R}^2)\), \(i=1,\ldots ,N\), such that

$$\begin{aligned} 0 \le \psi _i \le 1,\quad {\text {supp}}(\psi _i) \subset B_{2\delta }(b_i),\quad \psi _i \equiv 1 \text { in } B_{\delta }(b_i) \end{aligned}$$

holds for all \(i=1,\ldots , N\), where \(B_r(b)\) denotes the closed ball of radius \(r > 0\) around a \(b \in \mathbb {R}^2\), and decompose z into the parts \(z_0,z_1,\ldots , z_N\) defined by

$$\begin{aligned} z_i := \psi _i z \text { for } i=1,\ldots ,N,\quad z_0 := \psi _0 z,\quad \psi _0 := 1 - \sum _{i=1}^N \psi _i. \end{aligned}$$

Suppose further that h is so small that the set \(\mathcal {A}_h^\circ {\setminus } \mathcal {A}^\circ \) is contained in the union of the balls \(B_{\delta }(b_i)\), \(i=1,\ldots ,N\). (This is the case for all sufficiently small h due to the definition of \(\mathcal {A}_h^\circ \).) Then, \(\delta \) and the functions \(\psi _i\) are clearly independent of h, and we may compute that

$$\begin{aligned} \begin{array}{l@{\quad }l} - \varDelta z_0 = - z \varDelta \psi _0 - 2 \nabla z \cdot \nabla \psi _0 - \psi _0 \max (0, u - u_h)^{3} - z_0 &{} \mathcal {L}^2\text {-a.e. in }\varOmega , \\ z_0 = 0 &{}\mathcal {H}^1\text {-a.e. on } \mathcal {A}^\circ , \\ \partial _n z_0 = 0 &{} \mathcal {H}^1\text {-a.e. on } \partial \varOmega {\setminus } \mathcal {A}^\circ . \end{array} \end{aligned}$$

Here, we have used that \(\mathcal {A}^\circ \subset \mathcal {A}_h^\circ \), and that the rotational symmetry of the cut-off functions and the choice of \(\delta \) imply \(\partial _n \psi _i \equiv 0\) on \(\partial \varOmega \) for all i. Note that the boundary conditions in the above problem are independent of h. We may thus invoke [19, Theorem 4.3.2.4] and (18) to deduce that, for every \(\varepsilon \in (0,1/2)\), we have

$$\begin{aligned} \begin{aligned}&\Vert z_0\Vert _{W^{2, (4 - \varepsilon )/3}(\varOmega )} \\&\quad \le C \left( \Vert z \varDelta \psi _0 + 2 \nabla z \cdot \nabla \psi _0 + \psi _0 \max (0, u - u_h)^{3}\Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} + \Vert z_0\Vert _{W^{1, (4 - \varepsilon )/3}(\varOmega )} \right) \\&\quad \le C \left( \Vert \max (0, u - u_h)^{3}\Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} + \Vert z\Vert _{W^{1, (4 - \varepsilon )/3}(\varOmega )} \right) \\&\quad \le C \Vert \max (0, u - u_h)^{3}\Vert _{L^{(4 - \varepsilon )/3}(\varOmega )}. \end{aligned} \end{aligned}$$

Here, \(C>0\) is a generic constant which depends on \(\varepsilon \) and \(\psi _0\) but not on h and which may change from step to step. Consider now a point \(b_i\) of the relative boundary of \(\mathcal {A}^\circ \) which is not a corner point of the square \(\varOmega \), w.l.o.g. \(b_i = (a, 0)\) for some \(a \in (0, 1)\). Then, it follows from our choice of \(\delta \) that \((a - 4 \delta , a + 4 \delta ) \times \{0\}\) is a subset of \(\partial \varOmega \) and does not contain a further point of the relative boundary of \(\mathcal {A}^\circ \). This implies in particular that exactly one of the sets \((a - 4 \delta , a) \times \{0\}\) and \((a , a + 4 \delta ) \times \{0\}\) is contained in \(\mathcal {A}^\circ \). Let us assume w.l.o.g. that this is the case for \((a , a + 4 \delta ) \times \{0\}\). Then, it follows from the definition of \(\mathcal {A}_h^\circ \) and the properties of \(\{\mathcal {T}_h\}\) that there exist a constant \(C>0\) independent of h and a \(\tau \in [0, C]\) (possibly dependent on h) such that the set \((a - \tau h, a + 4 \delta ) \times \{0\}\) is contained in \(\mathcal {A}_h^\circ \), and we may calculate that the function \(z_i = \psi _i z\) satisfies

$$\begin{aligned} \begin{array}{ll} - \varDelta z_i = - z \varDelta \psi _i - 2 \nabla z \cdot \nabla \psi _i - \psi _i \max (0, u - u_h)^{3} - z_i &{}\quad \mathcal {L}^2\text {-a.e. in }\varOmega , \\ z_i = 0 &{}\quad \mathcal {L}^2\text {-a.e. in }\varOmega {\setminus } B_{2 \delta }(a, 0), \\ z_i = 0 &{}\quad \mathcal {H}^1\text {-a.e. on } (a - \tau h, a + 2 \delta ) \times \{0\}, \\ \partial _n z_i = 0 &{}\quad \mathcal {H}^1\text {-a.e. on } (a - 2 \delta , a - \tau h) \times \{0\}. \end{array} \end{aligned}$$

Here, we have again used the properties of z and the rotational symmetry of \(\psi _i\). Since \(z_i\) and \(\psi _i\) vanish outside of the ball \( B_{2 \delta }(a, 0)\), we may now deduce that (the trivial extension of) the function \({\bar{z}}(x, y) := z_i(x + a - \tau h - 1/2,y)\) satisfies

$$\begin{aligned} \begin{array}{ll} - \varDelta {\bar{z}} = {\bar{g}} &{}\quad \mathcal {L}^2\text {-a.e. in } \varOmega , \\ {\bar{z}} = 0 &{}\quad \mathcal {H}^1\text {-a.e. on } \partial \varOmega {\setminus } (0, 1/2) \times \{0\}, \\ \partial _n {\bar{z}} = 0 &{}\quad \mathcal {H}^1 \text {-a.e. on } (0, 1/2) \times \{0\} \end{array} \end{aligned}$$

with

$$\begin{aligned} {\bar{g}}(x,y) {:=} \Big ( {-} z \varDelta \psi _i {-} 2 \nabla z \cdot \nabla \psi _i {-} \psi _i \max (0, u - u_h)^{3} {-} z_i \Big ) (x + a - \tau h {-} 1/2,y). \end{aligned}$$

By invoking [19, Theorem 4.3.2.4], we may now again deduce that there exists a constant \(C>0\) independent of h with

$$\begin{aligned} \begin{aligned} \Vert {\bar{z}}\Vert _{W^{2, (4 - \varepsilon )/3}(\varOmega )} \le C \left( \Vert {\bar{g}}\Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} + \Vert {\bar{z}}\Vert _{W^{1, (4 - \varepsilon )/3}(\varOmega )} \right) . \end{aligned} \end{aligned}$$

If we express \({\bar{z}}\) in terms of \(z_i\) and use the same calculations as before, then we arrive at

$$\begin{aligned} \Vert z_i\Vert _{W^{2, (4 - \varepsilon )/3}(\varOmega )} \le C \Vert \max (0, u - u_h)^{3}\Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} \end{aligned}$$

with a constant \(C > 0\) which depends on \(\psi _i\) and \(\varepsilon \) but is independent of h. Using the same arguments as above, we can transform each of the situations occurring at the points \(b_i\), \(i=1,\ldots ,N\), to one of finitely many reference configurations and use [19, Theorem 4.3.2.4] as well as (18) to prove that there exist constants \(C_i\) independent of h with

$$\begin{aligned} \Vert z_i\Vert _{W^{2, (4 - \varepsilon )/3}(\varOmega )} \le C_i \Vert \max (0, u - u_h)^{3}\Vert _{L^{(4 - \varepsilon )/3}(\varOmega )}\qquad \forall i=0,\ldots ,N. \end{aligned}$$

Note that, if we consider a point \(b_i\) which is a corner of the square \(\varOmega \), then the situation is even simpler than above since, in this case, the boundaries of \(\mathcal {A}^\circ \) and \(\mathcal {A}_h^\circ \) are locally the same and equal to \(\{b_i\}\) so that a translation argument as above is unnecessary. To arrive at (19), it now suffices to invoke the triangle inequality. This completes the proof.\(\square \)
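The localized problems for the functions \(z_i = \psi _i z\) in the above proof result from the product rule. As a sketch, let \(\psi \) denote any of the cut-off functions \(\psi _i\), \(i=0,\ldots ,N\) (the generic symbol \(\psi \) is ours). Using the equation \(-\varDelta z = -\max (0, u - u_h)^{3} - z\) satisfied by z, one computes

$$\begin{aligned} \begin{aligned} - \varDelta (\psi z)&= - \psi \varDelta z - 2 \nabla z \cdot \nabla \psi - z \varDelta \psi \\&= - z \varDelta \psi - 2 \nabla z \cdot \nabla \psi - \psi \max (0, u - u_h)^{3} - \psi z, \\ \partial _n (\psi z)&= \psi \, \partial _n z + z \, \partial _n \psi . \end{aligned} \end{aligned}$$

Since \(\partial _n \psi = 0\) on \(\partial \varOmega \), the product \(\psi z\) inherits the homogeneous Dirichlet condition wherever \({\text {tr}}(z) = 0\) and the homogeneous Neumann condition wherever \(\partial _n z = 0\). For \(i = 0\), the function \(\psi _0\) vanishes on the balls \(B_\delta (b_i)\), which contain \(\mathcal {A}_h^\circ {\setminus } \mathcal {A}^\circ \) for small h, so that both boundary conditions hold trivially there. This is why the boundary conditions for \(z_0\) can be posed on the h-independent sets \(\mathcal {A}^\circ \) and \(\partial \varOmega {\setminus } \mathcal {A}^\circ \).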

Remark 5.3

Note that the solution z of the auxiliary problem (D) cannot be expected to possess \(W^{2,q}\)-regularity for some \(q\ge 4/3\) since it typically contains a singular part analogous to that in (5).

By choosing a suitable test function in (D) and by exploiting the estimates in Theorem 4.3, we may now deduce:

Proposition 5.4

Suppose that u solves (S) for some \(f \in L^\infty (\varOmega )\) and that (A) is satisfied. Then, for all \(\varepsilon \in (0,1/2)\), there exists a constant \(C>0\) independent of h such that, for all sufficiently small \(h>0\), we have

$$\begin{aligned} \Vert (u-u_h)^+\Vert _{L^{4}(\varOmega )} \le C h^{2 - \varepsilon }. \end{aligned}$$

Proof

Let us denote the finitely many mesh nodes in the relative boundary of the set \(\mathcal {A}_h^\circ \) by \(x_i\), \(i=1,\ldots ,N\), \(N \in \mathbb {N}_0\), and the basis functions of the nodal basis of \(V_h\) that belong to the nodes \(x_i\) by \(\varphi _i\). Note that, by our assumptions and the definition of \(\mathcal {A}_h^\circ \), the number N is independent of h for all sufficiently small h. Consider now the function \( v := (u_h - u) + \sum _{i=1}^N C h \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \varphi _i \), where C is a constant independent of h with \({\text {diam}}(T) \le C h\) for all \(T \in \mathcal {T}_h\). We claim that this v is admissible for (D). Indeed, on \(\mathcal {A}^\circ \), we have \(u \equiv 0\) and thus, since \(u_h \ge 0\) on \(\partial \varOmega \) and \(\varphi _i \ge 0\), also \(v \ge 0\). Further, we know that, for all small h and all \(x \in \mathcal {A}_h^\circ {\setminus } \mathcal {A}^\circ \), we can find an \({\tilde{x}}\) in the relative boundary of \(\mathcal {A}^\circ \) and a \(j \in \{1,\ldots , N\}\) with \(x \in [x_j, {\tilde{x}}]\) and \({\text {dist}}(x_j, {\tilde{x}}) < Ch\), where \([x_j, {\tilde{x}}]\) denotes the line segment between \(x_j\) and \({\tilde{x}}\). This yields

$$\begin{aligned} \begin{aligned} v(x)&\ge (u_h - u)(x) - (u_h - u)({\tilde{x}}) + \sum _{i=1}^N C h \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \varphi _i(x) \\&\ge - \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} {\text {dist}}(x, {\tilde{x}}) + C h \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \varphi _j(x) \\&\ge - \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} {\text {dist}}({\tilde{x}}, x_j) \varphi _j(x) + C h \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \varphi _j(x) \\&\ge 0 \end{aligned} \end{aligned}$$

for all \(x \in \mathcal {A}_h^\circ {\setminus } \mathcal {A}^\circ \). Thus, we indeed have \(v \ge 0\) \(\mathcal {H}^1\)-a.e. on \(\mathcal {A}_h^\circ \). Choosing the function \(v + z \in L\) in (D) and using Lemma 5.2, the Sobolev embedding

$$\begin{aligned} W^{2, \frac{4 - \varepsilon }{3}}(\varOmega ) \hookrightarrow W^{1, \frac{8 - 2 \varepsilon }{2 + \varepsilon }}(\varOmega ) \subset C({\text {cl}}(\varOmega )),\quad \varepsilon \in (0, 1/2), \end{aligned}$$

and Hölder’s inequality now gives (with a generic \(C>0\) independent of h)

$$\begin{aligned} \begin{aligned}&\int _\varOmega \max (0, u - u_h)^{4} \mathrm {d}\mathcal {L}^2 \\&\quad \le \left( z, u_h - u \right) _{H^1(\varOmega )} + \left( z, \sum _{i=1}^N C h \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \varphi _i \right) _{H^1(\varOmega )} \\&\qquad + \int _\varOmega \max (0, u - u_h)^{3}\left( \sum _{i=1}^N Ch \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \varphi _i \right) \mathrm {d}\mathcal {L}^2 \\&\quad \le \left( z, u_h - u \right) _{H^1(\varOmega )} + C \Vert z\Vert _{W^{1, \frac{8 - 2 \varepsilon }{2 + \varepsilon }}(\varOmega )} \left( h \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \sum _{i=1}^N \Vert \varphi _i \Vert _{W^{1, \frac{8 - 2 \varepsilon }{6 - 3 \varepsilon }}(\varOmega )} \right) \\&\qquad + C\Vert \max (0, u - u_h)^{3} \Vert _{L^{\frac{4 - \varepsilon }{3}}(\varOmega )} \left( h \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \sum _{i=1}^N \Vert \varphi _i \Vert _{L^{\frac{4 - \varepsilon }{1 - \varepsilon }}(\varOmega )} \right) \\&\quad \le \left( z, u_h - u \right) _{H^1(\varOmega )} + C \Vert z\Vert _{W^{2, \frac{4 - \varepsilon }{3}}(\varOmega )} \left( h \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} h^{\frac{6 - 3 \varepsilon }{4 - \varepsilon } - 1} \right) \\&\qquad + C\Vert \max (0, u - u_h)^{3} \Vert _{L^{\frac{4 - \varepsilon }{3}}(\varOmega )} \left( h \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} h^{\frac{2 -2 \varepsilon }{4- \varepsilon }}\right) \\&\quad \le \left( z, u_h - u \right) _{H^1(\varOmega )} + C h^{\frac{6 - 3 \varepsilon }{4 - \varepsilon } } \Vert \max (0, u - u_h)^{3} \Vert _{L^{\frac{4 - \varepsilon }{3}}(\varOmega )} \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )}. \end{aligned}\nonumber \\ \end{aligned}$$
(20)
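The exponents in (20) can be verified by a short calculation. The following sketch assumes the standard scaling \(\Vert \varphi _i\Vert _{L^r(\varOmega )} \le C h^{2/r}\) and \(\Vert \varphi _i\Vert _{W^{1,r}(\varOmega )} \le C h^{2/r - 1}\) of the nodal basis functions in two dimensions for \(h \le 1\); the abbreviations s, q, \(s'\), and \(q'\) are ours. With \(s := (4 - \varepsilon )/3\), the Sobolev exponent q and the conjugate exponents \(q'\) and \(s'\) are given by

$$\begin{aligned} \begin{aligned} \frac{1}{q}&= \frac{1}{s} - \frac{1}{2} = \frac{3}{4 - \varepsilon } - \frac{1}{2} = \frac{2 + \varepsilon }{8 - 2 \varepsilon }, \\ \frac{1}{q'}&= 1 - \frac{2 + \varepsilon }{8 - 2 \varepsilon } = \frac{6 - 3 \varepsilon }{8 - 2 \varepsilon }, \qquad \frac{1}{s'} = 1 - \frac{3}{4 - \varepsilon } = \frac{1 - \varepsilon }{4 - \varepsilon }, \end{aligned} \end{aligned}$$

where \(q > 2\) holds because of \(\varepsilon < 1\). The scaling of the basis functions then produces the powers \(h^{2/q' - 1} = h^{\frac{6 - 3\varepsilon }{4 - \varepsilon } - 1}\) and \(h^{2/s'} = h^{\frac{2 - 2\varepsilon }{4 - \varepsilon }}\), and the additional factor h gives

$$\begin{aligned} 1 + \Big ( \frac{6 - 3 \varepsilon }{4 - \varepsilon } - 1 \Big ) = \frac{6 - 3 \varepsilon }{4 - \varepsilon } = 1 + \frac{2 - 2 \varepsilon }{4 - \varepsilon }, \end{aligned}$$

so that both sums in (20) contribute the same power of h.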

Note that, since z vanishes on \(\mathcal {A}_h^\circ \), since the relative boundary of \(\mathcal {A}_h^\circ \) consists of mesh nodes, and since \(z \le 0\) in \(\varOmega \), the Lagrange interpolant \(I_h(z) \in V_h\) satisfies \(I_h(z) = 0\) on \(\mathcal {A}^\circ \subset \mathcal {A}_h^\circ \) and \(I_h(z) \le 0\) in \(\varOmega \). This implies in combination with (S\(_h\)) and the reformulation (4) of (S) that

$$\begin{aligned} \left( u_h, - I_h(z) \right) _{H^1(\varOmega )} \ge ( -f ,I_h(z) )_{L^2(\varOmega )} \quad \text {and}\quad \left( u, I_h(z) \right) _{H^1(\varOmega )} = ( f ,I_h(z) )_{L^2(\varOmega )}, \end{aligned}$$

i.e., we have

$$\begin{aligned} \left( - I_h(z), u_h - u \right) _{H^1(\varOmega )} \ge 0. \end{aligned}$$

Using the last inequality, standard results for the Lagrange interpolation operator, and again Lemma 5.2, we can continue the estimate in (20) as follows

$$\begin{aligned} \begin{aligned}&\Vert (u - u_h)^+ \Vert _{L^4(\varOmega )}^4\\&\quad \le \left( z - I_h(z), u_h - u \right) _{H^1(\varOmega )} \\&\qquad + C h^{\frac{6 - 3 \varepsilon }{4 - \varepsilon } } \Vert \max (0, u - u_h)^{3} \Vert _{L^{\frac{4 - \varepsilon }{3}}(\varOmega )} \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \\&\quad \le \Vert z - I_h(z)\Vert _{W^{1, \frac{4- \varepsilon }{3}}(\varOmega )} \Vert u_h - u\Vert _{W^{1, \frac{4 - \varepsilon }{1 - \varepsilon }}(\varOmega )} \\&\qquad + C h^{\frac{6 - 3 \varepsilon }{4 - \varepsilon } } \Vert \max (0, u - u_h)^{3} \Vert _{L^{\frac{4 - \varepsilon }{3}}(\varOmega )} \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \\&\quad \le C h \Vert z\Vert _{W^{2, \frac{4- \varepsilon }{3}}(\varOmega )} \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )}^{\frac{3 \varepsilon }{4- \varepsilon }}\Vert u_h - u\Vert _{W^{1, 4}(\varOmega )}^{\frac{4-4 \varepsilon }{4- \varepsilon }} \\&\qquad + C h^{\frac{6 - 3 \varepsilon }{4 - \varepsilon } } \Vert \max (0, u - u_h)^{3} \Vert _{L^{\frac{4 - \varepsilon }{3}}(\varOmega )} \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )} \\&\quad \le C h \Vert (u - u_h)^+ \Vert _{L^{4 - \varepsilon }(\varOmega )}^{3} \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )}^{\frac{3 \varepsilon }{4- \varepsilon }}\Vert u_h - u\Vert _{W^{1, 4}(\varOmega )}^{\frac{4-4 \varepsilon }{4- \varepsilon }} \\&\qquad + C h^{\frac{6 - 3 \varepsilon }{4 - \varepsilon } } \Vert (u - u_h)^+ \Vert _{L^{4 - \varepsilon }(\varOmega )}^{3} \Vert u - u_h\Vert _{W^{1, \infty }(\varOmega )}. \end{aligned} \end{aligned}$$
(21)
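Three elementary facts enter the steps in (21). They are collected here as a sketch; the abbreviation \(r := (4 - \varepsilon )/(1 - \varepsilon )\) is ours. First, the exponents \((4 - \varepsilon )/3\) and r are conjugate, which justifies the use of Hölder's inequality:

$$\begin{aligned} \frac{3}{4 - \varepsilon } + \frac{1 - \varepsilon }{4 - \varepsilon } = 1. \end{aligned}$$

Second, since \(r > 4\), the \(L^r\)-norm can be interpolated between the \(L^4\)- and the \(L^\infty \)-norm,

$$\begin{aligned} \Vert g\Vert _{L^r(\varOmega )} \le \Vert g\Vert _{L^\infty (\varOmega )}^{1 - 4/r} \Vert g\Vert _{L^4(\varOmega )}^{4/r}, \qquad \frac{4}{r} = \frac{4 - 4 \varepsilon }{4 - \varepsilon }, \qquad 1 - \frac{4}{r} = \frac{3 \varepsilon }{4 - \varepsilon }, \end{aligned}$$

which explains the powers of the \(W^{1,\infty }\)- and the \(W^{1,4}\)-norm. Third, the last step uses the identity

$$\begin{aligned} \Vert \max (0, u - u_h)^{3} \Vert _{L^{\frac{4 - \varepsilon }{3}}(\varOmega )} = \left( \int _\varOmega \max (0, u - u_h)^{4 - \varepsilon } \, \mathrm {d}\mathcal {L}^2 \right) ^{\frac{3}{4 - \varepsilon }} = \Vert (u - u_h)^+ \Vert _{L^{4 - \varepsilon }(\varOmega )}^{3}. \end{aligned}$$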

The above yields, in combination with Theorems 4.3 and 2.3, that there exists a constant C independent of h with

where the Landau symbol refers to the limit \( \varepsilon \searrow 0\). Using the above in (21) and performing the same calculation as before yields

This proves the claim (after redefining \(\varepsilon \)).\(\square \)
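The exponent arithmetic behind the rate \(h^{2 - \varepsilon }\) can be sketched as follows. This is one possible way to organize the bookkeeping and not necessarily the route taken in the proof. It assumes that the bounds \(\Vert u - u_h\Vert _{W^{1,\infty }(\varOmega )} \le C h^{1/2 - \varepsilon }\) and \(\Vert u - u_h\Vert _{W^{1,4}(\varOmega )} \le C h^{1 - \varepsilon }\) from (17) are inserted into (21) with the same \(\varepsilon \) and that \(h \le 1\). Both terms on the right-hand side of (21) then carry the same power of h, namely

$$\begin{aligned} \begin{aligned} 1 + \Big ( \frac{1}{2} - \varepsilon \Big ) \frac{3 \varepsilon }{4 - \varepsilon } + (1 - \varepsilon ) \frac{4 - 4 \varepsilon }{4 - \varepsilon }&= \frac{6 - 3 \varepsilon }{4 - \varepsilon } + \frac{1}{2} - \varepsilon \\&= 2 - \varepsilon \, \frac{11 - 2 \varepsilon }{8 - 2 \varepsilon } \ge 2 - 2 \varepsilon \qquad \forall \varepsilon \in (0, 1/2). \end{aligned} \end{aligned}$$

Since \(\varOmega \) is the unit square, Hölder's inequality gives \(\Vert (u - u_h)^+\Vert _{L^{4 - \varepsilon }(\varOmega )} \le \Vert (u - u_h)^+\Vert _{L^{4}(\varOmega )}\), and (21) implies

$$\begin{aligned} \Vert (u - u_h)^+ \Vert _{L^4(\varOmega )}^4 \le C h^{2 - 2 \varepsilon } \Vert (u - u_h)^+ \Vert _{L^4(\varOmega )}^3, \quad \text {i.e.,}\quad \Vert (u - u_h)^+ \Vert _{L^4(\varOmega )} \le C h^{2 - 2 \varepsilon }. \end{aligned}$$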

5.2 A duality argument for the component \((u - u_h)^-\)

To obtain an \(L^4\)-error estimate for the component \((u - u_h)^-\), we can proceed along roughly the same lines as in the last subsection provided the contact sets

$$\begin{aligned} {{\tilde{\mathcal {A}}}}_h := \{x \in \partial \varOmega \mid u_h(x) = 0\},\quad h > 0, \end{aligned}$$

of the finite element solutions \(u_h\) behave sufficiently well for \(h \searrow 0\). To be more precise, we need the following assumption:

Definition 5.5

(Condition \((A_h)\)) We say that condition \((A_h)\) is satisfied if there exist points \(d_i \in \partial \varOmega \), \(i=1,\ldots , N\), \(N \in \mathbb {N}_0\), and numbers \(\delta _i > 0\) such that the sets \(B_{\delta _i}(d_i) \cap \partial \varOmega \) have positive distance from each other and from the corners of the square \(\varOmega = (0,1)^2\) and such that the following is true for all sufficiently small h:

  1.

    The sets \(B_{\delta _i}(d_i) \cap \partial \varOmega \) cover the relative boundary of \({{\tilde{\mathcal {A}}}}_h\) and each \(B_{\delta _i}(d_i) \cap \partial \varOmega \) contains precisely one element of the relative boundary of \({{\tilde{\mathcal {A}}}}_h\).

  2.

    Every connected component of \({{\tilde{\mathcal {A}}}}_h\) has a non-empty relative interior.

Roughly speaking, the above condition expresses that the topological properties of the sets \(\{x \in \partial \varOmega \mid u_h(x) = 0\}\) and \(\{x \in \partial \varOmega \mid u_h(x) \ne 0\}\) do not change drastically as h passes to zero, and that the set \(\{x \in \partial \varOmega \mid u_h(x) = 0\}\) does not contain components which are singletons. Suppose, for example, that the contact set \({{\tilde{\mathcal {A}}}}_h\) has the form \([0, \alpha _h] \times \{0\} \cup \{0\} \times [0, \beta _h]\) with some \(\alpha _h, \beta _h \in (0,1)\) for all sufficiently small \(h>0\). Then, condition \((A_h)\) is satisfied if and only if there exists a closed interval \(E \subset (0,1)\) with \(\alpha _h, \beta _h \in E \) for all small enough h. Note that we do not need here that the sequences \(\alpha _h\) and \(\beta _h\) converge or that \({{\tilde{\mathcal {A}}}}_h \) approximates the contact set of the exact solution u for \(h \searrow 0\) (although this is, of course, what is typically observed in numerical experiments, cf. [41, Section 7]).

Analogously to the last section, we may now consider the following auxiliary problem:

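$$\begin{aligned} \text {Find } {\tilde{z}} \in {\tilde{L}} \text { such that}\quad ({\tilde{z}}, v - {\tilde{z}})_{H^1(\varOmega )} \ge ( -\max (0, u_h - u)^{3}, v - {\tilde{z}})_{L^2(\varOmega )} \quad \forall v \in {\tilde{L}} \end{aligned}$$
(\(\tilde{D}\))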

with

$$\begin{aligned} {\tilde{L}} := \Big \{ v \in H^1(\varOmega ) \, \Big | \, {\text {tr}}(v) \ge 0 \ \mathcal {H}^1\text {-a.e. on } {{\tilde{\mathcal {A}}}}_h \Big \}. \end{aligned}$$

By invoking again the results of [19], we obtain:

Lemma 5.6

Suppose that u solves (S) for some \(f \in L^2(\varOmega )\), that \(u_h\) is the solution of (S\(_h\)), and that condition \((A_h)\) is satisfied. Then, (\(\tilde{D}\)) admits a unique solution \({\tilde{z}} \in H^1(\varOmega )\) for all \(h>0\), and this solution satisfies \({\tilde{z}} \le 0\) \(\mathcal {L}^2\)-a.e. in \(\varOmega \) and \({\text {tr}}({\tilde{z}}) = 0\) \(\mathcal {H}^1\)-a.e. on \({{\tilde{\mathcal {A}}}}_h \), and, for every \(\varepsilon \in (0,1/2)\), there exists a constant \(C>0\) independent of h with

$$\begin{aligned} \Vert {\tilde{z}}\Vert _{H^1(\varOmega )} \le C \Vert \max (0, u_h - u)^{3} \Vert _{L^{(4 - \varepsilon )/3}(\varOmega )}. \end{aligned}$$
(22)

Moreover, for all \(\varepsilon \in (0,1/2)\) and all sufficiently small \(h > 0\), \({\tilde{z}}\) is in \(W^{2, (4-\varepsilon )/3}(\varOmega )\) and satisfies

$$\begin{aligned} \Vert {\tilde{z}}\Vert _{W^{2, (4 - \varepsilon )/3}(\varOmega )} \le C \Vert \max (0, u_h - u)^{3} \Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} \end{aligned}$$
(23)

with some constant \(C>0\) independent of h.

Proof

The unique solvability of (\(\tilde{D}\)) for all \(h>0\) follows from [25, Theorem II-2.1], and the non-positivity of \({\tilde{z}}\) and the property \({\text {tr}}({\tilde{z}}) = 0\) on \({{\tilde{\mathcal {A}}}}_h\) are obtained completely analogously to the proof of Lemma 5.2. The same holds for the estimate (22). It remains to prove the \(W^{2, (4 - \varepsilon )/3}\)-regularity of \({\tilde{z}}\) and (23) for all sufficiently small \(h >0\). The former follows immediately from \((A_h)\), [19, Theorem 4.4.3.7] and the same arguments as in Lemma 5.2. To obtain the latter, we assume that h is so small that the conditions in \((A_h)\) hold with some \(d_i \in \partial \varOmega \), \(\delta _i > 0\), \(i=1,\ldots ,N\), \(N \in \mathbb {N}\) (for \(N = 0\) the claim is trivial) and choose rotationally symmetric cut-off functions \(\psi _i \in C_c^\infty (\mathbb {R}^2)\), \(i=1,\ldots ,N\), such that \(0 \le \psi _i \le 1\) holds in \(\mathbb {R}^2\) for all i, such that \(\psi _i\) is identically one in \(B_{\delta _i}(d_i)\) for all i, and such that the sets \({\text {supp}}(\psi _i) \cap \partial \varOmega \) have non-zero distance from each other and from the corners of the square \(\varOmega \). In this situation, the properties of the functions \(\psi _i\) imply that we may find another cut-off function \(\phi \in C_c^\infty (\mathbb {R}^2)\) with \(0 \le \phi \le 1\) in \(\mathbb {R}^2\) and \(\phi \equiv 1\) in a neighborhood of the boundary \(\partial \varOmega \) such that the supports of the functions \({{\tilde{\psi }}}_i := \psi _i \phi \) are disjoint. Using these \({{\tilde{\psi }}}_i\), we decompose \({\tilde{z}}\) into the parts \({\tilde{z}}_0, {\tilde{z}}_1,\ldots , {\tilde{z}}_N\) defined by

$$\begin{aligned} {\tilde{z}}_i := {{\tilde{\psi }}}_i {\tilde{z}} \text { for } i=1,\ldots ,N,\quad {\tilde{z}}_0 := {{\tilde{\psi }}}_0 {\tilde{z}},\quad {{\tilde{\psi }}}_0 := 1 - \sum _{i=1}^N {{\tilde{\psi }}}_i. \end{aligned}$$

Since \((A_h)\) implies that the relative boundary of \({{\tilde{\mathcal {A}}}}_h\) is contained in the union of the balls \(B_{\delta _i}(d_i)\), \(i=1,\ldots , N\), that each set \(B_{\delta _i}(d_i) \cap \partial \varOmega \) contains precisely one point of the relative boundary of \({{\tilde{\mathcal {A}}}}_h\), and that the connected components of \({{\tilde{\mathcal {A}}}}_h\) each have a non-empty relative interior, we may argue as in the proof of Lemma 5.2 to deduce that \({\tilde{z}}_0\) satisfies

$$\begin{aligned} \begin{array}{l@{\quad }l} - \varDelta {\tilde{z}}_0 = - {\tilde{z}} \varDelta {{\tilde{\psi }}}_0 - 2 \nabla {\tilde{z}} \cdot \nabla {{\tilde{\psi }}}_0 - {{\tilde{\psi }}}_0 \max (0, u_h - u)^{3} - {\tilde{z}}_0 &{}\quad \mathcal {L}^2 \text {-a.e. in }\varOmega , \\ {\tilde{z}}_0 = 0 &{}\quad \mathcal {H}^1 \text {-a.e. on } \mathcal {B}, \\ \partial _n {\tilde{z}}_0 = 0 &{}\quad \mathcal {H}^1 \text {-a.e. on } \partial \varOmega {\setminus } \mathcal {B}\end{array} \end{aligned}$$

with some closed set \(\mathcal {B}\subset \partial \varOmega \) whose connected components each have a non-empty relative interior and whose relative boundary consists precisely of the points \(d_1,\ldots , d_N\). Note that it is (at least in theory) possible that \(\mathcal {B}\) varies with h since it is not uniquely determined by the above conditions. However, it is easy to check that only two sets \(\mathcal {B}\) and combinations of boundary conditions are possible here. We may thus again invoke [19, Theorem 4.3.2.4] and use (22) to deduce that, for every \(\varepsilon \in (0,1/2)\), we have

$$\begin{aligned} \begin{aligned}&\Vert {\tilde{z}}_0\Vert _{W^{2, (4 - \varepsilon )/3}(\varOmega )} \\&\quad \le C \left( \Vert {\tilde{z}} \varDelta {{\tilde{\psi }}}_0 + 2 \nabla {\tilde{z}} \cdot \nabla {{\tilde{\psi }}}_0 + {{\tilde{\psi }}}_0 \max (0, u_h - u)^{3}\Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} + \Vert {\tilde{z}}_0\Vert _{W^{1, (4 - \varepsilon )/3}(\varOmega )} \right) \\&\quad \le C \Vert \max (0, u_h - u)^{3}\Vert _{L^{(4 - \varepsilon )/3}(\varOmega )} \end{aligned} \end{aligned}$$

with a generic constant \(C>0\) which is independent of h. It remains to estimate the \(W^{2, (4 - \varepsilon )/3}\)-norm of the functions \({\tilde{z}}_i\), \(i=1,\ldots ,N\). So let us consider an arbitrary but fixed point \(d_i\). Since \(d_i\) is not a corner of \(\varOmega \) by \((A_h)\), we may assume w.l.o.g. that \(d_i = (a, 0)\) holds for some \(a \in (0, 1)\). Further, it follows from a straightforward calculation and the properties of \({{\tilde{\psi }}}_i\) that

$$\begin{aligned} \begin{array}{l@{\quad }l} - \varDelta {\tilde{z}}_i = - {\tilde{z}} \varDelta {{\tilde{\psi }}}_i - 2 \nabla {\tilde{z}} \cdot \nabla {{\tilde{\psi }}}_i - {{\tilde{\psi }}}_i \max (0, u_h - u)^{3} - {\tilde{z}}_i &{}\quad \mathcal {L}^2\text {-a.e. in }\varOmega , \\ {\tilde{z}}_i = 0 &{}\quad \mathcal {L}^2\text {-a.e. in }\varOmega {\setminus } {\text {supp}}({{\tilde{\psi }}}_i). \end{array}\nonumber \\ \end{aligned}$$
(24)

Since the support \({\text {supp}}({{\tilde{\psi }}}_i)\) contains precisely one point \(({\tilde{a}}, 0) \in \partial \varOmega \) of the relative boundary of \({{\tilde{\mathcal {A}}}}_h\) by our assumption \((A_h)\), we may complement (24) with one of the boundary conditions

$$\begin{aligned} \begin{array}{l@{\quad }l} {\tilde{z}}_i = 0 &{}\mathcal {H}^1 \text {-a.e. on } \big ( (-\infty , {\tilde{a}}) \times \{0\} \big ) \cap {\text {supp}}({{\tilde{\psi }}}_i ) \subset \partial \varOmega \\ \partial _n {\tilde{z}}_i = 0 &{}\mathcal {H}^1 \text {-a.e. on } \big ( ({\tilde{a}}, \infty ) \times \{0\} \big ) \cap {\text {supp}}({{\tilde{\psi }}}_i ) \subset \partial \varOmega \end{array} \end{aligned}$$

or

$$\begin{aligned} \begin{array}{l@{\quad }l} {\tilde{z}}_i = 0 &{} \mathcal {H}^1 \text {-a.e. on } \big ( ({\tilde{a}}, \infty ) \times \{0\} \big ) \cap {\text {supp}}({{\tilde{\psi }}}_i ) \subset \partial \varOmega \\ \partial _n {\tilde{z}}_i = 0 &{} \mathcal {H}^1 \text {-a.e. on } \big ( (-\infty , {\tilde{a}}) \times \{0\} \big ) \cap {\text {supp}}({{\tilde{\psi }}}_i ) \subset \partial \varOmega . \end{array} \end{aligned}$$

Using exactly the same arguments as in the proof of Lemma 5.2, we may now transform the situation at \(d_i\) into one of finitely many reference configurations and invoke [19, Theorem 4.3.2.4] as well as (22) to deduce that there exists a constant \(C_i\) independent of h with

$$\begin{aligned} \Vert {\tilde{z}}_i\Vert _{W^{2, (4 - \varepsilon )/3}(\varOmega )} \le C_i \Vert \max (0, u_h - u)^{3}\Vert _{L^{(4 - \varepsilon )/3}(\varOmega )}. \end{aligned}$$

Proceeding exactly along the same lines at the other points \(d_i\) and using the triangle inequality, we arrive at (23). This completes the proof.\(\square \)

By choosing the test function \(v = {\tilde{z}} +u - u_h \in {\tilde{L}}\) in (\(\tilde{D}\)), we now obtain:

Proposition 5.7

Suppose that u solves (S) for some \(f \in L^\infty (\varOmega )\), that \(u_h\) is the solution of (S\(_h\)), and that the conditions (A) and \((A_h)\) are satisfied. Then, for all \(\varepsilon \in (0,1/2)\), there exists a constant \(C>0\) independent of h such that, for all sufficiently small \(h>0\), we have

$$\begin{aligned} \Vert (u-u_h)^-\Vert _{L^{4}(\varOmega )} \le C h^{2 - \varepsilon }. \end{aligned}$$
(25)

Proof

Note that the definition of \({{\tilde{\mathcal {A}}}}_h\) implies \(u - u_h = u \ge 0\) on \({{\tilde{\mathcal {A}}}}_h\). The function \(u - u_h\) is thus an element of \({\tilde{L}}\) and we may choose the function \(v = {\tilde{z}} + u - u_h\) in (\(\tilde{D}\)) to obtain

$$\begin{aligned} \Vert \min (0, u - u_h)\Vert _{L^4(\varOmega )}^4 \le ( {\tilde{z}}, u - u_h)_{H^1(\varOmega )}. \end{aligned}$$
(26)

Since the set \({{\tilde{\mathcal {A}}}}_h\) consists of cells of the boundary mesh, since \({\tilde{z}}\) vanishes on \({{\tilde{\mathcal {A}}}}_h\), and since \({\tilde{z}} \le 0\) holds \(\mathcal {L}^2\)-a.e. in \(\varOmega \), we know that the Lagrange interpolant \(I_h({\tilde{z}})\) vanishes on \({{\tilde{\mathcal {A}}}}_h\) and that \(I_h({\tilde{z}})\) is non-positive everywhere. This implies that, for all small enough \(s>0\), we have \(u_h \pm s I_h({\tilde{z}}) \in K_h\) and, as a consequence, that

$$\begin{aligned} (u_h, I_h({\tilde{z}}))_{H^1(\varOmega )} = (f, I_h({\tilde{z}}))_{L^2(\varOmega )} \quad \text {and} \quad (u , -I_h({\tilde{z}}))_{H^1(\varOmega )} \ge (f, -I_h({\tilde{z}}))_{L^2(\varOmega )}. \end{aligned}$$

Using the above in (26) yields

$$\begin{aligned} \begin{aligned} \Vert \min (0, u - u_h)\Vert _{L^4(\varOmega )}^4&\le ( {\tilde{z}}, u - u_h)_{H^1(\varOmega )} \\&\le ( {\tilde{z}} - I_h({\tilde{z}} ), u - u_h)_{H^1(\varOmega )} \\&\le \Vert {\tilde{z}} - I_h({\tilde{z}})\Vert _{W^{1, \frac{4- \varepsilon }{3}}(\varOmega )} \Vert u_h - u\Vert _{W^{1, \frac{4 - \varepsilon }{1 - \varepsilon }}(\varOmega )} \end{aligned} \end{aligned}$$

for all \(\varepsilon \in (0, 1/2)\). A calculation completely analogous to that at the end of the proof of Proposition 5.4 now yields (25) as claimed.\(\square \)
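For the reader's convenience, we record the elementary exponent arithmetic behind the application of Hölder's inequality in the last step of the above estimate. The symbols p and \(p'\) are introduced here only for illustration and are not part of the notation used elsewhere in the paper:

$$\begin{aligned} p := \frac{4 - \varepsilon }{3}, \quad p' := \frac{4 - \varepsilon }{1 - \varepsilon }, \quad \frac{1}{p} + \frac{1}{p'} = \frac{3}{4 - \varepsilon } + \frac{1 - \varepsilon }{4 - \varepsilon } = 1, \quad p' - 4 = \frac{3 \varepsilon }{1 - \varepsilon } > 0. \end{aligned}$$

The two exponents are thus Hölder conjugates for every \(\varepsilon \in (0,1/2)\), and \(p'\) tends to the critical exponent four from above as \(\varepsilon \searrow 0\).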

6 Summary of results and remarks on the error analysis

In summary, we have now proved the following for problems (S) whose right-hand sides are in \(L^\infty (\varOmega )\) and whose solutions satisfy condition (A):

Theorem 6.1

(Optimal FE-estimates under assumption (A)) Suppose that u solves (S) for some \(f \in L^\infty (\varOmega )\) and that condition (A) is satisfied. Then, for all \(\varepsilon \in (0, 1/2)\), there exists a constant \(C>0\) independent of h with

$$\begin{aligned} \begin{aligned} \begin{array}{ll} \Vert u - u_h\Vert _{W^{1,8/3 - \varepsilon }(\varOmega )} \le C h,&{}\quad \Vert u - u_h\Vert _{W^{1,4 }(\varOmega )} \le C h^{1 - \varepsilon }, \\ \Vert u - u_h\Vert _{W^{1,\infty }(\varOmega )} \le C h^{1/2 - \varepsilon },&{}\quad \Vert u - u_h\Vert _{L^\infty (\varOmega )} \le C h^{3/2 - \varepsilon }, \\ \Vert u - u_h\Vert _{H^{1/2}(\partial \varOmega )} \le C h^{3/2 - \varepsilon },&{}\quad \Vert (u-u_h)^+\Vert _{L^{4}(\varOmega )} \le C h^{2 - \varepsilon }. \end{array} \end{aligned} \end{aligned}$$
(27)

If, additionally, the approximate solutions \(u_h\) satisfy \((A_h)\), then we also have

$$\begin{aligned} \Vert (u-u_h)^- \Vert _{L^{4}(\varOmega )} \le C h^{2 - \varepsilon }\quad \forall \varepsilon \in (0, 1/2). \end{aligned}$$

Proof

Combine Theorems 2.3 and 4.3 and Propositions 5.4 and 5.7. \(\square \)

Some remarks are in order regarding the last result:

Remark 6.2

  1. The error estimates in Theorem 6.1 are optimal in view of the \(W^{2,p}\)- and \(H^s\)-regularity properties of the exact solution u, cf. Remark 4.4.

  2. Recall that condition (A) in Theorem 6.1 may be replaced with the regularity assumption in Theorem 4.3 if only the \(W^{1,8/3 - \varepsilon }(\varOmega )\)-, \(W^{1,4 }(\varOmega )\)-, \(W^{1,\infty }(\varOmega )\)-, \(L^\infty (\varOmega )\)-, and \(H^{1/2}(\partial \varOmega )\)-estimates are considered, see Sect. 4, and that, for right-hand sides f in \(L^p(\varOmega )\), we have the results in Corollaries 4.1 and 4.2.

  3. Note that one of the crucial steps in the proofs of Propositions 5.4 and 5.7 is to use the \(W^{1,4}\)-error estimate in Theorem 4.3 to compensate for the lack of regularity of the dual solutions in Sects. 5.1 and 5.2. It is quite remarkable here that the exponent in the obtained \(W^{1,p}\)-error estimate (namely, \(p=4\)) and the exponent in the \(W^{2,p}\)-regularity results for (D) and (\(\tilde{D}\)) (namely, \(p=4/3 - \varepsilon \)) are (up to the \(\varepsilon \)) Hölder conjugates of each other and thus fit together perfectly. Even more surprisingly, \(4/3 + \varepsilon \) is also the difference of the exponents in the two \(W^{1,p}\)-error estimates in the first line of (27). This omnipresence of the exponents 4 and 4/3 indicates that the \(L^4\)- and the \(W^{1,4}\)-norm are a natural choice for the finite element error analysis of the problem (S).

  4. We expect that it is possible to relax the (certainly not optimal) assumption \((A_h)\) in Proposition 5.7 and Theorem 6.1 by studying in more detail how the constant in [19, Theorem 4.3.2.4] depends on the boundary conditions of the considered problem. Note that the difficult part in the proof of Proposition 5.7 is to obtain the uniform bound on the \(W^{2, (4-\varepsilon )/3}\)-norm in (23). Showing that the dual solution possesses \(W^{2, (4-\varepsilon )/3}\)-regularity is relatively simple.

  5. Recall that, for a classical obstacle problem with an essentially bounded right-hand side, it can be shown that the exact solution enjoys \(W^{2,p}\)-regularity for all \(2 \le p < \infty \), cf. [11, 25, 28]. This implies in particular that it is possible to prove \(L^\infty \)-error estimates of order \(\mathcal {O}(h^{2-\varepsilon })\) for arbitrarily small \(\varepsilon > 0\). For the Signorini problem, this is different since the exact solution u cannot be expected to be in \(W^{2,4}(\varOmega )\) even for smooth right-hand sides f. The \(L^4\)-error estimates in Theorem 6.1 thus yield an order of convergence that cannot be recovered with pointwise a priori error estimates.

  6. Note that Theorem 6.1 shows that a counterexample, which demonstrates that the \(L^4\)-error \(\Vert u - u_h\Vert _{L^4(\varOmega )}\) is in general not of order \(\mathcal {O}(h^{2-\varepsilon })\), has to be very exotic (if it exists) since the contact sets of u and \(u_h\) have to exhibit a very degenerate behavior for the conditions (A) and \((A_h)\) to be violated.

7 Numerical experiments

We conclude this paper with numerical experiments that confirm our theoretical findings. To construct a model problem that allows us to validate our results and that possesses a known analytic solution, we proceed along the lines of [41, Section 7] and consider the function \({\tilde{u}}: \mathbb {R}\times (0, \infty ) \rightarrow \mathbb {R}\), \(x \mapsto -r^{3/2} \sin (\frac{3}{2} \theta )\). Here, r and \(\theta \) denote polar coordinates centered at (0.5, 0), i.e.,

$$\begin{aligned} r(x_1,x_2) := \left( (x_1 - 0.5)^2 + x_2^2 \right) ^{1/2} \quad \text {and} \quad \theta (x_1, x_2) := \arccos \left( \frac{x_1 - 0.5}{r}\right) . \end{aligned}$$

Note that the function \({\tilde{u}}\) is exactly of the same type as the singular terms on the right-hand side of (6), cf. also the analysis in [19, 20]. Moreover, it is easy to check that \({\tilde{u}}\) is an element of \(H^2(U)\) for all bounded, open \(U \subset \mathbb {R}\times (0, \infty )\), that \({\tilde{u}} = 0\) and \(\partial _n {\tilde{u}} \ge 0\) hold on \((0.5, \infty ) \times \{0\}\), that \({\tilde{u}} \ge 0\) and \(\partial _n {\tilde{u}} = 0\) hold on \((-\infty , 0.5) \times \{0\}\), and that \(\varDelta {\tilde{u}} \) vanishes \(\mathcal {L}^2\)-a.e. in \(\mathbb {R}\times (0, \infty )\). Suppose now that \(\psi : [0, \infty ) \rightarrow \mathbb {R}\) is a \(C^4\)-function satisfying \(\psi \equiv 0\) in \([0.45, \infty )\), \(\psi > 0\) in \([0, 0.45)\), and

$$\begin{aligned} \psi (0) = 1, \quad \psi '(0) =\cdots = \psi ''''(0) = 0. \end{aligned}$$

(In the experiments below, this \(\psi \) was an appropriately defined ninth-order spline.) Then, the properties of \({\tilde{u}}\) and \(\psi \) yield that the map

$$\begin{aligned} u : \varOmega \rightarrow \mathbb {R},\quad x \mapsto 10\, \psi (r) {\tilde{u}} (r, \theta ), \end{aligned}$$
(28)

satisfies

$$\begin{aligned} \begin{aligned} \begin{array}{ll} -\varDelta u + u \in C({\text {cl}}(\varOmega )),&{} \\ u = 0 \text { and } \partial _n u \ge 0&{}\quad \text { on } \partial \varOmega {\setminus } (0.05, 0.5) \times \{0\}, \\ u \ge 0 \text { and } \partial _n u = 0 &{}\quad \text { on } (0.05, 0.5) \times \{0\}, \end{array} \end{aligned} \end{aligned}$$
(29)

where \(\varOmega \) again denotes the unit square \((0,1)^2\) and where we have added the factor ten for scaling reasons. Note that the conditions in (29) imply in particular that the function u solves (S) with right-hand side \(f := -\varDelta u + u \in C({\text {cl}}(\varOmega ))\). What we have constructed in (28) is thus indeed an analytic solution of Signorini’s problem that can be used as a reference in our numerical experiments, cf. Fig. 2.
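To make the construction in (28) concrete, the following minimal Python sketch evaluates u pointwise. The exact spline used for \(\psi \) in the experiments is not specified here, so the cut-off below is an assumption: a degree-nine polynomial on [0, 0.45] (one minus the fourth-order smoothstep polynomial), which satisfies all conditions imposed on \(\psi \) above. The names `psi`, `u_exact`, and `R_CUT` are hypothetical.

```python
import math

R_CUT = 0.45  # psi vanishes identically for r >= R_CUT


def psi(r):
    """Assumed cut-off: psi(0) = 1, psi'(0) = ... = psi''''(0) = 0,
    psi > 0 on [0, R_CUT), psi = 0 on [R_CUT, inf), C^4 overall.
    This is one admissible choice, not necessarily the authors' spline."""
    if r >= R_CUT:
        return 0.0
    t = r / R_CUT
    # Fourth-order smoothstep: 126 t^5 - 420 t^6 + 540 t^7 - 315 t^8 + 70 t^9
    s = t**5 * (126.0 + t * (-420.0 + t * (540.0 + t * (-315.0 + 70.0 * t))))
    return 1.0 - s


def u_exact(x1, x2):
    """Evaluate u = 10 * psi(r) * u_tilde(r, theta) from (28) for x2 >= 0."""
    dx = x1 - 0.5
    r = math.hypot(dx, x2)
    if r == 0.0:
        return 0.0  # u_tilde = -r^{3/2} sin(3 theta / 2) vanishes at r = 0
    # Clamp the argument to guard against rounding slightly outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, dx / r)))
    u_tilde = -(r**1.5) * math.sin(1.5 * theta)
    return 10.0 * psi(r) * u_tilde


# Sample the bottom edge of the unit square, where the contact conditions live.
bottom_edge = [u_exact(k / 100.0, 0.0) for k in range(101)]
```

On the bottom edge, the sampled values are zero for \(x_1 \ge 0.5\) and for \(x_1 \le 0.05\), and positive in between, in accordance with (29).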

Fig. 2 Solution u (left) and right-hand side f (right) in the situation of Sect. 7

Table 1 Absolute error \(u - u_h\) between the analytic solution u in (28) and the finite element approximation \(u_h\) characterized by (S\(_h\)) for different widths h in various norms
Table 2 Experimental orders of convergence (EOCs) for different widths h w.r.t. various norms, see (30)

The results that we have obtained for the right-hand side f associated with the solution u in (28) by means of the finite element scheme described in Sect. 2.3 can be seen in Tables 1 and 2. Here, we have used Friedrichs-Keller triangulations to discretize the continuous problem (S) and a 16-point Gauss-Legendre-type quadrature rule for triangles to evaluate the various \(W^{s,p}\)-errors and the integrals arising on the right-hand side of (S\(_h\)). The finite-dimensional elliptic variational inequalities obtained from the discretization scheme have been solved to a high precision by means of the Matlab R2019b routine \( \texttt {quadprog}\). Note that the choice of mesh widths in our numerical experiments ensures that the point (0.5, 0), where the analytic solution u detaches from the boundary \(\partial \varOmega \) and where \(\nabla ^2 u\) possesses a singularity, never coincides with a node of the underlying mesh. This constitutes the worst-case scenario as the critical part of the contact set of u is never resolved properly. Further, it should be noted that the condition (A) is trivially satisfied in (28) by the properties of the analytic solution u. Our numerical experiments indicate that the same is true for the condition \((A_h)\) as it can be observed that the contact sets of the finite element solutions \(u_h\) approximate their continuous counterpart, cf. the comments after Definition 5.5.
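For readers without access to a quadratic programming routine, the following Python sketch illustrates how a finite-dimensional variational inequality of this type can be solved in principle. It is not the solver used in our experiments: it swaps in a plain projected Gauss-Seidel iteration, and it assumes that the discrete problem reads "find a coefficient vector u with \(u_i \ge 0\) for the constrained indices such that \((Au - b)^\top (v - u) \ge 0\) for all admissible v" with a symmetric positive definite matrix A. The function name, its parameters, and the small test matrix are hypothetical.

```python
def projected_gauss_seidel(A, b, constrained, tol=1e-12, max_iter=10_000):
    """Projected Gauss-Seidel for: minimize 0.5 u^T A u - b^T u
    subject to u[i] >= 0 for all i in `constrained`.
    A is assumed symmetric positive definite (list of lists)."""
    n = len(b)
    u = [0.0] * n
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(n):
            # Solve the i-th equation for u[i] with all other entries frozen.
            off_diagonal = sum(A[i][j] * u[j] for j in range(n) if j != i)
            candidate = (b[i] - off_diagonal) / A[i][i]
            # Project onto the admissible set for constrained indices.
            if i in constrained:
                candidate = max(0.0, candidate)
            largest_change = max(largest_change, abs(candidate - u[i]))
            u[i] = candidate
        if largest_change < tol:
            break
    return u


# Toy data (not related to the experiments above): index 0 is constrained.
A = [[2.0, -1.0], [-1.0, 2.0]]
b = [-1.0, 1.0]
u = projected_gauss_seidel(A, b, constrained={0})
```

In this toy example, the unconstrained solution \((-1/3, 1/3)\) violates the sign constraint, and the iteration returns the constrained solution \((0, 1/2)\). For realistic mesh sizes, a dedicated quadratic programming solver is preferable since the Gauss-Seidel iteration converges slowly.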

As the results in Tables 1 and 2 illustrate, the behavior observed in our numerical experiments agrees very well with the predictions in Theorem 6.1. (Note that this result is indeed applicable here since \(f \in C({\text {cl}}(\varOmega ))\).) In particular, the experimental orders of convergence (EOCs), i.e., the quantities

$$\begin{aligned} (\text {EOC})_{h_k,\Vert \cdot \Vert _*} := \frac{\log \Vert u - u_{h_{k}}\Vert _* - \log \Vert u - u_{h_{k-1}}\Vert _* }{\log h_{k } - \log h_{k-1}}, \end{aligned}$$
(30)

match the a priori error estimates in (27) very well. Table 2 further shows that the rates of convergence in the \(L^p\)- and the \(W^{1,p}\)-norms break down in the situation of (28) when p is greater than the critical exponent four. This demonstrates that, for instance, the order one (minus epsilon) is in general unobtainable when we consider the \(W^{1,p}\)-error for some \(p>4\).
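The quantity in (30) is straightforward to evaluate. The following Python sketch shows one way to do so; the function name `experimental_orders` is hypothetical, and the data are synthetic (they are not taken from Tables 1 and 2) and merely chosen such that the errors behave exactly like \(3 h^2\).

```python
import math


def experimental_orders(hs, errors):
    """Return the EOCs from (30): for k >= 1,
    (log e_k - log e_{k-1}) / (log h_k - log h_{k-1})."""
    if len(hs) != len(errors):
        raise ValueError("hs and errors must have the same length")
    eocs = []
    for k in range(1, len(hs)):
        numerator = math.log(errors[k]) - math.log(errors[k - 1])
        denominator = math.log(hs[k]) - math.log(hs[k - 1])
        eocs.append(numerator / denominator)
    return eocs


# Synthetic example: errors of the form 3 * h^2 on a sequence of halved widths.
hs = [2.0 ** (-k) for k in range(2, 7)]
errors = [3.0 * h**2 for h in hs]
eocs = experimental_orders(hs, errors)
```

For these synthetic data, every entry of `eocs` equals two up to rounding, independently of the constant factor three. This is the reason why EOCs are insensitive to the constants C in (27).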