1 Introduction

This paper is concerned with the parabolic regularity for a weak solution to the \((1,\,p)\)-Laplace equation

$$\begin{aligned} \partial _{t}u-\Delta _{1}u-\Delta _{p}u=0\quad \text {in}\quad \Omega _{T}:=\Omega \times (0,\,T), \end{aligned}$$
(1.1)

where \(\Omega \subset {{\mathbb {R}}}^{n}\) is a bounded Lipschitz domain, and \(T\in (0,\,\infty )\) is a fixed constant. For an unknown function \(u=u(x_{1},\,\dots ,\,x_{n},\,t)\), the time derivative and the spatial gradient of u are respectively denoted by \(\partial _{t}u\) and \(\nabla u=(\partial _{x_{j}}u)_{j=1,\,\dots \,,\,n}\). The divergence operators \(\Delta _{1}\) and \(\Delta _{p}\) are the one-Laplacian and the p-Laplacian, defined as

$$\begin{aligned}\Delta _{s}u:=\mathop {\textrm{div}}\left( |\nabla u|^{s-2}\nabla u \right) \quad \text {for}\quad s\in [1,\,\infty ).\end{aligned}$$

In this paper, the space dimension n and the exponent p are assumed to be

$$\begin{aligned} n\ge 3\quad \text {and}\quad 1<p\le \frac{2n}{n+2}. \end{aligned}$$
(1.2)

We aim to prove that \(\nabla u\) is continuous in \(\Omega _{T}\), provided that the unknown function \(u:\Omega _{T}\rightarrow {{\mathbb {R}}}\) satisfies

$$\begin{aligned} u\in L_{\textrm{loc}}^{s}(\Omega _{T})\quad \text {with}\quad s>s_{\textrm{c}}:=\frac{n(2-p)}{p}. \end{aligned}$$
(1.3)

The higher integrability assumption (1.3) is optimal, since otherwise any improved regularity is in general not expected for the parabolic p-Laplace equation for \(p\in (1,\, \frac{2n}{n+2}]\).
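
As a quick consistency check on the threshold in (1.3) (a worked computation added for orientation, using only the definitions above), one can evaluate \(s_{\textrm{c}}\) at the critical exponent \(p=\frac{2n}{n+2}\):

$$\begin{aligned} 2-p=\frac{4}{n+2},\qquad s_{\textrm{c}}=\frac{n(2-p)}{p}=\frac{4n}{n+2}\cdot \frac{n+2}{2n}=2. \end{aligned}$$

Since \(s_{\textrm{c}}=n(2/p-1)\) is decreasing in p, the range (1.2) gives \(2\le s_{\textrm{c}}<n\), with \(s_{\textrm{c}}\rightarrow n\) as \(p\rightarrow 1\). For instance, \(n=3\) and \(p=1.1\) give \(s_{\textrm{c}}=3\cdot 0.9/1.1\approx 2.45\), so in that case (1.3) asks for more than square integrability of u.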

In [28], the author has recently shown the same regularity result for

$$\begin{aligned} \partial _{t}u-\Delta _{1}u-\Delta _{p}u=f\quad \text {in}\quad \Omega _{T}, \end{aligned}$$
(1.4)

where n and p satisfy the supercritical case

$$\begin{aligned} n\ge 2\quad \text {and}\quad \frac{2n}{n+2}<p<\infty , \end{aligned}$$
(1.5)

and the external force term \(f\in L^{r}(\Omega _{T})\) is given with a suitably large exponent r. In contrast to [28], this paper, which deals with the subcritical case (1.2), requires \(f\equiv 0\) for a technical reason (see Sect. 1.2).

1.1 Truncation approach

This subsection outlines the basic strategy of [28] for showing \(\nabla u\in C^{0}(\Omega _{T};\,{{\mathbb {R}}}^{n})\). More detailed explanations are given in Sect. 6.1.

The main difficulty arises from the fact that the uniform parabolicity of \(-\Delta _{1}-\Delta _{p}\) breaks as a gradient vanishes. To explain this, we formally differentiate (1.1) by the space variable \(x_{j}\). The resulting equation is

$$\begin{aligned} \partial _{t}\partial _{x_{j}}u-\mathop {\textrm{div}}(\nabla ^{2} E(\nabla u)\nabla \partial _{x_{j}}u)=0, \end{aligned}$$
(1.6)

where \(E(z)=|z|+|z|^{p}/p\,(z\in {{\mathbb {R}}}^{n})\) is the energy density. The coefficient matrix \(\nabla ^{2}E(\nabla u)\) loses its uniform ellipticity on the facet \(\{\nabla u=0\}\), in the sense that the ratio

$$\begin{aligned} \frac{\text {(the largest eigenvalue of}\,\, \nabla ^{2} E(\nabla u))}{\text {(the smallest eigenvalue of}\,\, \nabla ^{2} E(\nabla u))}=C_{p}\left( 1+|\nabla u|^{1-p}\right) \end{aligned}$$
(1.7)

blows up as \(\nabla u\rightarrow 0\). In this sense, (1.1) is not everywhere uniformly parabolic, which makes it difficult to deduce quantitative continuity estimates for \(\nabla u\), especially on the facet. However, the ratio above remains bounded as long as the gradient stays away from zero. Hence, we introduce a truncated spatial gradient

$$\begin{aligned} {{\mathcal {G}}}_{\delta }(\nabla u):=(|\nabla u|-\delta )_{+}\frac{\nabla u}{|\nabla u|}, \end{aligned}$$

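where \(\delta \in (0,\,1)\) denotes the truncation parameter, and \(a_{+}:=\max \{\,a,\,0\,\}\equiv a\vee 0\) for \(a\in {{\mathbb {R}}}\). The main purpose in [28] is to show that the truncated gradient \({{\mathcal {G}}}_{\delta }(\nabla u)\) is locally continuous for each fixed \(\delta \), with a continuity estimate that may depend on \(\delta \). The continuity of \(\nabla u\) then follows, because \({{\mathcal {G}}}_{\delta }\) uniformly approximates the identity mapping as \(\delta \rightarrow 0\). In this qualitative way, we complete the proof of the gradient continuity. This truncation approach can be found in the recent study of elliptic regularity for the second-order \((1,\,p)\)-Laplace problem ([26, 27]) and a second-order degenerate problem ([3, 7]; see also [22] for a weaker result).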
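
The role of the truncation can be made concrete by two elementary observations, sketched here under the convention \({{\mathcal {G}}}_{\delta }(0):=0\) (an assumption made for this illustration). First, \({{\mathcal {G}}}_{\delta }\) moves each vector by at most \(\delta \):

$$\begin{aligned} \left| {{\mathcal {G}}}_{\delta }(z)-z\right| =|z|-(|z|-\delta )_{+}=|z|\wedge \delta \le \delta \quad \text {for all }z\in {{\mathbb {R}}}^{n}. \end{aligned}$$

Second, \({{\mathcal {G}}}_{\delta }(\nabla u)\) vanishes unless \(|\nabla u|>\delta \), and on that set the ratio in (1.7) is controlled in terms of \(\delta \) alone, since \(1-p<0\):

$$\begin{aligned} C_{p}\left( 1+|\nabla u|^{1-p}\right) \le C_{p}\left( 1+\delta ^{1-p}\right) \quad \text {on}\quad \{|\nabla u|>\delta \}. \end{aligned}$$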

To achieve our goal rigorously, we have to approximate (1.1). Here we should note that (1.1) is not uniformly parabolic, especially on the facet. This prevents us from applying a standard difference quotient method, and hence it seems difficult to treat (1.6) in \(L^{2}(0,\,T;\,W^{-1,\,2}(\Omega ))\). For this reason, we have to consider a parabolic approximate equation that is uniformly parabolic, depending on the approximation parameter \(\varepsilon \in (0,\,1)\). In this paper, we relax the energy density \(E(z)=|z|+|z|^{p}/p\) by convolving it with the Friedrichs mollifier \(\rho _{\varepsilon }\) (see [17] as a related item). Therefore, we consider an approximate equation of the form

$$\begin{aligned} \partial _{t}u_{\varepsilon }-\mathop {\textrm{div}}(\nabla E^{\varepsilon } (\nabla u_{\varepsilon }))=0\quad \text {with}\quad E^{\varepsilon } :=\rho _{\varepsilon }*E\in C^{\infty }({{\mathbb {R}}}^{n}). \end{aligned}$$

The proof is completed by showing the \(L^{p}\)-strong convergence of a gradient, and the local Hölder continuity of

$$\begin{aligned} {{\mathcal {G}}}_{2\delta ,\,\varepsilon }(\nabla u_{\varepsilon }):=\left( \sqrt{\varepsilon ^{2}+|\nabla u_{\varepsilon }|^{2}}-2\delta \right) _{+}\frac{\nabla u_{\varepsilon }}{|\nabla u_{\varepsilon }|}, \end{aligned}$$

whose continuity estimate may depend on \(\delta \in (0,\,1)\) but is independent of \(\varepsilon \in (0,\,\delta /8)\).

The detailed computations of the Hölder gradient estimates of \({{\mathcal {G}}}_{2\delta ,\,\varepsilon }(\nabla u_{\varepsilon })\) are already given in [28, Theorem 2.8] for \(p\in (1,\,\infty )\), provided that \(\nabla u_{\varepsilon }\) is uniformly bounded with respect to \(\varepsilon \in (0,\,\delta /8)\) (see Sect. 6). Therefore, it suffices to prove local gradient bounds of \(u_{\varepsilon }\), which is the main purpose of this paper and is carried out in the following three steps. Firstly, we show local \(L^{\infty }\)-bounds of u by Moser’s iteration, where (1.3) is used (see also [6, Theorem 2] and [12, Chapter 8, A.2]). Secondly, we verify a comparison principle and a weak maximum principle for \(u_{\varepsilon }\) under some Dirichlet boundary conditions. For this topic, we refer the reader to [4, Chapter 4] and [21, Chapter 3], which provide comparison principles for weak solutions. Finally, we prove \(\nabla u_{\varepsilon }\in L_{\textrm{loc}}^{q}\) for any \(q\in (p,\,\infty ]\) from \(u_{\varepsilon }\in L_{\textrm{loc}}^{\infty }\) and Moser’s iteration. The recent item [4, Chapter 9] gives a similar result of gradient bounds for parabolic p-Laplace equations with \(p\in (1,\,2)\). The main difference between this paper and [4, Chapter 9] is that we carefully choose test functions that are always supported in a certain non-degenerate region of \(u_{\varepsilon }\).

1.2 Literature overview

We briefly mention some literature on the parabolic p-Laplace equation, and the source of (1.1). Also, we would like to compare this paper with the author’s recent paper [28].

For the parabolic p-Laplace equation

$$\begin{aligned} \partial _{t}u-\Delta _{p}u=0, \end{aligned}$$
(1.8)

the existence and the regularity of a weak solution u are well-established. The existence theory is found in the monographs [20, 23], based on the Faedo–Galerkin method and the monotonicity of \(\Delta _{p}\). There, (1.8) is treated in \(L^{p^{\prime }}(0,\,T;\,W^{-1,\,p^{\prime }}(\Omega ))\) when \(\frac{2n}{n+2}<p<\infty \), and in \(L^{p^{\prime }}(0,\,T;\,W^{-1,\,p^{\prime }}(\Omega )+L^{2}(\Omega ))\) when \(1<p\le \frac{2n}{n+2}\), where \(p^{\prime }:=p/(p-1)\) denotes the Hölder conjugate exponent of p. The Hölder gradient continuity of u was proved by DiBenedetto–Friedman [10, 11] in 1985 for the supercritical range \(p\in (\frac{2n}{n+2},\,\infty )\) (see also [1, 9, 29] for weaker results). Later in 1991, Choe [6] proved the same regularity result for \(p\in (1,\,\infty )\), under the assumption that u is in \(L_{\textrm{loc}}^{s}(\Omega _{T})\) with the exponent \(s\in (1,\,\infty )\) satisfying \(n(p-2)+sp>0\). In particular, [6] covers the subcritical case (1.2) with the higher integrability assumption (1.3). It is worth mentioning that without (1.3), no improved regularity result is expected even for the p-Laplace problem (see [13]). In these fundamental works, careful scaling arguments in space and time are used, so that the Hölder gradient continuity estimates are quantitatively deduced. This is often called the intrinsic scaling method, which plays an important role when showing various regularity properties for (1.8). As related materials, see the monographs [8, 12], and the recent paper [5].

Sources of the \((1,\,p)\)-Laplace problem can be found in the fields of fluid mechanics for \(p=2\) [14, Chapter VI], and materials science for \(p=3\) [24]. Among them, the second-order parabolic equation (1.4) can be found when modeling the motion of the Bingham fluid, a non-Newtonian fluid that has both plastic and viscous properties. In this model, the one-Laplacian \(\Delta _{1}\) reflects the plasticity of a fluid, while the Laplacian \(\Delta =\Delta _{2}\) reflects the viscosity. As explained in [28, §1.3] (see also [14, Chapter VI]), (1.4) arises when one considers the unknown three-dimensional vector field \(U=(0,\,0,\,u(t,\,x_{1},\,x_{2}))\) denoting the velocity of a Bingham fluid in a pipe cylinder \(\Omega \times {{\mathbb {R}}}\subset {{\mathbb {R}}}^{2}\times {{\mathbb {R}}}\). There, the external force term \(f=-\partial _{x_{3}}\pi \), where \(\pi \) denotes the pressure function, depends at most on t. Mathematical analysis for the \((1,\,p)\)-Laplace equation (1.1) goes back at least to [14], where methods based on variational inequalities are used. However, the continuity of a spatial gradient for (1.1) has not been well-established, even for \(p=2\).

Motivated by the Bingham fluid model, in [28], the author has recently shown the gradient continuity for (1.4). More precisely, [28] treats the case where the conditions (1.5) and \(f\in L^{r}(\Omega _{T})\) are satisfied with

$$\begin{aligned} \frac{1}{p}+\frac{1}{r}\le 1\quad \text {and}\quad n+2<r\le \infty . \end{aligned}$$

By the former assumption, the continuous inclusion \(L^{r}(\Omega _{T})\hookrightarrow L^{p^{\prime }}(0,\,T;\,W^{-1,\,p^{\prime }}(\Omega ))\) holds. This inclusion plays an important role in constructing the solution u, and it appears that the former condition cannot be removed when showing convergence for (1.9). It is worth noting that the latter assumption is optimal when one considers the gradient continuity for parabolic p-Laplace equations with external force terms [8, Chapters VIII–IX]. For the approximation of (1.4), the following equation is considered in [28, §2]:

$$\begin{aligned} \partial _{t}u_{\varepsilon }-\mathop {\textrm{div}}(\nabla E^{\varepsilon } (\nabla u_{\varepsilon }))=f_{\varepsilon }\quad \text {in}\quad \Omega _{T}, \end{aligned}$$
(1.9)

where \(f_{\varepsilon }\in C^{\infty }(\Omega _{T})\) weakly converges to f in \(L^{r}(\Omega _{T})\). In the supercritical case (1.5), the compact embedding \(V_{0}:=W_{0}^{1,\,p}(\Omega )\hookrightarrow \hookrightarrow L^{2}(\Omega )\) and the continuous inclusion \(L^{2}(\Omega )\hookrightarrow W^{-1,\,p^{\prime }}(\Omega )=:V_{0}^{\prime }\) hold. In particular, we are allowed to use the parabolic compact embedding

$$\begin{aligned} \left\{ u\in L^{p}(0,\,T;\,V_{0})\,\,\bigg |\,\, \partial _{t} u\in L^{p^{\prime }}(0,\,T;\,V_{0}^{\prime }) \right\} \hookrightarrow \hookrightarrow L^{p}(0,\,T;\,L^{2}(\Omega )) \end{aligned}$$
(1.10)

by the Aubin–Lions lemma [23, Chapter III, Proposition 1.3]. The strong convergence result for weak solutions to (1.9) is shown in [28, Proposition 2.4], where the compact embedding (1.10) is used to deal with non-trivial external force terms.

[…] \(l=1\), these function spaces are denoted by \(C^{m}(U)\) and \(C^{0}(U)\) for short. For a closed interval \(I\subset {{\mathbb {R}}}\), the symbol \(C^{0}(I;\,L^{2}(\Omega ))\) stands for the set of all \(L^{2}(\Omega )\)-valued functions in I that are strongly continuous.

1.4 Main result and outline of the paper

In this paper, we consider a generalized equation of the form

$$\begin{aligned} \partial _{t}u-\mathop {\textrm{div}}\left( \nabla E(\nabla u)\right) =0 \end{aligned}$$
(1.11)

in \(\Omega _{T}\) with \(E=E_{1}+E_{p}\), where \(E_{1}\) and \(E_{p}\) are convex mappings from \({{\mathbb {R}}}^{n}\) to \({{\mathbb {R}}}_{\ge 0}\). For the smoothness of these densities, we require \(E_{1}\in C^{0}({{\mathbb {R}}}^{n})\cap C^{2}({{\mathbb {R}}}^{n}{\setminus } \{0\})\) and \(E_{p}\in C^{1}({{\mathbb {R}}}^{n})\cap C^{2}({{\mathbb {R}}}^{n}{\setminus } \{ 0\})\). The density \(E_{p}\) admits the constants \(0<\lambda _{0}\le \Lambda _{0}<\infty \) satisfying

$$\begin{aligned} |\nabla E_{p}(z)|\le \Lambda _{0}|z|^{p-1} \end{aligned}$$
(1.12)

for all \(z\in {{\mathbb {R}}}^{n}\), and

$$\begin{aligned} \lambda _{0}|z|^{p-2}\textrm{id}_{n} \leqslant \nabla ^{2} E_{p}(z)\leqslant \Lambda _{0}|z|^{p-2}\textrm{id}_{n} \end{aligned}$$
(1.13)

for all \(z\in {{\mathbb {R}}}^{n}\setminus \{ 0\}\). We assume that \(E_{1}\) is positively one-homogeneous. More precisely, \(E_{1}\) satisfies

$$\begin{aligned} E_{1}(k z)=kE_{1}(z) \end{aligned}$$
(1.14)

for all \(z\in {{\mathbb {R}}}^{n}\), \(k\in {{\mathbb {R}}}_{>0}\). For the continuity of the Hessian matrices of \(E_{p}\), we assume that there exists a concave, non-decreasing function \(\omega _{p}:{{\mathbb {R}}}_{\ge 0}\rightarrow {{\mathbb {R}}}_{\ge 0}\) with \(\omega _{p}(0)=0\), such that

$$\begin{aligned} \left\Vert \nabla ^{2}E_{p}(z_{1})-\nabla ^{2}E_{p}(z_{2}) \right\Vert \le C_{\delta ,\,M}\omega _{p}(|z_{1}-z_{2}|/\mu ) \end{aligned}$$
(1.15)

holds for all \(z_{1}\), \(z_{2}\in {{\mathbb {R}}}^{n}\) with \(\mu /32\le |z_{j}|\le 3\mu \) for \(j\in \{\,1,\,2\,\}\), and \(\mu \in (\delta ,\,M-\delta )\). Here \(\delta \) and M are fixed constants such that \(0<2\delta<M<\infty \), and the constant \(C_{\delta ,\,M}\in {{\mathbb {R}}}_{>0}\) depends on \(\delta \) and M. For \(E_{1}\), we require the existence of a concave, non-decreasing function \(\omega _{1}:{{\mathbb {R}}}_{\ge 0}\rightarrow {{\mathbb {R}}}_{\ge 0}\) with \(\omega _{1}(0)=0\), such that

$$\begin{aligned} \left\Vert \nabla ^{2}E_{1}(z_{1})-\nabla ^{2}E_{1}(z_{2}) \right\Vert \le \omega _{1}(|z_{1}-z_{2}|) \end{aligned}$$
(1.16)

holds for all \(z_{1}\), \(z_{2}\in {{\mathbb {R}}}^{n}\) with \(1/32\le |z_{j}|\le 3\) for \(j\in \{\,1,\,2\,\}\). Although the assumptions (1.15)–(1.16) are not used in showing local gradient bounds, they are needed in the proof of a priori Hölder estimates of truncated gradients. Since this paper mainly aims to show local gradient bounds, (1.15)–(1.16) are not explicitly used, except in the final Sect. 6.
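
For orientation, the model densities \(E_{1}(z)=|z|\) and \(E_{p}(z)=|z|^{p}/p\) from (1.1) can be checked against (1.12)–(1.14) directly. The following is a sketch for \(p\in (1,\,2]\), where the notation \(\nu :=z/|z|\) for \(z\ne 0\) is introduced only for this computation:

$$\begin{aligned} \nabla E_{p}(z)=|z|^{p-2}z,\qquad \nabla ^{2}E_{p}(z)=|z|^{p-2}\left( \textrm{id}_{n}+(p-2)\nu \otimes \nu \right) . \end{aligned}$$

The eigenvalues of \(\nabla ^{2}E_{p}(z)\) are \((p-1)|z|^{p-2}\) in the direction \(\nu \) and \(|z|^{p-2}\) in the directions orthogonal to \(\nu \). Hence (1.12)–(1.13) hold with \(\lambda _{0}=p-1\) and \(\Lambda _{0}=1\), while (1.14) is immediate from \(|kz|=k|z|\) for \(k>0\).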

To define a weak solution to (1.11), we introduce standard function spaces. For \(p\in (1,\,\frac{2n}{n+2}]\), we set

$$\begin{aligned} V_{0}:=W_{0}^{1,\,p}(\Omega )\cap L^{2}(\Omega ) \end{aligned}$$

equipped with the norm

$$\begin{aligned} \Vert v\Vert _{V_{0}}:=\Vert \nabla v \Vert _{L^{p}(\Omega )}+\Vert v\Vert _{L^{2}(\Omega )} \end{aligned}$$

for \(v\in V_{0}\). Then, the continuous dual space of \(V_{0}\) is \(V_{0}^{\prime }=W^{-1,\,p^{\prime }}(\Omega )+L^{2}(\Omega ).\)

We set the parabolic function spaces

$$\begin{aligned}\begin{array}{rcl} X^{p}(0,\,T;\,\Omega )&{}:=&{} \left\{ u\in L^{p}(0,\,T;\,V)\,\,\bigg |\,\, \partial _{t} u\in L^{p^{\prime }}(0,\,T;\,V_{0}^{\prime }) \right\} ,\\ X_{0}^{p}(0,\,T;\,\Omega )&{}:=&{} \left\{ u\in L^{p}(0,\,T;\,V_{0})\,\,\bigg |\,\, \partial _{t} u\in L^{p^{\prime }}(0,\,T;\,V_{0}^{\prime }) \right\} , \end{array} \end{aligned}$$

where \(V:=W^{1,\,p}(\Omega )\cap L^{2}(\Omega )\). From the Gelfand triple \(V_{0}\hookrightarrow L^{2}(\Omega )\hookrightarrow V_{0}^{\prime }\), the inclusion \(X_{0}^{p}(0,\,T;\,\Omega )\subset C^{0}([0,\,T];\,L^{2}(\Omega ))\) follows by the Lions–Magenes lemma [23, Chapter III, Proposition 1.2].
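
The intersection with \(L^{2}(\Omega )\) in the definition of \(V_{0}\) reflects the Sobolev exponent; the following standard computation is recalled as a side remark. For \(p<n\), the space \(W_{0}^{1,\,p}(\Omega )\) embeds into \(L^{p^{*}}(\Omega )\) with \(p^{*}:=np/(n-p)\), and

$$\begin{aligned} p^{*}\ge 2\iff np\ge 2(n-p)\iff p\ge \frac{2n}{n+2}. \end{aligned}$$

Thus, for \(p<\frac{2n}{n+2}\), the space \(W_{0}^{1,\,p}(\Omega )\) is not contained in \(L^{2}(\Omega )\), and square integrability has to be imposed separately to obtain the Gelfand triple above.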

Definition 1.1

A function \(u\in X^{p}(0,\,T;\,\Omega )\cap C^{0}([0,\,T];\,L^{2}(\Omega ))\) is called a weak solution to (1.11) when there exists \(Z\in L^{\infty }(\Omega _{T};\,{{\mathbb {R}}}^{n})\) such that

$$\begin{aligned} Z(x,\,t)\in \partial E_{1}(\nabla u(x,\,t))\quad \text {for a.e.~}(x,\,t)\in \Omega _{T}, \end{aligned}$$
(1.17)

and

$$\begin{aligned} \int _{0}^{T}\langle \partial _{t}u,\,\varphi \rangle _{V_{0}^{\prime },\,V_{0}}\,{\mathrm d}t+\iint _{\Omega _{T}}\left\langle Z+\nabla E_{p}(\nabla u)\,\,\bigg |\,\, \nabla \varphi \right\rangle \,{\mathrm d}x {\mathrm d}t=0 \end{aligned}$$
(1.18)

for all \(\varphi \in X_{0}^{p}(0,\,T;\,\Omega )\). Here \(\partial E_{1}\) denotes the subdifferential of \(E_{1}\), defined as

$$\begin{aligned} \partial E_{1}(z):=\left\{ \zeta \in {{\mathbb {R}}}^{n}\,\,\bigg |\,\, E_{1}(w)\ge E_{1}(z)+\langle \zeta \mid w-z\rangle \text { for all }w\in {{\mathbb {R}}}^{n}\right\} \quad \text {for }z\in {{\mathbb {R}}}^{n}. \end{aligned}$$

The main result is the following Theorem 1.2.

Theorem 1.2

Let n, p and \(E=E_{1}+E_{p}\) satisfy (1.2) and (1.12)–(1.16). Assume that a function u is a weak solution to (1.11) in \(\Omega _{T}\). If (1.3) is satisfied, then the spatial gradient \(\nabla u\) is continuous in \(\Omega _{T}\).

The contents of this paper are as follows. In Sect. 2, we briefly mention basic properties of \(E_{1}\) and some composite functions. There, we also note fundamental iteration lemmata, which are fully used in a priori bound estimates. Section 3 mainly provides the strong convergence of a parabolic approximate equation under a suitable Dirichlet boundary condition (Proposition 3.3). There, some basic properties concerning \( E^{\varepsilon } \) are also mentioned. Section 4 aims to verify the local \(L^{\infty }\)-bound of u and \(u_{\varepsilon }\). The former is shown by Moser’s iteration in Sect. 4.1. The latter is proved by the comparison principle (Proposition 4.3) and the weak maximum principle (Corollary 4.4) in Sect. 4.2. Section 5 establishes local \(L^{q}\)-bounds of \(\nabla u_{\varepsilon }\) for \(q\in (p,\,\infty ]\). After deducing local energy estimates in Sect. 5.1, we complete the case \(q\in (p,\,\infty )\) by the condition \(u_{\varepsilon }\in L_{\textrm{loc}}^{\infty }\) (Proposition 5.2), and the remaining one \(q=\infty \) by Moser’s iteration (Proposition 5.3) in Sect. 5.2. The main result in Sect. 5 is that the uniform local bound of \(\nabla u_{\varepsilon }\) follows from that of \(u_{\varepsilon }\) and the uniform \(L^{p}\)-bound of \(\nabla u_{\varepsilon }\) (Theorem 5.4). Section 6 aims to show Theorem 1.2. There, a priori Hölder estimates of \({{\mathcal {G}}}_{2\delta ,\,\varepsilon }(\nabla u_{\varepsilon })\) (Theorem 6.1) are used without proof, since they are already shown in [28, Theorem 2.8]. In Sect. 6.1, however, we briefly sketch the proof of Theorem 6.1 for the reader’s convenience. Finally in Sect. 6.2, we give the proof of Theorem 1.2 by Proposition 3.3, Corollary 4.4, Theorems 5.4 and 6.1.

2 Preliminary

2.1 Basic properties of positively one-homogeneous density

We briefly note the basic properties of the positively one-homogeneous function \(E_{1}\).

The subdifferential \(\partial E_{1}\) is given by

$$\begin{aligned}\partial E_{1}(z)=\left\{ \zeta \in {{\mathbb {R}}}^{n}\,\,\bigg |\,\, \langle \zeta \mid z \rangle =E_{1}(z),\, \langle \zeta \mid w\rangle \le 1\text { for all }w\in C_{E_{1}}\right\} ,\end{aligned}$$

where \(C_{E_{1}}:=\{w\in {{\mathbb {R}}}^{n}\mid E_{1}(w)\le 1\}\) (see [2, Theorem 1.8]). In particular, for any vector fields \(\nabla u\in L^{1}(\Omega _{T};\,{{\mathbb {R}}}^{n})\) and \(Z\in L^{\infty }(\Omega _{T};\,{{\mathbb {R}}}^{n})\) that satisfy (1.17), there holds

$$\begin{aligned} \langle Z\mid \nabla u\rangle =E_{1}(\nabla u)\quad \text {a.e.~in }\Omega _{T}, \end{aligned}$$
(2.1)

which is often called Euler’s identity.

By (1.14), it is easy to check that

$$\begin{aligned} \nabla E_{1}(kz)=\nabla E_{1}(z)\quad \text {and}\quad \nabla ^{2}E_{1}(kz)=k^{-1}\nabla ^{2}E_{1}(z) \end{aligned}$$

for all \(z\in {{\mathbb {R}}}^{n}\setminus \{0\}\) and \(k\in (0,\,\infty )\). In particular, we have

$$\begin{aligned}{} & {} \partial E_{1}(z)\subset \left\{ w\in {{\mathbb {R}}}^{n}\,\,\bigg |\,\,|w|\le K_{0} \right\} \quad \text {for all }z\in {{\mathbb {R}}}^{n}, \end{aligned}$$
(2.2)
$$\begin{aligned}{} & {} \quad |\nabla E_{1}(z)|\le K_{0}\quad \text {for all }z\in {{\mathbb {R}}}^{n}\setminus \{ 0\}, \end{aligned}$$
(2.3)
$$\begin{aligned}{} & {} \quad O_{n}\leqslant \nabla ^{2}E_{1}(z)\leqslant \frac{K_{0}}{|z|}\textrm{id}_{n}\quad \text {for all }z\in {{\mathbb {R}}}^{n}\setminus \{0\}, \end{aligned}$$
(2.4)

for some constant \(K_{0}\in (0,\,\infty )\).
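
For the model density \(E_{1}(z)=|z|\), the objects above are explicit. The following worked example is added for illustration only, with \(\nu :=z/|z|\) for \(z\ne 0\):

$$\begin{aligned} \partial E_{1}(0)=\left\{ \zeta \in {{\mathbb {R}}}^{n}\,\,\bigg |\,\,|\zeta |\le 1\right\} ,\qquad \partial E_{1}(z)=\{\nu \}\quad \text {and}\quad \nabla ^{2}E_{1}(z)=\frac{1}{|z|}\left( \textrm{id}_{n}-\nu \otimes \nu \right) \quad \text {for }z\ne 0. \end{aligned}$$

Since the eigenvalues of \(\nabla ^{2}E_{1}(z)\) are 0 and \(1/|z|\), the estimates (2.2)–(2.4) hold with \(K_{0}=1\) in this case, and Euler’s identity (2.1) reads \(\langle Z\mid \nabla u\rangle =|\nabla u|\) with \(|Z|\le 1\).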

2.2 Composite functions

Throughout this paper, we let \(\psi :{{\mathbb {R}}}_{\ge 0}\rightarrow {{\mathbb {R}}}_{\ge 0}\) be a bounded Lipschitz function that is continuously differentiable in \({{\mathbb {R}}}_{>0}\) except at finitely many points. Also, we assume that the derivative \(\psi ^{\prime }\) is non-negative and compactly supported in \({{\mathbb {R}}}_{\ge 0}\). In particular, \(\psi \) is non-decreasing, and \(\psi (\sigma )\) becomes constant for sufficiently large \(\sigma \). Corresponding to this \(\psi \), we define the convex function \(\Psi :{{\mathbb {R}}}_{\ge 0}\rightarrow {{\mathbb {R}}}_{\ge 0}\) as

$$\begin{aligned} \Psi (\sigma ):=\int _{0}^{\sigma }\tau \psi (\tau )\,{\mathrm d}\tau \quad \text {for }\sigma \in {{\mathbb {R}}}_{\ge 0}. \end{aligned}$$
(2.5)

By the definition of \(\Psi \) and the monotonicity of \(\psi \), it is clear that

$$\begin{aligned} \Psi (\sigma )\le \sigma ^{2}\psi (\sigma )\quad \text {for all }\sigma \in {{\mathbb {R}}}_{\ge 0}. \end{aligned}$$
(2.6)
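
Spelled out, (2.6) follows in one line, because \(\psi \) is non-negative and non-decreasing:

$$\begin{aligned} \Psi (\sigma )=\int _{0}^{\sigma }\tau \psi (\tau )\,{\mathrm d}\tau \le \psi (\sigma )\int _{0}^{\sigma }\tau \,{\mathrm d}\tau =\frac{\sigma ^{2}}{2}\psi (\sigma )\le \sigma ^{2}\psi (\sigma ). \end{aligned}$$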

In our proof of local bound estimates, we mainly choose \(\psi :{{\mathbb {R}}}_{\ge 0}\rightarrow {{\mathbb {R}}}_{\ge 0}\) as either

$$\begin{aligned} \psi _{\alpha ,\,M}(\sigma ):=(\sigma \wedge M)^{\alpha }\quad \text {for }\sigma \in {{\mathbb {R}}}_{\ge 0} \end{aligned}$$
(2.7)

or

$$\begin{aligned} {\tilde{\psi }}_{\alpha ,\,M}(\sigma ):=\sigma ^{\alpha }\left( 1-\sigma ^{-1}\right) _{+}\wedge M^{\alpha }\left( 1-M^{-1}\right) \quad \text {for }\sigma \in {{\mathbb {R}}}_{\ge 0}, \end{aligned}$$
(2.8)

where \(\alpha \in [0,\,\infty )\) and \(M\in (1,\,\infty )\). For \(\psi _{\alpha ,\,M}\) or \({\tilde{\psi }}_{\alpha ,\,M}\), the corresponding \(\Psi \) defined by (2.5) is denoted by \(\Psi _{\alpha ,\,M}\) or \({\tilde{\Psi }}_{\alpha ,\,M}\) respectively. When \(M\rightarrow \infty \), the monotone convergences

$$\begin{aligned} \psi _{\alpha ,\,M}(\sigma )\nearrow \sigma ^{\alpha },\quad {\tilde{\psi }}_{\alpha ,\,M}(\sigma )\nearrow \sigma ^{\alpha }\left( 1-\sigma ^{-1}\right) _{+}, \quad \Psi _{\alpha ,\,M}(\sigma )\nearrow \frac{\sigma ^{\alpha +2}}{\alpha +2} \end{aligned}$$
(2.9)

hold for every \(\sigma \in {{\mathbb {R}}}_{\ge 0}\), which is used later when showing various local \(L^{\infty }\)-estimates. Also, by direct computations, we can easily check the following (2.10)–(2.12):

$$\begin{aligned}{} & {} \left( \psi _{\alpha ,\,M}(\sigma )\right) ^{1-r}\left( \psi _{\alpha ,\,M}^{\prime }(\sigma )\right) ^{r}\sigma ^{r} \le \alpha ^{r}\psi _{\alpha ,\,M}(\sigma )\quad \text {for all }\sigma \in {{\mathbb {R}}}_{>0} \setminus \{\,1,\,M\,\}, \end{aligned}$$
(2.10)
$$\begin{aligned}{} & {} \quad {\tilde{\psi }}_{\alpha ,\,M}(\sigma )+\sigma {\tilde{\psi }}_{\alpha ,\,M}^{\prime }(\sigma )\le (\alpha +1)\psi _{\alpha ,\,M}(\sigma )\chi _{\{\sigma>1\}}(\sigma )\quad \text {for all }\sigma \in {{\mathbb {R}}}_{>0} \setminus \{\,1,\,M\,\},\nonumber \\ \end{aligned}$$
(2.11)
$$\begin{aligned}{} & {} \quad \sigma ^{\alpha +2}\le 1+\sigma ^{\alpha +p}+\lim _{M\rightarrow \infty }\left( \sigma ^{2}{\tilde{\psi }}_{\alpha ,\,M}(\sigma )\right) \quad \text {for all }\sigma \in {{\mathbb {R}}}_{\ge 0}. \end{aligned}$$
(2.12)

Here \(r\in (1,\,\infty )\) is a fixed constant.
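
As an illustration of these direct computations, here is a sketch for (2.11)–(2.12). For \(1<\sigma <M\) we have \({\tilde{\psi }}_{\alpha ,\,M}(\sigma )=\sigma ^{\alpha }-\sigma ^{\alpha -1}\), and hence

$$\begin{aligned} {\tilde{\psi }}_{\alpha ,\,M}(\sigma )+\sigma {\tilde{\psi }}_{\alpha ,\,M}^{\prime }(\sigma )=(\alpha +1)\sigma ^{\alpha }-\alpha \sigma ^{\alpha -1}\le (\alpha +1)\sigma ^{\alpha }=(\alpha +1)\psi _{\alpha ,\,M}(\sigma ). \end{aligned}$$

For \(\sigma <1\) both sides of (2.11) vanish, and for \(\sigma >M\) we have \({\tilde{\psi }}_{\alpha ,\,M}^{\prime }(\sigma )=0\) and \({\tilde{\psi }}_{\alpha ,\,M}(\sigma )=M^{\alpha }(1-M^{-1})\le (\alpha +1)M^{\alpha }\). As for (2.12), the case \(\sigma \le 1\) is clear from \(\sigma ^{\alpha +2}\le 1\). For \(\sigma >1\), the limit equals \(\sigma ^{\alpha +2}-\sigma ^{\alpha +1}\), so that (2.12) reduces to

$$\begin{aligned} \sigma ^{\alpha +1}\le 1+\sigma ^{\alpha +p}, \end{aligned}$$

which holds because \(p>1\).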

2.3 Iteration lemmata

Without proofs, we state two basic lemmata, shown by standard iteration arguments (see [15, Lemma V.3.1] and [28, Lemma 4.2] for the proofs).

Lemma 2.1

Fix \(R_{1},\,R_{2}\in {{\mathbb {R}}}_{>0}\) with \(R_{1}<R_{2}\). Assume that a bounded function \(f:[R_{1},\,R_{2} ]\rightarrow {{\mathbb {R}}}_{\ge 0}\) admits the constants \(A,\,\alpha \in {{\mathbb {R}}}_{>0}\), \(B\in {{\mathbb {R}}}_{\ge 0}\), and \(\theta \in (0,\,1)\), such that there holds

$$\begin{aligned}f(r_{1})\le \theta f(r_{2})+\frac{A}{(r_{2}-r_{1})^{\alpha }}+B\end{aligned}$$

for any \(r_{1},\,r_{2}\in [R_{1},\,R_{2}]\) with \(r_{1}<r_{2}\). Then, f satisfies

$$\begin{aligned}f(R_{1})\le C(\alpha ,\,\theta )\left[ \frac{A}{(R_{2}-R_{1})^{\alpha }}+B \right] .\end{aligned}$$

Lemma 2.2

Let \(\kappa \in (1,\,\infty )\) be a constant. Assume that the sequences \(\{Y_{l}\}_{l=0}^{\infty }\subset {{\mathbb {R}}}_{\ge 0}\), \(\{p_{l}\}_{l=0}^{\infty }\subset [1,\,\infty )\) admit the constants A, \(B\in (1,\,\infty )\) and \(\mu \in {{\mathbb {R}}}_{>0}\) such that

$$\begin{aligned}\left\{ \begin{array}{rcl} Y_{l+1}^{p_{l+1}}&{} \le &{} \left( AB^{l}Y_{l}^{p_{l}} \right) ^{\kappa }, \\ p_{l}&{} \ge &{} \mu \left( \kappa ^{l}-1\right) , \end{array} \right. \quad \text {for all }l\in {{\mathbb {Z}}}_{\ge 0},\end{aligned}$$

and \(\kappa ^{l}p_{l}^{-1}\rightarrow \mu ^{-1}\) as \(l\rightarrow \infty \). Then, we have

$$\begin{aligned}\limsup _{l\rightarrow \infty }Y_{l}\le A^{\frac{\kappa ^{\prime }}{\mu }}B^{\frac{(\kappa ^{\prime })^{2}}{\mu }}Y_{0}^{\frac{p_{0}}{\mu }},\end{aligned}$$

where \(\kappa ^{\prime }:=\kappa /(\kappa -1)\in (1,\,\infty )\) denotes the Hölder conjugate exponent of \(\kappa \).
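
Although the proof is omitted, the mechanism behind Lemma 2.2 can be sketched as follows; this is only an outline under the stated assumptions, and the cited references contain the complete argument. Iterating the first inequality gives, by induction on \(l\ge 1\),

$$\begin{aligned} Y_{l}^{p_{l}}\le A^{\sum _{j=1}^{l}\kappa ^{j}}B^{\sum _{j=1}^{l-1}j\kappa ^{l-j}}Y_{0}^{p_{0}\kappa ^{l}}. \end{aligned}$$

The exponents are estimated by geometric sums:

$$\begin{aligned} \sum _{j=1}^{l}\kappa ^{j}=\kappa ^{\prime }\left( \kappa ^{l}-1\right) \le \kappa ^{\prime }\kappa ^{l},\qquad \sum _{j=1}^{l-1}j\kappa ^{l-j}\le \kappa ^{l}\sum _{j=1}^{\infty }j\kappa ^{-j}=\frac{\kappa }{(\kappa -1)^{2}}\kappa ^{l}\le (\kappa ^{\prime })^{2}\kappa ^{l}. \end{aligned}$$

Since A, \(B>1\), taking the \(p_{l}\)-th root yields

$$\begin{aligned} Y_{l}\le \left( A^{\kappa ^{\prime }}B^{(\kappa ^{\prime })^{2}}Y_{0}^{p_{0}}\right) ^{\kappa ^{l}/p_{l}}, \end{aligned}$$

and the claimed bound follows by letting \(l\rightarrow \infty \) with \(\kappa ^{l}p_{l}^{-1}\rightarrow \mu ^{-1}\).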

3 Approximation problem

3.1 Approximation of energy density

We would like to explain the approximation of \(E=E_{1}+E_{p}\), based on the Friedrichs mollifier. More precisely, we introduce a non-negative, spherically symmetric function \(\rho \in C_{\mathrm c}^{\infty }({{\mathbb {R}}}^{n})\) such that \(\Vert \rho \Vert _{L^{1}}=1\) and the support of \(\rho \) is the closed unit ball centered at the origin. For the approximation parameter \(\varepsilon \in (0,\,1)\), we define \(\rho _{\varepsilon }(z):=\varepsilon ^{-n}\rho (z/\varepsilon )\) for \(z\in {{\mathbb {R}}}^{n}\), and relax the energy density \(E_{s}\, (s\in \{\,1,\,p\,\})\) by the non-negative function, defined as

$$\begin{aligned} E_{s,\,\varepsilon }(z):=\int _{{{\mathbb {R}}}^{n}}\rho _{\varepsilon }(y)E_{s}(z-y)\,{\mathrm d}y\quad \text {for }z\in {{\mathbb {R}}}^{n}. \end{aligned}$$
(3.1)

Then, by (1.12)–(1.13) and (2.3)–(2.4), the relaxed density \( E^{\varepsilon } :=E_{1,\,\varepsilon }+E_{p,\,\varepsilon }\) satisfies

$$\begin{aligned}{} & {} |\nabla E^{\varepsilon } (z)|\le \Lambda (\varepsilon ^{2}+|z|^{2})^{(p-1)/2}+K, \end{aligned}$$
(3.2)
$$\begin{aligned}{} & {} \quad \lambda \left( \varepsilon ^{2}+|z|^{2} \right) ^{p/2-1}\textrm{id}_{n}\leqslant \nabla ^{2} E^{\varepsilon } (z)\leqslant \left( \Lambda \left( \varepsilon ^{2}+|z|^{2} \right) ^{p/2-1}+\frac{K}{\sqrt{\varepsilon ^{2}+|z|^{2}}}\right) \textrm{id}_{n} \qquad \qquad \end{aligned}$$
(3.3)
$$\begin{aligned}{} & {} \quad \left\langle \nabla E_{p,\,\varepsilon }(z)-\nabla E_{p,\,\varepsilon }(w)\,\,\bigg |\,\,z-w \right\rangle \ge \lambda \left( \varepsilon ^{2}+|z|^{2}+|w|^{2}\right) ^{p/2-1}|z-w|^{2}, \end{aligned}$$
(3.4)
$$\begin{aligned}{} & {} \quad \left\langle \nabla E^{\varepsilon } (z)-\nabla E^{\varepsilon } (w)\,\,\bigg |\,\,z-w \right\rangle \ge \lambda \left( \varepsilon ^{2}+|z|^{2}+|w|^{2}\right) ^{p/2-1}|z-w|^{2}, \end{aligned}$$
(3.5)

for all z, \(w\in {{\mathbb {R}}}^{n}\). Here \(\lambda \in (0,\,\lambda _{0})\), \(\Lambda \in (\Lambda _{0},\,\infty )\), and \(K\in (K_{0},\,\infty )\) are constants (see [26, §2] for the detailed computations). Letting \(w=0\) and \(\varepsilon \rightarrow 0\) in (3.4), we have

$$\begin{aligned} \left\langle \nabla E_{p}(z)\,\, \bigg |\,\,z\right\rangle \ge \lambda |z|^{p}\quad \text {for all }z\in {{\mathbb {R}}}^{n}, \end{aligned}$$
(3.6)

which follows from \(E_{p}\in C^{1}({{\mathbb {R}}}^{n})\) and \(\nabla E_{p}(0)=0\). Also, letting \(w=0\) in (3.5), we get

$$\begin{aligned} \left\langle \nabla E^{\varepsilon } (z)-\nabla E^{\varepsilon } (0)\,\,\bigg |\,\,z \right\rangle \ge \lambda \left( |z|^{p}-\varepsilon ^{p}\right) \quad \text {for all }z\in {{\mathbb {R}}}^{n}. \end{aligned}$$
(3.7)
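
The passage from (3.5) to (3.7) uses only \(p<2\). Spelled out as a sketch, with \(w=0\) the right-hand side of (3.5) becomes

$$\begin{aligned} \lambda \left( \varepsilon ^{2}+|z|^{2}\right) ^{p/2-1}|z|^{2}=\lambda \left[ \left( \varepsilon ^{2}+|z|^{2}\right) ^{p/2}-\varepsilon ^{2}\left( \varepsilon ^{2}+|z|^{2}\right) ^{p/2-1}\right] \ge \lambda \left( |z|^{p}-\varepsilon ^{p}\right) , \end{aligned}$$

where we have used \((\varepsilon ^{2}+|z|^{2})^{p/2}\ge |z|^{p}\) and, since \(p/2-1<0\), \((\varepsilon ^{2}+|z|^{2})^{p/2-1}\le \varepsilon ^{p-2}\).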

As a special case of [26, Lemma 2.8], we can use the following lemma.

Lemma 3.1

The energy density \( E^{\varepsilon } =E_{1,\,\varepsilon }+E_{p,\,\varepsilon }\in C^{\infty }({{\mathbb {R}}}^{n})\), defined as (3.1) for each \(\varepsilon \in (0,\,1)\), satisfies the following.

  • (1) For each fixed \(v\in L^{p}(\Omega _{T};\,{{\mathbb {R}}}^{n})\), we have

    $$\begin{aligned}\nabla E^{\varepsilon } (v)\rightarrow A_{0}(v)\quad \text {in}\quad L^{p^{\prime }}(\Omega _{T};\,{{\mathbb {R}}}^{n})\quad \text {as}\quad \varepsilon \rightarrow 0.\end{aligned}$$

    Here the mapping \(A_{0}:{{\mathbb {R}}}^{n}\rightarrow {{\mathbb {R}}}^{n}\) is defined as

    $$\begin{aligned}A_{0}(z):=\left\{ \begin{array}{cc} \nabla E(z) &{} (z\ne 0), \\ (\rho *\nabla E_{1})(0) &{} (z=0). \end{array} \right. \end{aligned}$$
  • (2) Assume that a sequence \(\{v_{\varepsilon _{k}}\}_{k}\subset L^{p}(\Omega _{T};\,{{\mathbb {R}}}^{n})\), where \(\varepsilon _{k}\rightarrow 0\) as \(k\rightarrow \infty \), satisfies

    $$\begin{aligned}v_{\varepsilon _{k}}\rightarrow v_{0}\quad \text {in}\quad L^{p}(\Omega _{T};\,{{\mathbb {R}}}^{n})\quad \text {as}\quad k\rightarrow \infty \end{aligned}$$

    for some \(v_{0}\in L^{p}(\Omega _{T};\,{{\mathbb {R}}}^{n})\). Then, up to a subsequence, we have

    $$\begin{aligned}\left\{ \begin{array}{rclcc} \nabla E_{p,\,\varepsilon _{k}}(v_{\varepsilon _{k}})&{}\rightarrow &{} \nabla E_{p}(v_{0})&{} \text {in}&{} L^{p^{\prime }}(\Omega _{T};\,{{\mathbb {R}}}^{n}),\\ \nabla E_{1,\,\varepsilon _{k}}(v_{\varepsilon _{k}}) &{}{\mathop {\rightharpoonup }\limits ^{*}} &{} Z &{} \text {in} &{}L^{\infty } (\Omega _{T};\,{{\mathbb {R}}}^{n}), \end{array} \right. \quad \text {as}\quad k\rightarrow \infty .\end{aligned}$$

    Here the limit \(Z\in L^{\infty }(\Omega _{T};\,{{\mathbb {R}}}^{n})\) satisfies

    $$\begin{aligned}Z(x,\,t)\in \partial E_{1}(v_{0}(x,\,t))\quad \text {for a.e.~}(x,\,t)\in \Omega _{T}.\end{aligned}$$

3.2 Convergence of approximate solutions

Section 3 is concluded by verifying that \(u_{\varepsilon }\), a weak solution to

$$\begin{aligned} \partial _{t}u_{\varepsilon }-\mathop {\textrm{div}}\left( \nabla E^{\varepsilon } (\nabla u_{\varepsilon }) \right) =0 \end{aligned}$$
(3.8)

in \(\Omega _{T}\), converges to a weak solution to (1.11).

Let \(u_{\star }\in X^{p}(0,\,T;\,\Omega )\cap C^{0}([0,\,T];\,L^{2}(\Omega ))\) be a fixed function. By arguments similar to those in [20, 23], we find the unique weak solution of

$$\begin{aligned} \left\{ \begin{array}{rclcc} \partial _{t}u_{\varepsilon }-\mathop {\textrm{div}}(\nabla E^{\varepsilon } (\nabla u_{\varepsilon }))&{}=&{}0 &{} \text {in} &{}\Omega _{T},\\ u_{\varepsilon } &{}=&{}u_{\star } &{} \text {on} &{} \partial _{\textrm{p}}\Omega _{T}. \end{array} \right. \end{aligned}$$
(3.9)

More precisely, the solution \(u_{\varepsilon }\) is in \(u_{\star }+X_{0}^{p}(0,\,T;\,\Omega )\), satisfies (3.8) in \(L^{p^{\prime }}(0,\,T;\,V_{0}^{\prime })\), and \((u_{\varepsilon }-u_{\star })(\,\cdot ,\,0)=0\) in \(L^{2}(\Omega )\). The existence of the weak solution of (3.9) is shown by the Faedo–Galerkin method (see [20, Chapitre 2], [23, §III.4] as related materials). We would like to prove that \(u_{\varepsilon }\) converges to the weak solution of

$$\begin{aligned} \left\{ \begin{array}{rclcc} \partial _{t}u-\mathop {\textrm{div}}(\nabla E(\nabla u))&{}=&{}0 &{} \text {in} &{}\Omega _{T},\\ u &{}=&{}u_{\star } &{} \text {on} &{} \partial _{\textrm{p}}\Omega _{T}, \end{array} \right. \end{aligned}$$
(3.10)

in the sense of Definition 3.2 below.

Definition 3.2

Let \(u_{\star }\in X^{p}(0,\,T;\,\Omega )\cap C^{0}([0,\,T];\,L^{2}(\Omega ))\). A function \(u\in X^{p}(0,\,T;\,\Omega )\) is called the weak solution of (3.10) when the following two properties are satisfied.

  • (1) \(u\in u_{\star }+X_{0}^{p}(0,\,T;\,\Omega )\subset C^{0}([0,\,T];\,L^{2}(\Omega ))\) and \((u-u_{\star })(\,\cdot \,,\,0)=0\) in \(L^{2}(\Omega )\).

  • (2) \(u\) is a weak solution to (1.11) in \(\Omega _{T}\) in the sense of Definition 1.1.

By a weak compactness argument and Lemma 3.1, we prove Proposition 3.3.

Proposition 3.3

Fix arbitrary \(u_{\star }\in X^{p}(0,\,T;\,\Omega )\cap C^{0}([0,\,T];\,L^{2}(\Omega ))\) and \(\tau \in (0,\,T)\). Let \(u_{\varepsilon }\in u_{\star }+X_{0}^{p}(0,\,T;\,\Omega )\) be the unique weak solution of (3.9) for each \(\varepsilon \in (0,\,1)\). Then, there exists a decreasing sequence \(\{\varepsilon _{k}\}_{k}\subset (0,\,1)\) such that

$$\begin{aligned} \nabla u_{\varepsilon _{k}}\rightarrow \nabla u_{0}\quad \mathrm{a.e.\,\, in}\quad \Omega _{T-\tau }\quad \text{ and }\quad \textrm{strongly}\,\, \textrm{in}\quad L^{p}(\Omega _{T-\tau };\,{{\mathbb {R}}}^{n}), \end{aligned}$$

where the limit function \(u_{0}\in u_{\star }+X_{0}^{p}(0,\,T;\,\Omega )\) is the unique weak solution of (3.10) with T replaced by \(T-\tau \).

Compared to Proposition 3.3, [28, Proposition 2.4] provides a similar convergence result for (1.9) in the case (1.5). There, the compact embedding (1.10) is used to deal with non-trivial external force terms, as explained in Sect. 1.2. Here we give the proof of Proposition 3.3 without using any compact embedding.

Proof

We set \(T_{0}:=T-\tau /2\), and \(T_{1}:=T-\tau \). We define

$$\begin{aligned}\theta (t):=\frac{(T_{0}-t)_{+}}{T_{0}}\in [0,\,1]\quad \text {for}\quad t\in [0,\,T].\end{aligned}$$

To construct \(u_{0}\), we first claim that

$$\begin{aligned}{{\textbf{J}}}_{\varepsilon }:=\Vert u_{\varepsilon }-u_{\star } \Vert _{L^{p}(0,\,T_{1};\,V_{0})}+\Vert \partial _{t}u_{\varepsilon } \Vert _{L^{p^{\prime }}(0,\,T_{1};\,V_{0}^{\prime })}\end{aligned}$$

is bounded, uniformly for \(\varepsilon \in (0,\,1)\). To prove the boundedness of \({{\textbf{J}}}_{\varepsilon }\), we test \(\varphi :=(u_{\varepsilon }-u_{\star }) \theta \) into (3.8). Integrating by parts, we have

$$\begin{aligned}&{{\textbf{L}}}_{1}+{{\textbf{L}}}_{2}\\&:=-\frac{1}{2}\iint _{\Omega _{T_{0}}}(u_{\varepsilon }-u_{\star })^{2}\partial _{t}\theta \,{\mathrm d}x {\mathrm d}t+\iint _{\Omega _{T_{0}}}\left\langle \nabla E^{\varepsilon } (\nabla u_{\varepsilon })-\nabla E^{\varepsilon } (0) \,\,\bigg |\,\,\nabla u_{\varepsilon } \right\rangle \theta \,{\mathrm d}x {\mathrm d}t\\&=-\int _{0}^{T_{0}}\langle \partial _{t}u_{\star },\, (u_{\varepsilon }-u_{\star })\theta \rangle _{V_{0}^{\prime },\,V_{0}}\,{\mathrm d}t+\iint _{\Omega _{T_{0}}}\left\langle \nabla E^{\varepsilon } (\nabla u_{\varepsilon })-\nabla E^{\varepsilon } (0) \,\,\bigg |\,\,\nabla u_{\star }\right\rangle \theta \,{\mathrm d}x{\mathrm d}t\\&=:-{{\textbf{R}}}_{1}+{{\textbf{R}}}_{2}. \end{aligned}$$

By (3.7) and our choice of \(\theta \), we have

$$\begin{aligned}{{\textbf{L}}}_{1}+{{\textbf{L}}}_{2}\ge \frac{1}{2T_{0}}\iint _{\Omega _{T_{0}}}|u_{\varepsilon }-u_{\star }|^{2}\,{\mathrm d}x{\mathrm d}t+\lambda \iint _{\Omega _{T_{0}}}|\nabla u_{\varepsilon }|^{p}\theta \,{\mathrm d}x{\mathrm d}t-\lambda \varepsilon ^{p}|\Omega _{T_{0}}|.\end{aligned}$$

To estimate \({{\textbf{R}}}_{1}\), we use Hölder’s inequality and Young’s inequality to compute

$$\begin{aligned} |{{\textbf{R}}}_{1}|&\le \Vert \partial _{t}u_{\star } \Vert _{L^{p^{\prime }}(0,\,T_{0};\,V_{0}^{\prime })}\left( \int _{0}^{T}\left( \Vert \nabla (u_{\varepsilon }-u_{\star })\theta \Vert _{L^{p}(\Omega )}+\Vert (u_{\varepsilon }-u_{\star })\theta \Vert _{L^{2}(\Omega )} \right) ^{p}\,{\mathrm d}t\right) ^{1/p} \\&\le \frac{\lambda }{3}\iint _{\Omega _{T_{0}}}|\nabla u_{\varepsilon }|^{p}\theta \,{\mathrm d}x{\mathrm d}t+\frac{C_{p}}{\lambda }\Vert \partial _{t}u_{\star }\Vert _{L^{p^{\prime }}(0,\,T_{0};\,V_{0}^{\prime })}^{p^{\prime }}\\&\quad +C_{p}\Vert \partial _{t}u_{\star }\Vert _{L^{p^{\prime }}(0,\,T_{0};\,V_{0}^{\prime })}\Vert \nabla u_{\star }\Vert _{L^{p}(\Omega _{T_{0}})} \\&\quad +\frac{1}{4T_{0}}\iint _{\Omega _{T_{0}}}|u_{\varepsilon }-u_{\star } |^{2}\,{\mathrm d}x{\mathrm d}t+C_{p}T_{0}^{2/p}\Vert \partial _{t}u_{\star } \Vert _{L^{p^{\prime }}(0,\,T_{0};\,V_{0}^{\prime })}^{2}. \end{aligned}$$

By (3.2) and Young’s inequality, we have

$$\begin{aligned} |{{\textbf{R}}}_{2} |&\le C(\Lambda ,\,K)\iint _{\Omega _{T_{0}}}\left( 1+|\nabla u_{\varepsilon }|^{p-1} \right) |\nabla u_{\star }|\theta \,{\mathrm d}x{\mathrm d}t\\&\le \frac{\lambda }{3}\iint _{\Omega _{T_{0}}}|\nabla u_{\varepsilon }|^{p}\theta \,{\mathrm d}x{\mathrm d}t+\frac{C(\Lambda ,\,K)}{\lambda }\iint _{\Omega _{T_{0}}}\left( 1+|\nabla u_{\star }|^{p}\right) \,{\mathrm d}x{\mathrm d}t. \end{aligned}$$

Combining these three estimates, we obtain

$$\begin{aligned}\iint _{\Omega _{T_{0}}}|u_{\varepsilon }-u_{\star }|^{2}\,{\mathrm d}x{\mathrm d}t+\iint _{\Omega _{T_{0}}}|\nabla u_{\varepsilon }|^{p}\theta \,{\mathrm d}x{\mathrm d}t\le {{\hat{C}}},\end{aligned}$$

where \({{\hat{C}}}\) depends on \(\lambda \), \(\Lambda \), K, \(T_{0}\), \(|\Omega |\), \(\Vert \nabla u_{\star } \Vert _{L^{p}(\Omega _{T_{0}})}\), and \(\Vert \partial _{t} u_{\star }\Vert _{L^{p^{\prime }}(0,\,T_{0};\,V_{0}^{\prime })}\). Recalling our choice of \(\theta \), we have

$$\begin{aligned}&\Vert u_{\varepsilon }-u_{\star } \Vert _{L^{p}(0,\,T_{1};\,V_{0})}^{p}\\&\quad =\int _{0}^{T_{1}}\left( \Vert \nabla (u_{\varepsilon }-u_{\star }) \Vert _{L^{p}(\Omega )}+\Vert u_{\varepsilon }-u_{\star } \Vert _{L^{2}(\Omega )} \right) ^{p}\,{\mathrm d}t\\&\quad \le C_{p}\left( \iint _{\Omega _{T_{1}}}|\nabla (u_{\varepsilon }-u_{\star }) |^{p}{\mathrm d}x{\mathrm d}t+T_{1}^{1-p/2}\left( \iint _{\Omega _{T_{1}}}|u_{\varepsilon }-u_{\star }|^{2}\,{\mathrm d}x{\mathrm d}t \right) ^{p/2} \right) \\&\quad \le {{\tilde{C}}}(p,\,\tau ,\,T,\,{{\hat{C}}})<\infty . \end{aligned}$$
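
For the last inequality, it may help to record how the weight \(\theta \) is removed. The following is only a sketch, under the assumption that \(\theta \) decreases linearly from 1 at \(t=0\) to 0 at \(t=T_{0}\), which is what the lower bound for \({{\textbf{L}}}_{1}\) requires. Since \(T_{0}-T_{1}=\tau /2\), we have \(\theta (t)\ge \theta (T_{1})=\tau /(2T-\tau )\) for all \(t\in [0,\,T_{1}]\), and hence

$$\begin{aligned}\iint _{\Omega _{T_{1}}}|\nabla u_{\varepsilon }|^{p}\,{\mathrm d}x{\mathrm d}t\le \frac{2T-\tau }{\tau }\iint _{\Omega _{T_{0}}}|\nabla u_{\varepsilon }|^{p}\theta \,{\mathrm d}x{\mathrm d}t\le \frac{2T-\tau }{\tau }{{\hat{C}}}.\end{aligned}$$

The \(L^{2}\)-term is treated by Hölder’s inequality in time, which is available because \(p<2\).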

Since \(u_{\varepsilon }\) satisfies (3.8) in \(L^{p^{\prime }}(0,\,T;\,V_{0}^{\prime })\), for any \(\varphi \in L^{p}(0,\,T;\,V_{0})\), we have

$$\begin{aligned} \left|\int _{0}^{T_{1}}\langle \partial _{t}u_{\varepsilon },\,\varphi \rangle _{V_{0}^{\prime },\,V_{0}}\,{\mathrm d}t\right|&\le \Vert \nabla E^{\varepsilon } (\nabla u_{\varepsilon })\Vert _{L^{p^{\prime }}(\Omega _{T_{1}})}\Vert \nabla \varphi \Vert _{L^{p}(\Omega _{T_{1}})}\\&\le {\check{C}}(p,\,\Lambda ,\,K,\,|\Omega _{T_{1}}|,\,{{\tilde{C}}})\Vert \varphi \Vert _{L^{p}(0,\,T_{1};\,V_{0})}, \end{aligned}$$

where (3.2) is used. This yields \(\Vert \partial _{t}u_{\varepsilon }\Vert _{L^{p^{\prime }}(0,\,T_{1};\,V_{0}^{\prime })}\le {\check{C}}\). Therefore, \({{\textbf{J}}}_{\varepsilon }\) is uniformly bounded for \(\varepsilon \in (0,\,1)\).

Carrying out the standard weak compactness argument, we find a limit function \(u_{0}\in u_{\star }+X_{0}^{p}(0,\,T;\,\Omega )\) such that

$$\begin{aligned} u_{\varepsilon _{j}}-u_{\star }\rightharpoonup u_{0}-u_{\star }\quad \text {in}\quad L^{p}(0,\,T_{1};\,V_{0}) \end{aligned}$$
(3.11)

and

$$\begin{aligned} \partial _{t}u_{\varepsilon _{j}}\rightharpoonup \partial _{t}u_{0}\quad \text {in}\quad L^{p^{\prime }}(0,\,T_{1};\,V_{0}^{\prime }), \end{aligned}$$
(3.12)

where \(\{\varepsilon _{j}\}_{j=0}^{\infty }\subset (0,\,1)\) is a decreasing sequence such that \(\varepsilon _{j}\rightarrow 0\) as \(j\rightarrow \infty \). Also, the identity \(u_{0}|_{t=0}=u_{\star }|_{t=0}\) in \(L^{2}(\Omega )\) is straightforwardly shown. From (3.11), we would like to prove

$$\begin{aligned} \nabla u_{\varepsilon _{j}}\rightarrow \nabla u_{0}\quad \text {in}\quad L^{p}(\Omega _{T_{1}}). \end{aligned}$$
(3.13)

By (3.5) and Hölder’s inequality, we have

$$\begin{aligned}&\iint _{\Omega _{T_{1}}}|\nabla u_{\varepsilon _{j}}-\nabla u_{0}|^{p}\,{\mathrm d}x{\mathrm d}t\\&\quad \le \left( \iint _{\Omega _{T_{1}}}\left( \varepsilon _{j}^{2}+|\nabla u_{\varepsilon _{j}}|^{2}+|\nabla u_{0}|^{2} \right) ^{p/2}\,{\mathrm d}x{\mathrm d}t \right) ^{1-p/2}\\&\qquad \cdot \left( \iint _{\Omega _{T_{1}}}\left( \varepsilon _{j}^{2}+|\nabla u_{\varepsilon _{j}}|^{2}+|\nabla u_{0}|^{2} \right) ^{p/2-1}|\nabla u_{\varepsilon _{j}}-\nabla u_{0}|^{2}\,{\mathrm d}x{\mathrm d}t\right) ^{p/2}\\&\quad \le C\left( {{\textbf{I}}}_{1,\,\varepsilon _{j}}+{{\textbf{I}}}_{2,\,\varepsilon _{j}} \right) ^{p/2}, \end{aligned}$$

where

$$\begin{aligned} \begin{array}{rcl} {{\textbf{I}}}_{1,\,\varepsilon _{j}} &{}:=&{}\displaystyle \iint _{\Omega _{T_{1}}}\left\langle \nabla E_{\varepsilon _{j}}(\nabla u_{\varepsilon _{j}})\,\,\bigg |\,\,\nabla (u_{\varepsilon _{j}}-u_{0}) \right\rangle \,{\mathrm d}x{\mathrm d}t, \\ {{\textbf{I}}}_{2,\,\varepsilon _{j}} &{}:=&{}\displaystyle -\iint _{\Omega _{T_{1}}}\left\langle \nabla E_{\varepsilon _{j}}(\nabla u_{0})\,\,\bigg |\,\,\nabla (u_{\varepsilon _{j}}-u_{0}) \right\rangle \,{\mathrm d}x{\mathrm d}t. \end{array} \end{aligned}$$
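
The application of Hölder’s inequality above can be checked pointwise. As a sketch, abbreviate \(S_{j}:=\varepsilon _{j}^{2}+|\nabla u_{\varepsilon _{j}}|^{2}+|\nabla u_{0}|^{2}>0\), a symbol introduced here only for illustration. Then

$$\begin{aligned}|\nabla u_{\varepsilon _{j}}-\nabla u_{0}|^{p}=\left( S_{j}^{p/2}\right) ^{1-p/2}\left( S_{j}^{p/2-1}|\nabla u_{\varepsilon _{j}}-\nabla u_{0}|^{2}\right) ^{p/2},\end{aligned}$$

because the exponents of \(S_{j}\) cancel: \((p/2)(1-p/2)+(p/2-1)(p/2)=0\). Hölder’s inequality with the conjugate exponents \(2/(2-p)\) and \(2/p\), which are admissible since \(1<p<2\), then gives the product of the two integrals.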

For \(\delta \in (0,\,T_{1}/2)\), which tends to 0 later, we define a function \(\phi _{\delta }:[0,\,T_{1}]\rightarrow [0,\,1]\) as

$$\begin{aligned} \phi _{\delta }(t) :=\left\{ \begin{array}{cc} 1 &{} (0\le t<T_{1}-\delta ), \\ -\delta ^{-1}(t-T_{1}) &{} (T_{1}-\delta \le t\le T_{1}). \end{array}\right. \end{aligned}$$
(3.14)

We test \(\varphi :=(u_{\varepsilon _{j}}-u_{0}) \phi _{\delta }\) into (3.8) with \(\varepsilon =\varepsilon _{j}\), and integrate by parts. Then, we have

$$\begin{aligned}&-\frac{1}{2}\iint _{\Omega _{T_{1}}}|u_{\varepsilon _{j}}-u_{0} |^{2}\partial _{t}\phi _{\delta }\,{\mathrm d}x{\mathrm d}t+\iint _{\Omega _{T_{1}}}\left\langle \nabla E_{\varepsilon _{j}}(\nabla u_{\varepsilon _{j}})\,\,\bigg |\,\,\nabla (u_{\varepsilon _{j}}-u_{0}) \right\rangle \phi _{\delta }\,{\mathrm d}x{\mathrm d}t\\&\quad =-\int _{0}^{T_{1}}\langle \partial _{t}u_{0},\, u_{\varepsilon _{j}}-u_{0}\rangle _{V_{0}^{\prime },\,V_{0}}\phi _{\delta }\,{\mathrm d}t \end{aligned}$$

Discarding the first integral, and letting \(\delta \rightarrow 0\), and then \(j\rightarrow \infty \), we obtain

$$\begin{aligned}\limsup _{j\rightarrow \infty } {{\textbf{I}}}_{1,\,\varepsilon _{j}}\le \limsup _{j\rightarrow \infty }\left( -\int _{0}^{T_{1}}\langle \partial _{t}u_{0},\,(u_{\varepsilon _{j}}-u_{\star })-(u_{0}-u_{\star }) \rangle _{V_{0}^{\prime },\,V_{0}}\,{\mathrm d}t\right) =0,\end{aligned}$$

where the last identity follows from (3.11). The strong convergence \(\nabla E_{\varepsilon _{j}}(\nabla u_{0})\rightarrow A_{0}(\nabla u_{0})\) in \(L^{p^{\prime }}(\Omega _{T_{1}})\) and the weak convergence \(\nabla u_{\varepsilon _{j}}\rightharpoonup \nabla u_{0}\) in \(L^{p}(\Omega _{T_{1}})\) follow from Lemma 3.1 (1) and (3.11) respectively. These convergence results yield \({{\textbf{I}}}_{2,\,\varepsilon _{j}}\rightarrow 0\). As a consequence, we have

$$\begin{aligned}\limsup _{j\rightarrow \infty }\iint _{\Omega _{T_{1}}}|\nabla u_{\varepsilon _{j}}-\nabla u_{0}|^{p}\,{\mathrm d}x{\mathrm d}t\le C\left( \sum _{l=1}^{2}\limsup _{j\rightarrow \infty }{{\textbf{I}}}_{l,\,\varepsilon _{j}}\right) ^{p/2}\le 0,\end{aligned}$$

which completes the proof of (3.13). In particular, we may let \(\nabla u_{\varepsilon _{j}}\rightarrow \nabla u_{0}\) a.e. in \(\Omega _{T_{1}}\), by taking a subsequence if necessary. Also, we are allowed to apply Lemma 3.1 (2). From this and (3.12), we conclude that \(u_{0}\) is a weak solution of (3.10).

The uniqueness of \(u_{0}\) follows from monotonicity. More precisely, letting \(\varepsilon \rightarrow 0\) in (3.4), we have

$$\begin{aligned}\langle \nabla E_{p}(z)-\nabla E_{p}(w)\mid z-w\rangle \ge \lambda \left( |z|^{2}+|w|^{2}\right) ^{p/2-1}|z-w|^{2}>0\end{aligned}$$

for all z, \(w\in {{\mathbb {R}}}^{n}\) with \(z\ne w\). We also recall

$$\begin{aligned}\langle \zeta _{1}-\zeta _{2}\mid z_{1}-z_{2}\rangle \ge 0\end{aligned}$$

for all \(z_{j}\in {{\mathbb {R}}}^{n}\), \(\zeta _{j}\in \partial E_{1}(z_{j})\) with \(j\in \{\,1,\,2\,\}\), which is often called the monotonicity of \(\partial E_{1}\). From these inequalities, we straightforwardly conclude that the weak solution of (3.10) is unique. For a detailed discussion, see [28, Proposition 2.4]. \(\square \)
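
The uniqueness step in the proof above can be sketched as follows. We assume, in the spirit of Definition 1.1, that two weak solutions \(u_{1}\), \(u_{2}\) of (3.10) come with vector fields \(Z_{1}\), \(Z_{2}\) satisfying \(Z_{i}\in \partial E_{1}(\nabla u_{i})\) a.e. in \(\Omega _{T}\); the labels \(u_{i}\), \(Z_{i}\) are used only for this illustration. Testing the difference of the two weak formulations by \((u_{1}-u_{2})\phi _{\delta }\), discarding the non-negative term that contains \(\partial _{t}\phi _{\delta }\), and letting \(\delta \rightarrow 0\), one expects

$$\begin{aligned}0&\ge \iint _{\Omega _{T}}\left\langle \nabla E_{p}(\nabla u_{1})-\nabla E_{p}(\nabla u_{2})\mid \nabla u_{1}-\nabla u_{2}\right\rangle \,{\mathrm d}x{\mathrm d}t+\iint _{\Omega _{T}}\left\langle Z_{1}-Z_{2}\mid \nabla u_{1}-\nabla u_{2}\right\rangle \,{\mathrm d}x{\mathrm d}t\\&\ge \lambda \iint _{\{\nabla u_{1}\ne \nabla u_{2}\}}\left( |\nabla u_{1}|^{2}+|\nabla u_{2}|^{2}\right) ^{p/2-1}|\nabla u_{1}-\nabla u_{2}|^{2}\,{\mathrm d}x{\mathrm d}t\ge 0.\end{aligned}$$

Hence \(\nabla u_{1}=\nabla u_{2}\) a.e. in \(\Omega _{T}\), and \(u_{1}-u_{2}\in X_{0}^{p}(0,\,T;\,\Omega )\) then gives \(u_{1}=u_{2}\).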

4 Local bounds of solutions

In Sect. 4, we would like to show the local boundedness of u and \(u_{\varepsilon }\).

4.1 Local \(L^{\infty }\) estimate by Moser’s iteration

The local bound of u follows from (1.3) (see also [6, Theorem 2] and [12, Appendix A]).

Proposition 4.1

Under the assumptions in Theorem 1.2, we have

for any fixed \(Q_{R}=Q_{R}(x_{0},\,t_{0})\Subset \Omega _{T}\) with \(R\in (0,\,1)\).

Proof

We first prove a reversed Hölder estimate for \(U:=\sqrt{1+|u|^{2}}\). More precisely, we claim that for any \(\beta \in [s,\,\infty )\), and \(r_{1},\,r_{2}\in (0,\,R]\) with \(r_{1}<r_{2}\), there holds

$$\begin{aligned} \iint _{Q_{r_{1}}}U^{\kappa \beta +p-2}\,{\mathrm d}x{\mathrm d}t \le \left[ \frac{C(\beta -1)^{\gamma }}{(r_{2}-r_{1})^{2}}\iint _{Q_{r_{2}}}U^{\beta }\,{\mathrm d}x{\mathrm d}t \right] ^{\kappa }, \end{aligned}$$
(4.1)

provided \(U\in L^{\beta }(Q_{r_{2}})\). Here \(\kappa :=1+p/n\), \(\gamma :=p(1+1/(n+p))\) are fixed constants, and the constant \(C\in (1,\,\infty )\) depends at most on n, p, \(\lambda \), \(\Lambda \), and K. To prove (4.1), we introduce a truncation parameter \(M\in (1,\,\infty )\). For given \(r_{1}\) and \(r_{2}\), we choose and fix \(\eta \in C_{\mathrm c}^{1}(B_{r_{2}}(x_{0});\,[0,\,1])\) and \(\phi _{\mathrm c}\in C^{1}([t_{0}-r_{2}^{2},\,t_{0}];\,[0,\,1])\) satisfying

$$\begin{aligned} \eta |_{B_{r_{1}}}=1,\quad \phi _{\mathrm c}|_{I_{r_{1}}}=1,\quad \Vert \nabla \eta \Vert _{L^{\infty }(B_{r_{2}})}^{2}+\Vert \partial _{t}\phi _{\mathrm c}\Vert _{L^{\infty }(I_{r_{2}})}\le \frac{c_{0}}{(r_{2}-r_{1})^{2}}, \end{aligned}$$
(4.2)

and \(\phi _{\mathrm c}(t_{0}-r_{2}^{2})=0\), where \(c_{0}\in (1,\,\infty )\) is a universal constant. We let \(\phi _{\mathrm h}:[t_{0}-r_{2}^{2},\,t_{0}]\rightarrow [0,\,1]\) be a non-increasing Lipschitz function satisfying \(\phi _{\mathrm h}(t_{0}-r_{2}^{2})=1\) and \(\phi _{\mathrm h}(t_{0})=0\), and we write \(\phi :=\phi _{\mathrm c}\phi _{\mathrm h}\). Thanks to the Steklov average method, we may test \(\varphi :=\eta ^{p}\phi \psi _{\alpha ,\,M}(U)u\) into (1.18), where \(\psi _{\alpha ,\,M}\) is defined as (2.7) with \(\alpha :=\beta -2>0\). Then, we obtain

$$\begin{aligned}&-\iint _{Q_{r}}\eta ^{p}\Psi _{\alpha ,\,M}(U)\partial _{t}\phi \,{\mathrm d}x {\mathrm d}t+\iint _{Q_{r}}\left\langle Z+\nabla E_{p}(\nabla u)\,\,\bigg |\,\, \nabla u \right\rangle \eta ^{p}\psi _{\alpha ,\,M}(U) \phi \,{\mathrm d}x {\mathrm d}t\\ {}&\qquad +\iint _{Q_{r}}\left\langle Z+\nabla E_{p}(\nabla u)\,\,\bigg |\,\, \nabla u \right\rangle \eta ^{p}\psi _{\alpha ,\,M}^{\prime }(U)|u|^{2}U^{-1} \phi \,{\mathrm d}x {\mathrm d}t\\ {}&\quad =-\iint _{Q_{r}}\left\langle Z+\nabla E_{p}(\nabla u)\,\,\bigg |\,\, \nabla \eta \right\rangle \eta ^{p-1}\psi _{\alpha ,\,M}(U)u \phi \,{\mathrm d}x {\mathrm d}t\\ {}&\quad \le C\iint _{Q_{r}}\left( 1+|\nabla u|^{p-1} \right) |\nabla \eta |\eta ^{p-1}\psi _{\alpha ,\,M}(U)|u|\phi \,{\mathrm d}x {\mathrm d}t\\ {}&\quad \le \frac{\lambda }{2}\iint _{Q_{r}}\eta ^{p}|\nabla u|^{p}\psi _{\alpha ,\,M}(U)\phi \,{\mathrm d}x {\mathrm d}t+C\iint _{Q_{r}}\left( |\nabla \eta |^{p}|u|^{p}+\eta ^{p}\right) \psi _{\alpha ,\,M}(U)\phi \,{\mathrm d}x {\mathrm d}t. \end{aligned}$$

By (1.6), (2.1)–(2.2), and (3.6), we get

$$\begin{aligned}&-\iint _{Q_{r}} \eta ^{p}\Psi _{\alpha ,\,M}(U)\phi _{\mathrm c} \partial _{t}\phi _{\mathrm h}\,{\mathrm d}x {\mathrm d}t+\frac{c_{1}}{2}\iint _{Q_{r}}\eta ^{p}|\nabla u|^{p}\psi _{\alpha ,\,M}(U)\phi _{\mathrm c}\phi _{\mathrm h}\,{\mathrm d}x {\mathrm d}t\\&\quad \le \iint _{Q_{r}} \eta ^{p}\Psi _{\alpha ,\,M}(U)\phi _{\mathrm h} \partial _{t}\phi _{\mathrm c}\,{\mathrm d}x {\mathrm d}t+C\iint _{Q_{r}}\left( |\nabla \eta |^{p}U^{p}+\eta ^{p}\right) \psi _{\alpha ,\,M}(U)\,{\mathrm d}x {\mathrm d}t. \end{aligned}$$

Choosing \(\phi _{\mathrm h}\) suitably and recalling our choice of \(\eta \) and \(\phi _{\mathrm c}\), we easily obtain

$$\begin{aligned}&\mathop {\mathrm {ess~sup}}_{t_{0}-r_{2}^{2}<t<t_{0}}\int _{B_{r_{2}}}\eta ^{p}\phi _{\mathrm c}\Psi _{\alpha ,\,M}(U) \,{\mathrm d}x+\iint _{Q_{r_{2}}}\eta ^{p}\phi _{\mathrm c}|\nabla u|^{p}\psi _{\alpha ,\,M}(U)\,{\mathrm d}x{\mathrm d}t\nonumber \\&\qquad \le C\left[ \iint _{Q_{r_{2}}}\left( \frac{\psi _{\alpha ,\,M}(U)U^{2}}{(r_{2}-r_{1})^{2}}+\frac{\psi _{\alpha ,\,M}(U)U^{p}}{(r_{2}-r_{1})^{p}}\right) \,{\mathrm d}x{\mathrm d}t \right] \end{aligned}$$

Also, it is easy to deduce

$$\begin{aligned}&\iint _{Q_{r_{1}}}\Psi _{\alpha ,\,M}(U)^{p/n}\psi _{\alpha ,\,M}(U)U^{p} \,{\mathrm d}x{\mathrm d}t\\&\le C_{n,\,p}\left( \mathop {\mathrm {ess~sup}}_{t_{0}-r_{2}^{2}<t<t_{0}}\int _{B_{r_{2}}}\eta ^{p}\Psi _{\alpha ,\,M}(U) \phi _{\mathrm c}\,{\mathrm d}x \right) ^{p/n}\iint _{Q_{r_{2}}}\left|\nabla \left( \eta \psi _{\alpha ,\,M}(U)^{1/p}U\right) \right|^{p}\phi _{\mathrm c} \,{\mathrm d}x{\mathrm d}t \end{aligned}$$

by Hölder’s inequality and the Sobolev embedding \(W_{0}^{1,\,p}(B_{r_{2}})\hookrightarrow L^{\frac{np}{n-p}}(B_{r_{2}})\). By direct computations, we notice

$$\begin{aligned}&\iint _{Q_{r_{2}}}\left|\nabla \left( \eta \psi _{\alpha ,\,M}(U)^{1/p}U\right) \right|^{p}\phi _{\mathrm c} \,{\mathrm d}x{\mathrm d}t\\&\quad \le C_{p}\left[ \iint _{Q_{r_{2}}}|\nabla \eta |^{p}\psi _{\alpha ,\,M}(U)U^{p}\phi _{\mathrm c}\,{\mathrm d}x {\mathrm d}t+(1+\alpha )^{p}\iint _{Q_{r_{2}}}\eta ^{p}|\nabla u|^{p}\psi _{\alpha ,\,M}(U)\phi _{\mathrm c}\,{\mathrm d}x {\mathrm d}t \right] , \end{aligned}$$

where we have used (2.10) with \(r=p\), and \(|\nabla U|\le |\nabla u|\). Combining these three estimates, we get

$$\begin{aligned}\iint _{Q_{r_{1}}}\Psi _{\alpha ,\,M}(U)^{p/n}\psi _{\alpha ,\,M}(U)U^{p} \,{\mathrm d}x{\mathrm d}t\le \left[ \frac{C(\alpha +1)^{p}}{(r_{2}-r_{1})^{2}} \iint _{Q_{r_{2}}}\left( U^{\beta }+1\right) \,{\mathrm d}x{\mathrm d}t \right] ^{\kappa },\end{aligned}$$

where Young’s inequality is used. Letting \(M\rightarrow \infty \) and making use of Beppo Levi’s monotone convergence theorem and (2.9), we conclude (4.1).

We define the sequences \(\{R_{l}\}_{l=0}^{\infty }\subset (R/2,\,R]\), \(\{q_{l}\}_{l=0}^{\infty }\subset [s,\,\infty )\), and \(\{Y_{l}\}_{l=0}^{\infty }\subset {{\mathbb {R}}}_{\ge 0}\) as

$$\begin{aligned}R_{l}:=\frac{1+2^{-l}}{2}R,\quad q_{l}:=\kappa ^{l}\mu +s_{\textrm{c}},\quad Y_{l}:=\left( \iint _{Q_{R_{l}}}U^{q_{l}}\,{\mathrm d}x {\mathrm d}t \right) ^{1/q_{l}},\end{aligned}$$

where \(\mu :=s-s_{\mathrm c}\in {{\mathbb {R}}}_{>0}\). Then, by \((q_{l}-1)^{\gamma }=\mu ^{\gamma }\left( \kappa ^{l}+(s_{\mathrm c}-1)/\mu \right) ^{\gamma }\), it is easy to check that

$$\begin{aligned} R_{l}-R_{l+1}=2^{-l-2}R\quad \text {and}\quad (q_{l}-1)^{\gamma }\le (2\mu )^{\gamma }{\tilde{\kappa }}^{\gamma l} \end{aligned}$$
(4.3)

hold for every \(l\in {{\mathbb {Z}}}_{\ge 0}\), where the constant \({\tilde{\kappa }}\in (\kappa ,\,\infty )\) depends at most on \(\kappa \), s, and \(s_{\mathrm c}\). Hence, (4.1) with \(\beta :=q_{l}\ge q_{0}=s>s_{\mathrm c}\ge 2\) yields \(Y_{l+1}^{q_{l+1}}\le (AB^{l}Y_{l}^{q_{l}})^{\kappa } \) for all \(l\in {{\mathbb {Z}}}_{\ge 0}\), where \(A:=CR^{-2}\in (1,\,\infty )\) for some constant \(C\in (1,\,\infty )\), and \(B:=4{\tilde{\kappa }}^{\gamma }\in (1,\,\infty )\). By Lemma 2.2, we have

which completes the proof. \(\square \)
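
The choice of the exponents \(q_{l}\) in the proof above can be verified directly; the following is a short check that uses only the definitions \(\kappa =1+p/n\) and \(s_{\mathrm c}=n(2-p)/p\). Since \(\kappa s_{\mathrm c}=s_{\mathrm c}+2-p\), we have

$$\begin{aligned}\kappa q_{l}+p-2=\kappa ^{l+1}\mu +\kappa s_{\mathrm c}+p-2=\kappa ^{l+1}\mu +s_{\mathrm c}=q_{l+1},\end{aligned}$$

so the left-hand side of (4.1) with \(\beta =q_{l}\), \(r_{1}=R_{l+1}\), and \(r_{2}=R_{l}\) is exactly \(Y_{l+1}^{q_{l+1}}\). Moreover, by (4.3),

$$\begin{aligned}\frac{C(q_{l}-1)^{\gamma }}{(R_{l}-R_{l+1})^{2}}\le \frac{C(2\mu )^{\gamma }{\tilde{\kappa }}^{\gamma l}\,4^{l+2}}{R^{2}}=\frac{16C(2\mu )^{\gamma }}{R^{2}}\left( 4{\tilde{\kappa }}^{\gamma }\right) ^{l},\end{aligned}$$

which explains the choice \(B=4{\tilde{\kappa }}^{\gamma }\) and, after renaming the constant, the form \(A=CR^{-2}\).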

4.2 Comparison principle for parabolic approximate equations

We would like to show the comparison principle and the weak maximum principle for (3.8). The weak maximum principle implies that an approximate solution \(u_{\varepsilon }\) will be bounded if it admits a Dirichlet boundary datum in \(L^{\infty }\). Hence, combining with Proposition 4.1, we may let \(u_{\varepsilon }\) be locally bounded, which is used in Sect. 5.

We first fix some terminology.

Definition 4.2

Let u, \(v\in X^{p}(0,\,T;\,\Omega )\cap C^{0}([0,\,T];\,L^{2}(\Omega ))\).

  • (1) It is said that \(u\le v\) on \(\partial _{\textrm{p}}\Omega _{T}\), when there hold \((u-v)_{+}\in X_{0}^{p}(0,\,T;\,\Omega )\) and \((u-v)_{+}|_{t=0}=0\) in \(L^{2}(\Omega )\).

  • (2) A function u is called a weak subsolution to (3.8) in \(\Omega _{T}\) when

    $$\begin{aligned} \int _{0}^{T} \langle \partial _{t}u,\, \varphi \rangle _{{V_{0}^{\prime }},\,V_{0}}\,{\mathrm d}t+\iint _{\Omega _{T}}\left\langle \nabla E^{\varepsilon } (\nabla u)\,\,\bigg |\,\, \nabla \varphi \right\rangle \,{\mathrm d}x {\mathrm d}t\le 0 \end{aligned}$$
    (4.4)

    holds for all non-negative \(\varphi \in X_{0}^{p}(0,\,T;\,\Omega )\).

  • (3) A function v is called a weak supersolution to (3.8) in \(\Omega _{T}\) when

    $$\begin{aligned} \int _{0}^{T} \langle \partial _{t}v,\, \varphi \rangle _{{V_{0}^{\prime }},\,V_{0}}\,{\mathrm d}t+\iint _{\Omega _{T}}\left\langle \nabla E^{\varepsilon } (\nabla v)\,\,\bigg |\,\, \nabla \varphi \right\rangle \,{\mathrm d}x {\mathrm d}t\ge 0 \end{aligned}$$
    (4.5)

    holds for all non-negative \(\varphi \in X_{0}^{p}(0,\,T;\,\Omega )\).

Proposition 4.3

Let u and v be respectively a subsolution and a supersolution to (3.8). If \(u\le v\) on \(\partial _{\textrm{p}}\Omega _{T}\) holds in the sense of Definition 4.2, then \(u\le v\) a.e. in \(\Omega _{T}\).

Proof

For \(\delta \in (0,\,T/2)\), we define \(\phi _{\delta }\) as in (3.14) with \(T_{1}\) replaced by T. Thanks to the Steklov average, we may test \((u-v)_{+}\phi _{\delta }\) into (4.4)–(4.5). Integrating by parts, we have

$$\begin{aligned}0\ge -\frac{1}{2}\iint _{\Omega _{T}}(u-v)_{+}^{2}\,\partial _{t}\phi _{\delta }\,{\mathrm d}x{\mathrm d}t+\iint _{\Omega _{T}}\left\langle \nabla E^{\varepsilon } (\nabla u)-\nabla E^{\varepsilon } (\nabla v)\,\, \bigg |\,\,\nabla (u-v)_{+} \right\rangle \phi _{\delta }\,{\mathrm d}x {\mathrm d}t.\end{aligned}$$

Discarding the first integral and letting \(\delta \rightarrow 0\), we obtain

$$\begin{aligned}0 \ge \iint _{\Omega _{T}}\left\langle \nabla E^{\varepsilon } (\nabla u)-\nabla E^{\varepsilon } (\nabla v)\,\, \bigg |\,\,\nabla u-\nabla v \right\rangle \chi _{\{u>v\}} \,{\mathrm d}x {\mathrm d}t\end{aligned}$$

by Beppo Levi’s monotone convergence theorem. Since the mapping \({{\mathbb {R}}}^{n}\ni z\mapsto \nabla E^{\varepsilon } (z)\in {{\mathbb {R}}}^{n}\) is strictly monotone, the inequality above yields \(\nabla (u-v)_{+}=0\) a.e. in \(\Omega _{T}\). Recalling \((u-v)_{+}\in L^{p}(0,\,T;\,W_{0}^{1,\,p}(\Omega ))\), we have \(u\le v\) a.e. in \(\Omega _{T}\). \(\square \)
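
The sign of the discarded term in the proof above can be made explicit. By (3.14) with \(T_{1}\) replaced by T, we have \(\partial _{t}\phi _{\delta }=0\) on \((0,\,T-\delta )\) and \(\partial _{t}\phi _{\delta }=-\delta ^{-1}\) on \((T-\delta ,\,T)\). Hence, for any non-negative \(g\in L^{1}(\Omega _{T})\), where g is a placeholder for the squared positive part of \(u-v\),

$$\begin{aligned}-\iint _{\Omega _{T}}g\,\partial _{t}\phi _{\delta }\,{\mathrm d}x{\mathrm d}t=\frac{1}{\delta }\int _{T-\delta }^{T}\int _{\Omega }g\,{\mathrm d}x{\mathrm d}t\ge 0,\end{aligned}$$

which is why the first integral may be dropped before letting \(\delta \rightarrow 0\).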

As a consequence, we can deduce the following Corollary 4.4.

Corollary 4.4

Let \(u_{\star }\in L^{\infty }(\Omega _{T})\cap X_{0}^{p}(0,\,T;\,\Omega )\), and assume that \(u_{\varepsilon }\in u_{\star }+X_{0}^{p}(0,\,T;\,\Omega )\) is the weak solution of (3.8). Then, \(u_{\varepsilon }\in L^{\infty }(\Omega _{T})\), and

$$\begin{aligned}\Vert u_{\varepsilon }\Vert _{L^{\infty }(\Omega _{T})} \le \Vert u_{\star }\Vert _{L^{\infty }(\Omega _{T})}.\end{aligned}$$

Proof

We abbreviate \(M:=\Vert u_{\star }\Vert _{L^{\infty }(\Omega _{T})}\in [0,\,\infty )\). It is clear that the constant functions \(\pm M\) are weak solutions to (3.8). Since \(0\le (u_{\varepsilon }-M)_{+}\le (u_{\varepsilon }-u_{\star })_{+}\) holds a.e. in \(\Omega _{T}\), it is easy to check \(u_{\varepsilon }\le M\) on \(\partial _{\textrm{p}}\Omega _{T}\) in the sense of Definition 4.2 (see [16, Lemma 1.25]). Similarly, there holds \(-M\le u_{\varepsilon }\) on \(\partial _{\textrm{p}}\Omega _{T}\). By Proposition 4.3, we have \(-M\le u_{\varepsilon }\le M\) a.e. in \(\Omega _{T}\), which completes the proof. \(\square \)

Remark 4.5

The proofs of Propositions 3.3, 4.3, and Corollary 4.4 work even when the domain \(\Omega _{T}=\Omega \times (0,\,T)\) is replaced by its parabolic subcylinder \(Q_{R}(x_{0},\,t_{0})=B_{R}(x_{0})\times (t_{0}-R^{2},\,t_{0}]\Subset \Omega _{T}\). Therefore, we can apply these results with \(\Omega _{T}\) replaced by \(Q_{R}(x_{0},\,t_{0})\), which are to be used in the proof of Theorem 1.2.

5 Regularity for gradients of approximate solutions

In Sect. 5, we consider a bounded weak solution to (3.8) in a parabolic subcylinder \(\tilde{Q}\Subset \Omega _{T}\), and fix \(Q_{R}(x_{0},\,t_{0})=B_{R}(x_{0})\times (t_{0}-R^{2},\,t_{0}]\Subset {{\tilde{Q}}}\). Throughout Sects. 5–6, we assume

$$\begin{aligned} \mathop {\mathrm {ess~sup}}_{Q_{R}(x_{0},\,t_{0})}\,|u_{\varepsilon }|\le M_{0} \end{aligned}$$
(5.1)

for some \(M_{0}\in (0,\,\infty )\). We aim to prove that the gradient \(\nabla u_{\varepsilon }\) is locally in \(L^{q}\) for any \(q\in (p,\,\infty ]\), with an estimate that may depend on \(M_{0}\) but is independent of \(\varepsilon \in (0,\,1)\).

5.1 Weak formulations and energy estimates

As well as \(V_{\varepsilon }:=\sqrt{\varepsilon ^{2}+|\nabla u_{\varepsilon }|^{2}}\), we also consider another function \(W_{\varepsilon }\), defined as

$$\begin{aligned}W_{\varepsilon }:=\sqrt{1+\sum _{j=1}^{n}w_{\varepsilon ,\,j}^{2}}\le 1+V_{\varepsilon },\end{aligned}$$

where for each \(j\in \{\,1,\,\dots ,\,n\,\}\), we set

$$\begin{aligned}w_{\varepsilon ,\,j}:=(\partial _{x_{j}}u_{\varepsilon }-1)_{+}-(-\partial _{x_{j}}u_{\varepsilon }-1)_{+}.\end{aligned}$$

We note that \(V_{\varepsilon }\) and \(W_{\varepsilon }\) are compatible, in the sense that there hold

$$\begin{aligned} \left\{ \begin{array}{rcl} V_{\varepsilon } \le c_{n}W_{\varepsilon }\le c_{n}(1+V_{\varepsilon }) &{} \text {in} &{} Q=Q_{R},\\ W_{\varepsilon }\le \sqrt{2} V_{\varepsilon } &{} \text {in} &{}D\subset Q_{R}, \end{array} \right. \end{aligned}$$
(5.2)

where \(D:=\{(x,\,t)\in Q_{R}\mid |\nabla u_{\varepsilon }(x,\,t)|>1\}\) (see [25, §4.1]). In particular, we are allowed to use

$$\begin{aligned} {\hat{\lambda }}W_{\varepsilon }^{p-2}\textrm{id}_{n} \leqslant \nabla ^{2} E^{\varepsilon } (\nabla u_{\varepsilon }) \leqslant {\hat{\Lambda }}W_{\varepsilon }^{p-2}\textrm{id}_{n} \quad \text {in } D, \end{aligned}$$
(5.3)

where \(\hat{\lambda }=\hat{\lambda }(n,\,p,\,\lambda )\in (0,\,\lambda )\) and \(\hat{\Lambda }=\hat{\Lambda }(n,\,p,\,\Lambda ,\,K)\in (\Lambda ,\,\infty )\) are constants.
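
The relations (5.2) can be checked by elementary estimates. The following sketch gives one admissible constant, \(c_{n}=\sqrt{2n+1}\), which is an illustrative choice and not necessarily the one used in [25]. Since \(|w_{\varepsilon ,\,j}|=(|\partial _{x_{j}}u_{\varepsilon }|-1)_{+}\), we have \(|w_{\varepsilon ,\,j}|\le |\partial _{x_{j}}u_{\varepsilon }|\le 1+|w_{\varepsilon ,\,j}|\), and therefore

$$\begin{aligned}W_{\varepsilon }^{2}&=1+\sum _{j=1}^{n}w_{\varepsilon ,\,j}^{2}\le 1+|\nabla u_{\varepsilon }|^{2}\le (1+V_{\varepsilon })^{2},\\ V_{\varepsilon }^{2}&=\varepsilon ^{2}+|\nabla u_{\varepsilon }|^{2}\le 1+\sum _{j=1}^{n}(1+|w_{\varepsilon ,\,j}|)^{2}\le 1+2n+2\sum _{j=1}^{n}w_{\varepsilon ,\,j}^{2}\le (2n+1)W_{\varepsilon }^{2},\end{aligned}$$

where \(\varepsilon <1\) is used. On the set where \(|\nabla u_{\varepsilon }|>1\), the first line also gives \(W_{\varepsilon }^{2}\le 1+|\nabla u_{\varepsilon }|^{2}\le 2|\nabla u_{\varepsilon }|^{2}\le 2V_{\varepsilon }^{2}\).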

Combining with (5.1), we can carry out standard parabolic arguments, including the difference quotient method, Moser’s iteration, and De Giorgi’s truncation (see [8, Chapter VIII]). As a consequence, we are allowed to let

$$\begin{aligned} \nabla u_{\varepsilon }\in L^{\infty }(Q_{R};\,{{\mathbb {R}}}^{n}) \quad \text {and} \quad \nabla ^{2}u_{\varepsilon }\in L^{2}(Q_{R};\,{{\mathbb {R}}}^{n\times n}). \end{aligned}$$
(5.4)

Thanks to this improved regularity, there holds

$$\begin{aligned} -\iint _{Q_{R}} \partial _{x_{j}}u_{\varepsilon }\partial _{t}\varphi \,{\mathrm d}x{\mathrm d}t+\iint _{Q_{R}}\left\langle \nabla ^{2} E^{\varepsilon } (\nabla u_{\varepsilon })\nabla \partial _{x_{j}}u_{\varepsilon }\,\,\bigg |\,\,\nabla \varphi \right\rangle \,{\mathrm d}x{\mathrm d}t=0 \end{aligned}$$
(5.5)

for any \(\varphi \in C_{\mathrm c}^{1}(Q_{R})\). Moreover, we may extend the test function \(\varphi \) to the class \(X_{0}^{2}(I_{R};\,B_{R}):=\left\{ \varphi \in L^{2}(I_{R};\,W_{0}^{1,\,2}(B_{R}))\,\,\bigg |\,\,\partial _{t}\varphi \in L^{2}(I_{R};\,W^{-1,\,2}(B_{R}))\right\} \subset C(\overline{I_{R}};\,L^{2}(B_{R}))\) with \(\varphi |_{t=t_{0}-R^{2}}=\varphi |_{t=t_{0}}=0\) in \(L^{2}(B_{R})\). From (5.5), we deduce basic energy estimates concerning \(V_{\varepsilon }\) and \(W_{\varepsilon }\) (Lemma 5.1).

Lemma 5.1

For \(\alpha \in [0,\,\infty )\), \(M\in (1,\,\infty )\), let \(\psi _{\alpha ,\,M}\) and \({\tilde{\psi }}_{\alpha ,\,M}\) be given by (2.7) and (2.8) respectively. Let \(u_{\varepsilon }\) be a weak solution to (3.8) in \({{\tilde{Q}}}\). Fix \(Q_{R}(x_{0},\,t_{0})\Subset {{\tilde{Q}}}\), and let (5.4) be in force. Fix \(\eta \in C_{\mathrm c}^{1}(B_{R}(x_{0});\,[0,\,1])\), and \(\phi _{\textrm{c}}\in C^{1}([t_{0}-R^{2},\,t_{0}];\,[0,\,1])\) that satisfies \(\phi _{\mathrm c}(t_{0}-R^{2})=0\). Then, there hold

$$\begin{aligned}&\iint _{Q_{R}}V_{\varepsilon }^{p-2}\left( |\nabla ^{2} u_{\varepsilon }|^{2}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })+|\nabla V_{\varepsilon }|^{2}{\tilde{\psi }}_{\alpha ,\,M}^{\prime }(V_{\varepsilon })V_{\varepsilon } \right) \eta ^{2}\phi _{\textrm{c}}\,{\mathrm d}x{\mathrm d}t\nonumber \\&\le C\iint _{Q_{R}}\left( V_{\varepsilon }^{p}|\nabla \eta |^{2}+V_{\varepsilon }^{2}|\partial _{t}\phi _{\textrm{c}} |\right) {\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })\,{\mathrm d}x{\mathrm d}t, \end{aligned}$$
(5.6)

and

$$\begin{aligned}&\mathop {\mathrm {ess~sup}}_{\tau \in I_{R}}\int _{B_{R}\times \{\tau \}}{\Psi }_{\alpha ,\,M}(W_{\varepsilon })\eta ^{2}\phi _{\mathrm c}\,{\mathrm d}x\nonumber \\&\quad +\iint _{Q_{R}}W_{\varepsilon }^{p-2}\left( \psi _{\alpha ,\,M}(W_{\varepsilon })+\psi _{\alpha ,\,M}^{\prime }(W_{\varepsilon })W_{\varepsilon } \right) |\nabla W_{\varepsilon }|^{2}\eta ^{2}\phi _{\textrm{c}}\,{\mathrm d}x{\mathrm d}t\nonumber \\&\le C\iint _{Q_{R}}\left( W_{\varepsilon }^{p}|\nabla \eta |^{2}+W_{\varepsilon }^{2}|\partial _{t}\phi _{\textrm{c}} |\right) \psi _{\alpha ,\,M}(W_{\varepsilon })\,{\mathrm d}x{\mathrm d}t, \end{aligned}$$
(5.7)

where \(C\in (1,\,\infty )\) depends at most on n, p, \(\lambda \), \(\Lambda \), K.

Proof

Let \(\zeta \in C_{\mathrm c}^{1}(Q_{R})\) be non-negative, and assume that a composite function \(\psi :{{\mathbb {R}}}_{\ge 0}\rightarrow {{\mathbb {R}}}_{\ge 0}\) satisfies all the conditions given in Sect. 2.2. We set \(\phi :=\phi _{\textrm{c}}\phi _{\textrm{h}}\), where we choose an arbitrary Lipschitz function \(\phi _{\textrm{h}}:[t_{0}-R^{2},\,t_{0} ]\rightarrow [0,\,1]\) that is non-increasing and satisfies \(\phi _{\textrm{h}}(t_{0})=0\).

To prove (5.6), we test \(\varphi =\zeta \psi (V_{\varepsilon })\partial _{x_{j}}u_{\varepsilon }\) into (5.5). This function is admissible by the method of Steklov averages. Summing over \(j\in \{\,1,\,\dots \,,\,n\,\}\), we have

$$\begin{aligned}&-\iint _{Q_{R}}\Psi (V_{\varepsilon })\partial _{t}\zeta \,{\mathrm d}x{\mathrm d}t+\iint _{Q_{R}}\left\langle A_{\varepsilon }\nabla [\Psi (V_{\varepsilon })]\,\,\bigg |\,\,\nabla \zeta \right\rangle \,{\mathrm d}x {\mathrm d}t\nonumber \\&\qquad +\iint _{Q_{R}}\left[ \left\langle A_{\varepsilon }\nabla V_{\varepsilon } \,\,\bigg |\,\,\nabla V_{\varepsilon } \right\rangle \psi ^{\prime }(V_{\varepsilon })V_{\varepsilon }+\sum _{j=1}^{n}\left\langle A_{\varepsilon }\nabla \partial _{x_{j}}u_{\varepsilon } \,\,\bigg |\,\,\nabla \partial _{x_{j}}u_{\varepsilon } \right\rangle \psi (V_{\varepsilon })\right] \zeta \,{\mathrm d}x {\mathrm d}t\nonumber \\ {}&\quad =0, \end{aligned}$$
(5.8)

where \(A_{\varepsilon }:=\nabla ^{2} E^{\varepsilon } (\nabla u_{\varepsilon })\), and \(\Psi \) is defined as (2.5). Here we choose \(\psi :={\tilde{\psi }}_{\alpha ,\,M}\) and \(\zeta :=\eta ^{2}\phi \). Then, (5.8) yields

$$\begin{aligned}&-\iint _{Q_{R}}{\tilde{\Psi }}_{\alpha ,\,M}(V_{\varepsilon })\eta ^{2}\phi _{\mathrm c}\partial _{t}\phi _{\mathrm h}\,{\mathrm d}x{\mathrm d}t\\&\qquad +\lambda \iint _{Q_{R}}V_{\varepsilon }^{p-2}\left( {\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })|\nabla ^{2}u_{\varepsilon }|^{2} +{\tilde{\psi }}_{\alpha ,\,M}^{\prime }(V_{\varepsilon })V_{\varepsilon }|\nabla V_{\varepsilon }|^{2} \right) \eta ^{2}\phi _{\mathrm c}\phi _{\mathrm h}\,{\mathrm d}x{\mathrm d}t\\&\quad \le \iint _{Q_{R}}{\tilde{\Psi }}_{\alpha ,\,M}(V_{\varepsilon })\eta ^{2}\phi _{\mathrm h}\partial _{t}\phi _{\mathrm c}\,{\mathrm d}x{\mathrm d}t+2(\Lambda +K)\iint _{Q_{R}}V_{\varepsilon }^{p-1}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })|\nabla V_{\varepsilon }||\nabla \eta |\eta \phi \,{\mathrm d}x{\mathrm d}t\\&\quad \le \frac{\lambda }{2}\iint _{Q_{R}}V_{\varepsilon }^{p-2}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })|\nabla ^{2}u_{\varepsilon }|^{2}\eta ^{2}\phi _{\mathrm c}\phi _{\mathrm h}\,{\mathrm d}x{\mathrm d}t \\&\qquad +\frac{2(\Lambda +K)^{2}}{\lambda }\iint _{Q_{R}} V_{\varepsilon }^{p} {\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })|\nabla \eta |^{2}\,{\mathrm d}x{\mathrm d}t+\iint _{Q_{R}}V_{\varepsilon }^{2}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })|\partial _{t}\phi _{\mathrm c}|\,{\mathrm d}x{\mathrm d}t, \end{aligned}$$

where we have used \(|\nabla V_{\varepsilon }|\le |\nabla ^{2}u_{\varepsilon }|\), Young’s inequality, and (2.6). Discarding the first integral, and choosing \(\phi _{\mathrm h}\) suitably, we easily conclude (5.6).
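The application of Young's inequality in the last step can be written out as follows. This is only a sketch: the auxiliary quantities \(a\) and \(b\) are introduced here for illustration, and we assume \(0\le \phi =\phi _{\mathrm c}\phi _{\mathrm h}\le 1\), which is how the cut-off functions appear to be normalized. Setting

$$\begin{aligned}a:=V_{\varepsilon }^{\frac{p-2}{2}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })^{\frac{1}{2}}|\nabla V_{\varepsilon }|\eta \phi ^{\frac{1}{2}},\quad b:=(\Lambda +K)V_{\varepsilon }^{\frac{p}{2}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })^{\frac{1}{2}}|\nabla \eta |\phi ^{\frac{1}{2}},\end{aligned}$$

we may use the elementary inequality \(2ab\le \frac{\lambda }{2}a^{2}+\frac{2}{\lambda }b^{2}\) to get

$$\begin{aligned}&2(\Lambda +K)V_{\varepsilon }^{p-1}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })|\nabla V_{\varepsilon }||\nabla \eta |\eta \phi =2ab\\&\quad \le \frac{\lambda }{2}V_{\varepsilon }^{p-2}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })|\nabla ^{2}u_{\varepsilon }|^{2}\eta ^{2}\phi +\frac{2(\Lambda +K)^{2}}{\lambda }V_{\varepsilon }^{p}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })|\nabla \eta |^{2},\end{aligned}$$

where \(|\nabla V_{\varepsilon }|\le |\nabla ^{2}u_{\varepsilon }|\) and \(\phi \le 1\) are used in the last line. The first term on the right-hand side is half of the corresponding term on the left-hand side of the estimate, so it can be absorbed.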

To prove (5.7), we test \(\varphi =\zeta \psi (W_{\varepsilon })w_{\varepsilon ,\,j}\) into (5.5), where we let \(\psi :=\psi _{\alpha ,\,M}\) and \(\zeta :=\eta ^{2}\phi \). Since all the integrands are supported in \(\{|\partial _{x_{j}}u_{\varepsilon }|>1\}\subset D\), we may replace \(\nabla \partial _{x_{j}}u_{\varepsilon }\) by \(\nabla w_{\varepsilon ,\,j}\) and apply (5.3). By similar computations, we have

$$\begin{aligned}&-\iint _{Q_{R}}\Psi _{\alpha ,\,M}(W_{\varepsilon })\eta ^{2}\phi _{\mathrm c} \partial _{t}\phi _{\mathrm h}\,{\mathrm d}x{\mathrm d}t\\&\qquad +\frac{{\hat{\lambda }}}{2}\iint _{Q_{R}}W_{\varepsilon }^{p-2}\left( \psi _{\alpha ,\,M}(W_{\varepsilon })+\psi _{\alpha ,\,M}^{\prime }(W_{\varepsilon })W_{\varepsilon } \right) |\nabla W_{\varepsilon } |^{2}\,\eta ^{2}\phi _{\mathrm c}\phi _{\mathrm h}\,{\mathrm d}x{\mathrm d}t \\&\quad \le \frac{2{\hat{\Lambda }}^{2}}{{\hat{\lambda }}}\iint _{Q_{R}}W_{\varepsilon }^{p}\psi _{\alpha ,\,M}(W_{\varepsilon })|\nabla \eta |^{2}\,{\mathrm d}x{\mathrm d}t+\iint _{Q_{R}}W_{\varepsilon }^{2}\psi _{\alpha ,\,M}(W_{\varepsilon })|\partial _{t}\phi _{\mathrm c}|\,{\mathrm d}x{\mathrm d}t. \end{aligned}$$

Choosing \(\phi _{\mathrm h}\) suitably, we obtain (5.7). \(\square \)

5.2 Reversed Hölder inequalities and local gradient bounds

We would like to show \(L^{q}\)-bounds of \(\nabla u_{\varepsilon }\) for each \(q\in (p,\,\infty ]\), where the estimates depend on \(M_{0}\) but are uniform in \(\varepsilon \in (0,\,1)\).

The case \(q\in (p,\,\infty )\) is completed by (5.1) and (5.6).

Proposition 5.2

Let n and p satisfy (1.2). Assume that \(u_{\varepsilon }\) is a weak solution to (3.8) in \({{\tilde{Q}}}\). Fix \(Q_{R}(x_{0},\,t_{0})\Subset {{\tilde{Q}}}\) with \(R\in (0,\,1)\), and let (5.1) and (5.4) be in force. Then for each fixed \(q\in (p,\,\infty )\), there holds

$$\begin{aligned} \iint _{Q_{R/2}} V_{\varepsilon }^{q}\,{\mathrm d}x{\mathrm d}t \le C\iint _{Q_{R}}\left( V_{\varepsilon }^{p}+1\right) \,{\mathrm d}x{\mathrm d}t, \end{aligned}$$
(5.9)

where \(C\in (1,\,\infty )\) depends at most on n, p, q, \(\lambda \), \(\Lambda \), K, \(M_{0}\), and R.

Proof

It suffices to prove

$$\begin{aligned} \iint _{Q_{r_{1}}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })V_{\varepsilon }^{2}\,{\mathrm d}x{\mathrm d}t\le \frac{C(n,\,p,\,\lambda ,\,\Lambda ,\,K,\,\alpha ,\,M_{0})}{(r_{2}-r_{1})^{2}}\iint _{Q_{r_{2}}}\left( V_{\varepsilon }^{\alpha +p}+1\right) \,{\mathrm d}x{\mathrm d}t\nonumber \\ \end{aligned}$$
(5.10)

for \(\alpha \in [0,\,\infty )\), \(M\in (1,\,\infty )\), and \(r_{1},\,r_{2}\in (0,\,R]\) with \(r_{1}<r_{2}\), provided \(V_{\varepsilon }\in L^{\alpha +p}(Q_{r_{2}})\). In fact, letting \(M\rightarrow \infty \) in (5.10) and recalling (2.12), we have

$$\begin{aligned}&\iint _{Q_{r_{1}}}V_{\varepsilon }^{\alpha +2}\,{\mathrm d}x{\mathrm d}t \\&\quad \le \iint _{Q_{r_{1}}}\left( V_{\varepsilon }^{\alpha +p}+1\right) \,{\mathrm d}x{\mathrm d}t+\limsup _{M\rightarrow \infty }\iint _{Q_{r_{1}}} {\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })V_{\varepsilon }^{2}\,{\mathrm d}x{\mathrm d}t\\&\quad \le \frac{C(n,\,p,\,\lambda ,\,\Lambda ,\,K,\,\alpha ,\,M_{0})}{(r_{2}-r_{1})^{2}}\iint _{Q_{r_{2}}}\left( V_{\varepsilon }^{\alpha +p}+1\right) \,{\mathrm d}x{\mathrm d}t \end{aligned}$$

by Beppo Levi’s monotone convergence theorem. By this estimate and an iteration argument in finitely many steps, for any \(m\in {{\mathbb {N}}}\), we can deduce (5.9) with \(q=p+(2-p)m\). Therefore, by Hölder’s inequality, it is easy to verify (5.9) for arbitrary \(q\in (p,\,\infty )\).
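The finite iteration can be organized, for instance, as follows. This is a sketch under our own choice of radii; the symbols \(\alpha _{k}\), \(\rho _{k}\), and \(C_{k}\) are introduced only for this illustration, and we take \(V_{\varepsilon }\in L^{p}(Q_{R})\) for granted. For fixed \(m\in {{\mathbb {N}}}\), set

$$\begin{aligned}\alpha _{k}:=(2-p)k,\quad \rho _{k}:=R-\frac{kR}{2m}\quad \text {for}\quad k=0,\,1,\,\dots ,\,m,\end{aligned}$$

so that \(\rho _{0}=R\), \(\rho _{m}=R/2\), \(\rho _{k}-\rho _{k+1}=R/(2m)\), and \(\alpha _{k}+2=\alpha _{k+1}+p\). Applying the estimate above with \((\alpha ,\,r_{1},\,r_{2})=(\alpha _{k},\,\rho _{k+1},\,\rho _{k})\), and using \(Q_{\rho _{k+1}}\subset Q_{\rho _{k}}\) to handle the constant term, we get

$$\begin{aligned}\iint _{Q_{\rho _{k+1}}}\left( V_{\varepsilon }^{\alpha _{k+1}+p}+1\right) \,{\mathrm d}x{\mathrm d}t\le \left( 1+\frac{4m^{2}C_{k}}{R^{2}}\right) \iint _{Q_{\rho _{k}}}\left( V_{\varepsilon }^{\alpha _{k}+p}+1\right) \,{\mathrm d}x{\mathrm d}t,\end{aligned}$$

where \(C_{k}\) denotes the constant corresponding to \(\alpha =\alpha _{k}\). In particular, the integrability requirement \(V_{\varepsilon }\in L^{\alpha _{k}+p}(Q_{\rho _{k}})\) is passed on from one step to the next, starting from \(\alpha _{0}+p=p\). Multiplying these inequalities over \(k=0,\,\dots ,\,m-1\) gives

$$\begin{aligned}\iint _{Q_{R/2}}V_{\varepsilon }^{p+(2-p)m}\,{\mathrm d}x{\mathrm d}t\le \prod _{k=0}^{m-1}\left( 1+\frac{4m^{2}C_{k}}{R^{2}}\right) \iint _{Q_{R}}\left( V_{\varepsilon }^{p}+1\right) \,{\mathrm d}x{\mathrm d}t.\end{aligned}$$

For an intermediate exponent \(q\in (p,\,p+(2-p)m]\), one possible way to conclude is the pointwise bound \(V_{\varepsilon }^{q}\le V_{\varepsilon }^{p+(2-p)m}+1\).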

To prove (5.10), we choose \(\eta \) and \(\phi _{\mathrm c}\) satisfying (4.2). Integrating by parts, we obtain

$$\begin{aligned}&\iint _{Q_{r_{1}}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })V_{\varepsilon }^{2}\,{\mathrm d}x{\mathrm d}t\le \iint _{Q_{r_{2}}} {\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon }) V_{\varepsilon }^{2} \eta ^{2}\phi _{\mathrm c}\,{\mathrm d}x{\mathrm d}t \\&\quad \le \iint _{Q_{r_{2}}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon }) \eta ^{2}\phi _{\mathrm c}\,{\mathrm d}x{\mathrm d}t +C_{n}\iint _{Q_{r_{2}}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon }) |u_{\varepsilon }||\nabla \eta |\eta \phi _{\mathrm c}\,{\mathrm d}x{\mathrm d}t\\&\qquad +C_{n}\iint _{Q_{r_{2}}}\left( {\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })|\nabla ^{2}u_{\varepsilon }|+{\tilde{\psi }}_{\alpha ,\,M}^{\prime }(V_{\varepsilon })|\nabla V_{\varepsilon }|V_{\varepsilon }\right) |u_{\varepsilon }|\eta ^{2}\phi _{\mathrm c}\,{\mathrm d}x{\mathrm d}t\\&\quad \le \frac{C(n,\,M_{0})}{r_{2}-r_{1}}\iint _{Q_{r_{2}}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })\,{\mathrm d}x{\mathrm d}t \\&\qquad +C_{n}M_{0}\left[ \iint _{Q_{r_{2}}}V_{\varepsilon }^{p-2}\left( |\nabla ^{2} u_{\varepsilon }|^{2}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })+|\nabla V_{\varepsilon }|^{2}{\tilde{\psi }}_{\alpha ,\,M}^{\prime }(V_{\varepsilon })V_{\varepsilon } \right) \eta ^{2}\phi _{\textrm{c}}\,{\mathrm d}x{\mathrm d}t \right] ^{1/2}\\&\qquad \cdot \left[ \iint _{Q_{r_{2}}}V_{\varepsilon }^{2-p}\left( {\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })+{\tilde{\psi }}_{\alpha ,\,M}^{\prime }(V_{\varepsilon })V_{\varepsilon } \right) \eta ^{2}\phi _{\mathrm c}\,{\mathrm d}x{\mathrm d}t \right] ^{1/2}, \end{aligned}$$

where we have used (5.1) and the Cauchy–Schwarz inequality. By (2.11), (4.2), (5.6) and Young’s inequality, we have

$$\begin{aligned}&\iint _{Q_{r_{1}}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })V_{\varepsilon }^{2}\,{\mathrm d}x{\mathrm d}t\\&\quad \le \frac{1}{4}\iint _{Q_{r_{2}}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })V_{\varepsilon }^{2}\,{\mathrm d}x{\mathrm d}t+\frac{C}{(r_{2}-r_{1})^{2}}\iint _{Q_{r_{2}}}\left( {\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })\left( V_{\varepsilon }^{p}+1\right) +V_{\varepsilon }^{\alpha +2-p}\right) \,{\mathrm d}x{\mathrm d}t\\&\quad \le \frac{1}{4}\iint _{Q_{r_{2}}}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })V_{\varepsilon }^{2}\,{\mathrm d}x{\mathrm d}t+\frac{C}{(r_{2}-r_{1})^{2}}\iint _{Q_{r_{2}}}\left( V_{\varepsilon }^{\alpha +p}+1\right) \,{\mathrm d}x{\mathrm d}t, \end{aligned}$$

where the constant \(C\in (1,\,\infty )\) depends at most on n, p, \(\lambda \), \(\Lambda \), K, \(\alpha \), and \(M_{0}\). The desired estimate (5.10) follows from Lemma 2.1. \(\square \)
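The absorption of the term with the factor \(1/4\) can be sketched as follows. We assume here that Lemma 2.1 is the standard iteration lemma in the form below; its precise statement is not reproduced in this section, and the symbols \(f\), \(\theta \), \(A\), \(s\), \(t\) are ours. Suppose that a bounded function \(f:[r_{1},\,r_{2}]\rightarrow [0,\,\infty )\) satisfies

$$\begin{aligned}f(s)\le \theta f(t)+\frac{A}{(t-s)^{2}}\quad \text {for all}\quad r_{1}\le s<t\le r_{2}\end{aligned}$$

with some \(\theta \in [0,\,1)\) and \(A\in [0,\,\infty )\). Then \(f(r_{1})\le C(\theta )A(r_{2}-r_{1})^{-2}\). In the present setting, one would apply this with

$$\begin{aligned}f(\rho ):=\iint _{Q_{\rho }}{\tilde{\psi }}_{\alpha ,\,M}(V_{\varepsilon })V_{\varepsilon }^{2}\,{\mathrm d}x{\mathrm d}t,\quad \theta :=\frac{1}{4},\quad A:=C\iint _{Q_{r_{2}}}\left( V_{\varepsilon }^{\alpha +p}+1\right) \,{\mathrm d}x{\mathrm d}t,\end{aligned}$$

where the last displayed estimate is used with \((r_{1},\,r_{2})\) replaced by \((s,\,t)\), and the integral over \(Q_{t}\) on its right-hand side is enlarged to \(Q_{r_{2}}\).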

From (5.7), we complete the case \(q=\infty \) by Moser’s iteration.

Proposition 5.3

Let n and p satisfy (1.2). Fix an exponent q satisfying

$$\begin{aligned}q_{\mathrm c}:=\frac{n(2-p)}{2}<q<\infty ,\quad \text {and}\quad q\ge 2.\end{aligned}$$

Let \(u_{\varepsilon }\) be a weak solution to (3.8) in \({{\tilde{Q}}}\). Fix \(Q_{R}(x_{0},\,t_{0})\Subset {{\tilde{Q}}}\) with \(R\in (0,\,1)\), and let (5.4) be in force. Then, there holds

(5.11)

Here the constant \(C\in (1,\,\infty )\) depends at most on n, p, q, \(\lambda \), \(\Lambda \), and K.

Proof

To prove (5.11), we claim that for every \(\beta \in [2,\,\infty )\), there holds

$$\begin{aligned} \iint _{Q_{r_{1}}}W_{\varepsilon }^{\kappa \beta +p-2}\,{\mathrm d}x{\mathrm d}t \le \left[ \frac{C(\beta -1)^{\gamma }}{(r_{2}-r_{1})^{2}} \iint _{Q_{r_{2}}}W_{\varepsilon }^{\beta }\,{\mathrm d}x{\mathrm d}t\right] ^{\kappa }, \end{aligned}$$
(5.12)

where \(\kappa :=1+2/n\), \(\gamma :=2(1+1/(n+2))\), and the constant \(C\in (1,\,\infty )\) depends at most on n, p, \(\lambda \), \(\Lambda \), and K. Fix \(r_{1},\,r_{2}\in (0,\,R]\) with \(r_{1}<r_{2}\), and choose \(\eta \) and \(\phi _{\textrm{c}}\) satisfying (4.2). By (2.10) with \(r=2\), (5.7), Hölder’s inequality and the continuous embedding \(W_{0}^{1,\,2}(B_{r_{2}})\hookrightarrow L^{\frac{2n}{n-2}}(B_{r_{2}})\), we obtain

$$\begin{aligned}&\iint _{Q_{r_{1}}}\Psi _{\alpha ,\,M}(W_{\varepsilon })^{\frac{2}{n}}\psi _{\alpha ,\,M}(W_{\varepsilon })W_{\varepsilon }^{p}\,{\mathrm d}x{\mathrm d}t \\&\quad \le C_{n}\left( \mathop {\mathrm {ess~sup}}_{t_{0}-r_{2}^{2}<\tau <t_{0}}\int _{B_{r_{2}}\times \{\tau \}}\Psi _{\alpha ,\,M}(W_{\varepsilon })\eta ^{2}\phi _{\mathrm c}\,{\mathrm d}x\right) ^{\frac{2}{n}} \iint _{Q_{r_{2}}}\left|\nabla \left( \eta \psi _{\alpha ,\,M}(W_{\varepsilon })^{\frac{1}{2}}W_{\varepsilon }^{\frac{p}{2}}\right) \right|^{2}\phi _{\mathrm c} \,{\mathrm d}x{\mathrm d}t\\&\quad \le C(n,\,p,\,\lambda ,\,\Lambda ,\,K)\left[ \frac{(1+\alpha )^{2}}{(r_{2}-r_{1})^{2}}\iint _{Q_{r_{2}}}\psi _{\alpha ,\,M}(W_{\varepsilon })W_{\varepsilon }^{2}\,{\mathrm d}x{\mathrm d}t\right] ^{\kappa }, \end{aligned}$$

where we note \(W_{\varepsilon }\ge 1\) and therefore \(W_{\varepsilon }^{p}\le W_{\varepsilon }^{2}\). Letting \(M\rightarrow \infty \) and recalling (2.9), we conclude (5.12) by Beppo Levi’s monotone convergence theorem.

We set the sequences \(\{q_{l}\}_{l=0}^{\infty }\subset [q,\,\infty )\), \(\{R_{l}\}_{l=0}^{\infty }\subset (R/2,\,R]\), \(\{Y_{l}\}_{l=0}^{\infty }\subset {{\mathbb {R}}}_{\ge 0}\) as

$$\begin{aligned}q_{l}:=\mu \kappa ^{l}+q_{\mathrm c},\quad R_{l}:=\frac{1+2^{-l}}{2}R,\quad \text {and}\quad Y_{l}:=\left( \iint _{Q_{R_{l}}}W_{\varepsilon }^{q_{l}}\,{\mathrm d}x{\mathrm d}t\right) ^{1/q_{l}}\end{aligned}$$

for each \(l\in {{\mathbb {Z}}}_{\ge 0}\), where \(\mu :=q-q_{\mathrm c}\). Then, similarly to Proposition 4.1, we can find a constant \({\tilde{\kappa }}={\tilde{\kappa }}(\kappa ,\,q,\,q_{\mathrm c})\in (\kappa ,\,\infty )\) such that (4.3) holds for every \(l\in {{\mathbb {Z}}}_{\ge 0}\). Hence, (5.12) with \(\beta :=q_{l}\ge q_{0}=q\ge 2\) yields

$$\begin{aligned}Y_{l+1}^{q_{l+1}}\le \left( AB^{l}Y_{l}^{q_{l}}\right) ^{\kappa },\quad q_{l}\ge \mu \left( \kappa ^{l}-1\right) \quad \text {for all }l\in {{\mathbb {Z}}}_{\ge 0},\end{aligned}$$

where we set \(A:=CR^{-2}\in (1,\,\infty )\) for some constant \(C\in (1,\,\infty )\), and \(B:=4{\tilde{\kappa }}^{\gamma }\in (1,\,\infty )\). By applying Lemma 2.2 and recalling (5.2), we have

which completes the proof. \(\square \)
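The choice of the exponents \(q_{l}\) in the proof above can be checked by a short computation, given here as a sketch. Since \(\kappa =1+2/n\) and \(q_{\mathrm c}=n(2-p)/2\), we have

$$\begin{aligned}\kappa q_{\mathrm c}=q_{\mathrm c}+(2-p),\quad \text {and hence}\quad \kappa q_{l}+p-2=\mu \kappa ^{l+1}+\kappa q_{\mathrm c}+p-2=\mu \kappa ^{l+1}+q_{\mathrm c}=q_{l+1}.\end{aligned}$$

Therefore, the left-hand side of (5.12) with \(\beta =q_{l}\), \(r_{1}=R_{l+1}\), and \(r_{2}=R_{l}\) is exactly \(Y_{l+1}^{q_{l+1}}\). For the radii, we compute

$$\begin{aligned}R_{l}-R_{l+1}=2^{-(l+2)}R,\quad \text {so that}\quad \frac{1}{(R_{l}-R_{l+1})^{2}}=\frac{16\cdot 4^{l}}{R^{2}}.\end{aligned}$$

The factor \(4^{l}\), combined with a bound of the form \((q_{l}-1)^{\gamma }\le C{\tilde{\kappa }}^{\gamma l}\), accounts for the choices \(A=CR^{-2}\) and \(B=4{\tilde{\kappa }}^{\gamma }\). Here we assume that (4.3) provides a geometric bound of this type on \(q_{l}\); its exact form is not restated in this section.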

Section 5 is completed by showing Theorem 5.4.

Theorem 5.4

Let n and p satisfy (1.2). Assume that \(u_{\varepsilon }\) is a weak solution to (3.8) in \({{\tilde{Q}}}\). Fix \(Q_{R}(x_{0},\,t_{0})\Subset {{\tilde{Q}}}\) with \(R\in (0,\,1)\), and let (5.1), (5.4), and \(\Vert \nabla u_{\varepsilon }\Vert _{L^{p}(Q_{R})}\le M_{1}\) be in force. Here the constant \(M_{1}\in (1,\,\infty )\) is independent of \(\varepsilon \in (0,\,1)\). Then, for each \(r\in (0,\,R)\), there exists a constant \(C\in (1,\,\infty )\), depending at most on n, p, \(\lambda \), \(\Lambda \), K, \(M_{0}\), \(M_{1}\), R, and r, such that

$$\begin{aligned}\mathop {\mathrm {ess~sup}}_{Q_{r}}V_{\varepsilon }\le C.\end{aligned}$$

Proof

Choose and fix \(q\in (q_{\mathrm c},\,\infty )\cap [2,\,\infty )\). By Proposition 5.2, there exists a constant \({{\tilde{C}}}\in (1,\,\infty )\), depending at most on n, p, q, \(\lambda \), \(\Lambda \), K, \(M_{0}\), \(M_{1}\), R, and r, such that we have

$$\begin{aligned}\Vert V_{\varepsilon } \Vert _{L^{q}(Q_{{\tilde{R}}})}\le {{\tilde{C}}},\quad \text {where}\quad {{\tilde{R}}}:=\frac{R+r}{2}.\end{aligned}$$

Combining this bound with Proposition 5.3, we conclude Theorem 5.4. \(\square \)

6 The proof of main theorem

6.1 A priori continuity estimates of truncated gradients

We state a basic result on a priori Hölder estimates.

Theorem 6.1

Fix \(\delta \in (0,\,1)\), and let \(\varepsilon \in (0,\,\delta /8)\). Assume that \(u_{\varepsilon }\) is a weak solution to (3.8) in \({{\tilde{Q}}}\), and let (5.4) be in force for a fixed \(Q_{R}=Q_{R}(x_{0},\,t_{0})\Subset {{\tilde{Q}}}\). Also, let the positive number \(\mu _{0}\) satisfy

$$\begin{aligned} \mathop {\mathrm {ess~sup}}_{Q_{r}} V_{\varepsilon }\le \delta +\mu _{0}, \end{aligned}$$
(6.1)

where \(Q_{r}=Q_{r}(x_{*},\,t_{*})\Subset Q_{R}\) with \(r\in (0,\,1)\). Then, there hold

$$\begin{aligned}\left|{{\mathcal {G}}}_{2\delta ,\,\varepsilon }(\nabla u_{\varepsilon }) \right|\le \mu _{0}\quad \text {in}\quad Q_{r_{0}}(x_{*},\,t_{*}),\end{aligned}$$

and

$$\begin{aligned}\left|{{\mathcal {G}}}_{2\delta ,\,\varepsilon }(\nabla u_{\varepsilon }(X_{1}))-{{\mathcal {G}}}_{2\delta ,\,\varepsilon }(\nabla u_{\varepsilon }(X_{2})) \right|\le C\left( \frac{d_{\textrm{p}}(X_{1},X_{2})}{r_{0}}\right) ^{\alpha }\mu _{0}\end{aligned}$$

for all \(X_{1}=(x_{1},\,t_{1})\), \(X_{2}=(x_{2},\,t_{2})\in Q_{r_{0}/2}(x_{*},\,t_{*})\). Here the radius \(r_{0}\in (0,\,r/4)\), the exponent \(\alpha \in (0,\,1)\), and the constant \(C\in (1,\,\infty )\) depend at most on n, p, \(\lambda \), \(\Lambda \), K, \(\mu _{0}\), and \(\delta \).

As long as (6.1) is guaranteed, Theorem 6.1 is shown as a special case of [\(z\mapsto \nabla ^{2} E^{\varepsilon } (z)\) over an annulus region \(\{\delta \le |z|\le M\}\) for some fixed constant M. Hence, (1.15) and (1.16), which are never used in Sects. 3–5, are required in the proof of [28, Proposition 2.10].

In any possible cases, it is proved that the limit

exists for every \((x_{0},\,t_{0})\in Q_{r_{0}}(x_{*},\,t_{*})\). Here the radius \(r_{0}\) and the exponent \(\alpha \) are determined by [28, Propositions 2.9–2.10], depending on n, p, \(\lambda \), \(\Lambda \), K, \(\omega _{1}\), \(\omega _{p}\), \(\mu _{0}\), and \(\delta \). Moreover, this limit satisfies

for any radius \(\rho \in (0,\,r_{0}]\). Theorem 6.1 follows from this growth estimate.

To carry out classical arguments, including De Giorgi’s truncation and a comparison argument in [28, Propositions 2.9–2.10], we often use the assumption \(\delta <\mu \). Here \(\mu \) is a positive parameter that satisfies

$$\begin{aligned}\mathop {\mathrm {ess~sup}}_{Q}\,|{{\mathcal {G}}}_{\delta ,\,\varepsilon }(\nabla u_{\varepsilon })|\le \mu \end{aligned}$$

for a fixed cylinder \(Q\Subset Q_{r}(x_{*},\,t_{*})\). The condition \(\delta <\mu \) is not restrictive, since otherwise \({{\mathcal {G}}}_{2\delta ,\,\varepsilon }(\nabla u_{\varepsilon })\equiv 0\) in Q, and hence there is nothing to show in the proof of Theorem 6.1. This truncation trick is substantially different from the intrinsic scaling arguments. It should be emphasized that the intrinsic scaling argument relies on some uniform parabolicity of the p-Laplace operator, which can be measured by an analogous ratio defined as in the left-hand side of (1.7). In contrast, for (3.8), the uniform parabolicity basically depends on the value of \(V_{\varepsilon }\). For this reason, in the proof of [28, Theorem 2.8 and Propositions 2.9–2.10], we avoid rescaling a weak solution \(u_{\varepsilon }\). Instead, we make the truncation with respect to the modulus \(V_{\varepsilon }\), which enables us to treat (3.8) as if it were some sort of uniformly parabolic equation, depending on the truncation parameter \(\delta \in (0,\,1)\). This is justified by the assumption \(\delta <\mu \), which plays an important role in deducing non-trivial energy estimates [28, Lemmata 3.2–3.3].

6.2 Proof of main theorem

We conclude the paper by giving the proof of Theorem 1.2.

Proof

We fix parabolic cylinders \(Q_{(0)}\Subset Q_{(1)}\Subset Q_{(2)} \Subset Q_{(3)} \Subset Q_{(4)}\Subset \Omega _{T}\) arbitrarily, and we claim the local Hölder continuity of \({{\mathcal {G}}}_{2\delta }(\nabla u)\) in \(Q_{(0)}\). We abbreviate

$$\begin{aligned}d:=\min \left\{ \,\mathop {\textrm{dist}_{\mathrm p}}\left( \partial _{\textrm{p}}Q_{(k+1)},\,\partial _{\textrm{p}}Q_{(k)}\right) \,\,\bigg |\,\, k=0,\,1,\,2,\,3\,\right\} >0.\end{aligned}$$

By Proposition 4.1, we have \(\Vert u\Vert _{L^{\infty }(Q_{(3)})}\le M_{0}\), where the constant \(M_{0}\in (1,\,\infty )\) depends at most on n, p, s, \(\Vert u\Vert _{L^{s}(Q_{(4)})}\), and d. For each fixed \(\varepsilon \in (0,\,\delta /8)\), let \(u_{\varepsilon }\) be the weak solution of (3.9) with \(\Omega _{T}\) replaced by \(Q_{(3)}\). By Proposition 3.3, there exists a constant \(M_{1}\) such that we have

$$\begin{aligned}\Vert \nabla u_{\varepsilon }\Vert _{L^{p}(Q_{(2)})}\le M_{1}.\end{aligned}$$

Also, we may choose a sequence \(\{\varepsilon _{j}\}_{j}\), satisfying \(\varepsilon _{j}\rightarrow 0\) as \(j\rightarrow \infty \), such that \(\nabla u_{\varepsilon _{j}}\rightarrow \nabla u\) a.e. in \(Q_{(2)}\). In particular, it follows that

$$\begin{aligned} {{\mathcal {G}}}_{2\delta ,\,\varepsilon _{j}}(\nabla u_{\varepsilon _{j}})\rightarrow {{\mathcal {G}}}_{2\delta }(\nabla u) \end{aligned}$$
(6.2)

a.e. in \(Q_{(2)}\). We should note \(\Vert u_{\varepsilon }\Vert _{L^{\infty }(Q_{(2)})}\le M_{0}\) by Corollary 4.4. Therefore, we can apply Theorem 5.4 to find a constant \(\mu _{0}\in (1,\,\infty )\), which depends at most on n, p, \(\lambda \), \(\Lambda \), K, \(M_{0}\), \(M_{1}\), and d, such that (6.1) holds for any \(Q_{r}(x_{*},\,t_{*})\Subset Q_{(1)}\) with \(r\in (0,\,1)\). By Theorem 6.1 and a standard covering argument, we can apply the Arzelà–Ascoli theorem to \({{\mathcal {G}}}_{2\delta ,\,\varepsilon }(\nabla u_{\varepsilon })\in C^{0}(Q_{(0)};\,{{\mathbb {R}}}^{n})\). As a consequence, we may assume that the convergence (6.2) holds uniformly in \(Q_{(0)}\) by taking a subsequence if necessary. Hence, for each fixed \(\delta \in (0,\,1)\), \({{\mathcal {G}}}_{2\delta }(\nabla u)\) is Hölder continuous in \(Q_{(0)}\Subset \Omega _{T}\), which completes the proof of \({{\mathcal {G}}}_{2\delta }(\nabla u)\in C^{0}(\Omega _{T};\,{{\mathbb {R}}}^{n})\).

By the definition of \({{\mathcal {G}}}_{\delta }\), it is easy to check that \(\{{{\mathcal {G}}}_{2\delta }(\nabla u)\}_{\delta \in (0,\,1)}\subset C^{0}(\Omega _{T};\,{{\mathbb {R}}}^{n})\) is a Cauchy net, and therefore it has a uniform limit \(v_{0}\in C^{0}(\Omega _{T};\,{{\mathbb {R}}}^{n})\) as \(\delta \rightarrow 0\). Combining this with \({{\mathcal {G}}}_{2\delta }(\nabla u)\rightarrow \nabla u\) a.e. in \(\Omega _{T}\), we conclude \(\nabla u=v_{0}\in C^{0}(\Omega _{T};\,{{\mathbb {R}}}^{n})\). \(\square \)
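The Cauchy property used in the last step can be verified as follows. This is a sketch which assumes that the truncation mapping has the form \({{\mathcal {G}}}_{\delta }(z)=(|z|-\delta )_{+}z/|z|\) for \(z\ne 0\) and \({{\mathcal {G}}}_{\delta }(0)=0\); the definition is not restated in this section. Since \(\sigma \mapsto (\sigma )_{+}\) is 1-Lipschitz, for all \(\delta _{1},\,\delta _{2}\in (0,\,1)\) and \(z\in {{\mathbb {R}}}^{n}\) we have

$$\begin{aligned}\left|{{\mathcal {G}}}_{2\delta _{1}}(z)-{{\mathcal {G}}}_{2\delta _{2}}(z)\right|=\left|(|z|-2\delta _{1})_{+}-(|z|-2\delta _{2})_{+}\right|\le 2|\delta _{1}-\delta _{2}|.\end{aligned}$$

Taking \(z=\nabla u\) gives a bound that is uniform in \(\Omega _{T}\), which is the Cauchy property. Under the same assumption, \(|{{\mathcal {G}}}_{2\delta }(z)-z|=\min \{\,|z|,\,2\delta \,\}\le 2\delta \), which yields the pointwise convergence \({{\mathcal {G}}}_{2\delta }(\nabla u)\rightarrow \nabla u\) as \(\delta \rightarrow 0\).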