Introduction

Martensitic materials are well known to exhibit rich microstructures that are associated with the solid–solid phase transformation [1, 2]. It is widely believed that the formation of those patterns leads to an energy barrier which plays a crucial role in the hysteresis and reversibility of the phase transformation itself [3, 4]. Mathematically, these microstructures are typically modelled within the framework of the phenomenological theory of martensite [5], i.e., as minimizers of continuum elasticity functionals of the form

$$\begin{aligned} \int _{\Omega }W(\nabla u,T)\,\mathrm{{d}}x, \end{aligned}$$
(1.1)

where \(\Omega\) denotes the reference domain and W is a free elastic energy density which reflects the crystallography of the phase transformation and depends on the deformation gradient and the temperature. Here we work at fixed temperature below the transformation temperature, and we drop it from the notation. This modelling ansatz has proven successful in predicting certain properties of the microstructures such as orientations of interfaces. However, it is often not sufficient to predict length scales of multiscale patterns. Therefore, one often introduces a regularizing surface energy term that penalizes interfaces between regions of different martensitic variants. Starting with the work by Kohn and Müller in the early 90s [6, 7], there has been a huge body of literature on the understanding of minimizers of such functionals in simplified settings, in particular scaling laws for the minimal energy (see for example [8,9,10,11,12] and references therein). These results generally indicate that for certain materials one expects rather uniform structures close to interfaces, while for others, complicated multiscale branching patterns are predicted. This is in accordance with experimental results (see e.g. [13]). However, results on more quantitative properties of minimizers, such as periodicity or self-similarity, are available only in very few restricted situations (see [14,15,16]).

In this note, we report results on the modelling and quantitative study of special microstructures, so-called needle-type patterns, building in particular on the recent works [17,18,19]. Needle-like patterns consist of laminated structures where one variant thins out close to the interface. These patterns are often found in experiments, and appear to be rather stable. It is remarkable that they occur in very different materials, including shape-memory alloys (for instance, NiAl) and perovskites (for instance, YBaCuO), at very different length-scales ranging from a few nanometers to tenths of micrometers, and at both martensite/martensite and martensite/austenite interfaces (see e.g. [13, 20,21,22]).

It has been found that models based on finite elasticity are well-suited to describe such needle patterns, while linearized elasticity is not able to predict the relevant length scales. This has been observed in various numerical experiments (see [17] and the references discussed there) and has been made rigorous in terms of an energy scaling law for the minimal energy [18]. Therefore, we deal only with numerical experiments in the context of finite elasticity. We focus on three materials in which needle-like patterns have often been observed, namely NiAl, CuAlNi and YBaCuO. We study the dependence of the needle shape on various parameters, in particular the elastic constants (the anisotropy ratio and the Poisson ratio), the eigenstrain, and the volume fraction of the laminate. To obtain quantitative results, a careful and physically reasonable choice of the free energy density W is essential. Therefore, after setting the notation in section “Geometry of Needles”, we will discuss our choice of the free energy density in section “Energy Density”. The numerical implementation is described in section “Shape Optimization, Numerical Implementation”, and the results are presented in section “Results”.

Geometry of Needles

We focus on the structure of needle-type interfaces, and make an ansatz which in particular excludes branching of the needles. It is widely believed that branched patterns of two martensitic variants are favored if the transition layer is sufficiently wide compared to the height of the laminate, see e.g. [13]. This can be seen already in the following back-of-the-envelope computation for scalar-valued toy models [7]: Assume that a planar microstructure is mainly described by the out-of-plane component of the displacement \(u:R\rightarrow \mathbb {R}\) in a rectangular transition layer \(R:=(0,L)\times (0,H)\), where we assume the interface to be at the left edge of the rectangle (at \(\{x_1=0\}\)). Within this ansatz, an energy based on the phenomenological theory from [5] is essentially estimated by the Kohn–Müller-type energy (see e.g. [23] for a computation)

$$\begin{aligned} J(u)= & {} \int _R(\partial _1u)^2+\min \left\{ (\partial _2u-\theta )^2,\left( \partial _2u+1-\theta \right) ^2\right\} \,\mathrm{{d}}x\nonumber \\{} & {} +\varepsilon |\partial _2\partial _2u|(R), \end{aligned}$$

where the preferred gradients \((0,\theta )\) and \((0,-1+\theta )\) represent the two variants of martensite, and the second term is to be understood distributionally. Roughly speaking, for functions u with \(\partial _2u\in \{-1+\theta ,\theta \}\) almost everywhere, this term can be computed by integrating \(N(x_1)\) over (0, L), where \(N(x_1)\) denotes the number of interfaces on the slice \(\{x_1\}\times (0,H)\) (the number of times that \(\partial _2u\) flips from \(\theta\) to \(-1+\theta\) or vice versa). To model the mesoscopic interface, we assign a boundary condition \(u(0,\cdot )=0\), and to model the laminate away from the interface, we assign periodic boundary conditions in vertical direction \(u(\cdot ,0)=u(\cdot ,H)\) and a Dirichlet boundary condition corresponding to a simple laminate, i.e.,

$$\begin{aligned} u(L, x_2)={\left\{ \begin{array}{ll} \theta x_2&{}\text { for } x_2\in (0,1-\theta ),\\ (-1+\theta )x_2+1-\theta &{}\text { for } x_2\in (1-\theta ,1). \end{array}\right. } \end{aligned}$$

There is a large body of literature on this model, and the scaling law of the minimal energy in various settings is well-understood. It has been shown that many properties of more general vectorial models (see e.g. [8, 10, 11, 24] and the references therein) are already captured by such toy models. We refer to [25] for a recent discussion in relation with experimental images.

The test functions that have been used to prove the upper bound in the energy scaling laws quantify the expectation that for small values of the surface energy \(\varepsilon\), branched patterns are expected, while for large values of \(\varepsilon\), optimal structures are more uniform. We focus here on the case of small \(\varepsilon\) where complex pattern formation is expected. More precisely, within the model described above, for branching-type patterns with periodic boundary conditions (see the sketch in Fig. 1) the energy scales like \(\varepsilon L+\frac{\theta ^2H^3}{L}\) (see e.g. [10, Lemma 5.1] with \(\eta =\theta H\) for a building block and the branching-type constructions described thereafter, as well as the references given there). On the other hand, simple needle-type structures have an energy scaling like \(\varepsilon L+\frac{\theta ^2H^3}{\ell }+\theta ^2\ell H\), where \(\ell\) is the tapering length of the needle, or equivalently of the transition region (see also [4, 23] for an analogous computation in the context of linearized elasticity). Therefore, in terms of energy scaling, needles are competitive compared to branching if \(H\sim \ell \sim L\), and can thus be even energetically preferred, depending on the precise constants.
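The comparison of the two scalings can be made explicit by optimizing the needle energy over the tapering length. The following heuristic sketch ignores all constants, which the scaling laws do not determine, and treats \(\ell \in (0,L]\) as a free parameter:

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}\ell }\left( \frac{\theta ^2H^3}{\ell }+\theta ^2\ell H\right) =-\frac{\theta ^2H^3}{\ell ^2}+\theta ^2H=0 \quad \Longleftrightarrow \quad \ell =H, \qquad \left. \left( \frac{\theta ^2H^3}{\ell }+\theta ^2\ell H\right) \right| _{\ell =H}=2\theta ^2H^2. \end{aligned}$$

Provided \(H\le L\), the elastic part of the needle energy is thus of order \(\theta ^2H^2\), which matches the branching contribution \(\theta ^2H^3/L\) in order of magnitude precisely when \(L\sim H\); the surface term \(\varepsilon L\) is common to both.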

Fig. 1

Sketch of a laminate ending in a needle (left) and of a branching pattern (right). As both patterns are periodic, in the following we focus only on one period. The macrointerface is drawn vertical for simplicity, the true slope has an angle of order \(\delta \theta\) with the vertical

In this note, we continue and complement the recent studies in [17, 19] and consider a martensitic macrotwin where a laminate of two variants of martensite of volume fractions \(\theta\) and \(1-\theta\) meets a rotated martensitic variant at a straight interface (see Fig. 1). The model is vectorial and geometrically nonlinear. The two variants are given by eigenstrains \(U^1\) and \(U^2\), which we can assume to be of the form (see [26, Section 5])

$$\begin{aligned} U^1:=\text {Id}+\delta e_1\otimes e_2\text {\quad and \quad } U^2:=\text {Id}-\delta e_1\otimes e_2 \end{aligned}$$
(2.1)

where \(\delta\) measures the strain; relevant values are given in Table 1. These variants can form a laminate with (vertical) normal \(e_2\), and a macrotwin as in Fig. 1 with a normal vector parallel to \(e_1+\delta \theta e_2\), i.e., a vector rotated by an angle of order \(\delta \theta\) with respect to the (horizontal) vector \(e_1\) (for details see e.g. [18, Lemma 2.1]).

Table 1 Material parameters in three dimensions and after plane-stress reduction (rounded)

To determine the optimal needle geometry within this ansatz, we perform a two-level optimization: Given the boundary curves of the needle, we minimize the elastic energy in the transition layer [see (4.1)]; the result depends only on the shape of the needle (see Fig. 2). The optimal needle structure is then the one for which this minimal elastic energy is itself minimized. The specific boundary conditions and the class of ansatz functions are presented in section “Shape Optimization, Numerical Implementation”. An essential ingredient here is the free energy density, which is discussed in section “Energy Density”.

As mentioned above, we restrict ourselves to the setting of nonlinear elasticity. Although the linear theory has proven useful to explain higher order effects (see e.g. [27] and the references therein), it has been found that approximations in terms of linearized elasticity are not appropriate to understand the qualitative behaviour in terms of the tapering length scale (see [18] for a heuristic as well as a rigorous argument). Roughly speaking, the main factor determining the length scale of the pattern seems to be the change in rotation: Close to the macrotwin interface the relevant rotation is the one which is determined by the other rank-one connection between the martensitic variants, which differs from the one used in the laminates by an angle of order \(\delta \theta\).

Fig. 2

Sketch of the shape-optimization problem in a periodic cell. The thick white curves are the boundaries of the needle which are optimized, together with the length \(\ell\) and the shift \(\Delta\) of the needle

Energy Density

We shall write the energy density \(W^i:\mathbb {R}^{2\times 2}\rightarrow [0,\infty ]\) of the i-th martensitic variant, with eigenstrain \(U^i\) as in (2.1), in terms of the energy density \(W_A\) of the austenite, which we take as reference configuration, as

$$\begin{aligned} W^i(F):=W_A(F(U^i)^{-1}). \end{aligned}$$
(3.1)

Therefore, it suffices to identify a suitable function \(W_A\). We start by listing our requirements on this function, then discuss some proposals from the literature, and finally motivate our choice.

Requirements on \(W_A\). The energy density \(W_A\) needs to satisfy the invariance property

$$\begin{aligned} W_A(QR^TFR)=W_A(F) \text { whenever }\,\, Q\in \textrm{SO}(2),\, F\in \mathbb {R}^{2\times 2}, \,R\in P_A \end{aligned}$$
(3.2)

where \(P_A\subseteq \text {O}(2)\) is the point group of the austenite. In our case \(P_A\) is the cubic group. We require the growth condition

$$\begin{aligned} W_A(F)\rightarrow \infty \text { if either }\det F\rightarrow 0 \,\text { or } |F|\rightarrow \infty \end{aligned}$$
(3.3)

with \(W_A(F)=\infty\) if \(\det F\le 0\), in order to account for non-interpenetrability of matter. The energy density \(W_A\) should be minimized at the identity. After adding an irrelevant constant and using (3.2) this is the same as

$$\begin{aligned} W_A(F)\ge 0, \text { with } W_A(F)=0 \text { if and only if } F\in \textrm{SO}(2). \end{aligned}$$
(3.4)

We expect microstructure to only appear due to the presence of the different phases or variants. In particular, if the material is entirely in the austenitic phase, then energy minimization should not generate spontaneous microstructure or spurious oscillations. Therefore we require \(W_A\) to be polyconvex, in the sense that

$$\begin{aligned} W_A(F)=g(F,\det F) \text { for all }F\in \mathbb {R}^{2\times 2} \end{aligned}$$
(3.5)

for some convex and lower-semicontinuous function \(g:\mathbb {R}^{2\times 2}\times \mathbb {R}\rightarrow [0,\infty ]\). We recall that polyconvexity implies lower semicontinuity of variational problems of the form \(\int _\Omega (W_A(Du) -f\cdot u) \mathrm{{d}}x\), and is therefore crucial in proving existence of minimizers. Further, polyconvexity implies the Legendre–Hadamard condition, which in turn is equivalent to rank-one convexity, i.e., to the fact that the function

$$\begin{aligned} t\mapsto W_A(F+t a\otimes b) \end{aligned}$$
(3.6)

is convex for any \(F\in \mathbb {R}^{2\times 2}\), \(a,b\in \mathbb {R}^2\). We refer to [28] for a mathematical background on these concepts.

Finally, we require \(W_A\) to reproduce the experimentally known elastic constants of austenite around the identity, in the sense that

$$\begin{aligned} W_A(\textrm{Id}+ \epsilon ) = \frac{1}{2} \sum _{ijkl} \mathbb {C}_{ijkl} \epsilon _{ij}\epsilon _{kl} + o(|\epsilon |^2) \text { for } \epsilon \in \mathbb {R}^{2\times 2}_\textrm{sym}, \epsilon \rightarrow 0 \end{aligned}$$
(3.7)

where \(\mathbb {C}\in \mathbb {R}^{2\times 2\times 2\times 2}\) is the elasticity tensor. In the cubic case this reduces to

$$\begin{aligned} W_A(\textrm{Id}+ \epsilon )= & {} \frac{1}{2} \mathbb {C}_{12} (\epsilon _{11}+\epsilon _{22})^2 \nonumber \\{} & {} + 2\mathbb {C}_{44}\epsilon _{12}^2 + \mathbb {C}'(\epsilon _{11}^2+\epsilon _{22}^2)+ o(|\epsilon |^2), \end{aligned}$$
(3.8)

where the elastic constants \(\mathbb {C}_{12}\), \(\mathbb {C}_{44}\) and \(\mathbb {C}':=(\mathbb {C}_{11}-\mathbb {C}_{12})/2\) are all positive in the relevant situation (see Table 1). The cubic anisotropy of the material is normally quantified by the parameter A defined by

$$\begin{aligned} A:=\frac{2\mathbb {C}_{44}}{\mathbb {C}_{11}-\mathbb {C}_{12}}=\frac{\mathbb {C}_{44}}{\mathbb {C}'}, \end{aligned}$$
(3.9)

one can easily check that for isotropic materials \(A=1\).

In summary, our aim is an expression for \(W_A\) which has the invariances stated in (3.2), is minimized on rotations as in (3.4), is polyconvex in the sense of (3.5), linearizes to (3.8) around the identity, and has sufficient growth and regularity to permit easy and stable computations.

Some candidates from the literature. One often uses a linear elastic energy of the form \(\frac{1}{2} \sum _{ijkl}\mathbb {C}_{ijkl} \epsilon _{ij}\epsilon _{kl}\) evaluated on the nonlinear elastic strain \(\frac{1}{2} (F^TF-\textrm{Id})\), resulting in expressions of the type

$$\begin{aligned} W_\mathbb {C}(F):=\frac{1}{8} \sum _{ijkl} \mathbb {C}_{ijkl} (F^TF-\textrm{Id})_{ij} (F^TF-\textrm{Id})_{kl}, \end{aligned}$$
(3.10)

see for example [29, Eq. (40)] or [30, Section 2]. However, these expressions are not rank-one convex, and therefore not polyconvex. To see this, we fix any unit vector b and consider the rank-one line

$$\begin{aligned} t\mapsto F_t:=\textrm{Id}+t b\otimes b. \end{aligned}$$
(3.11)

One computes \(F_t^TF_t-\textrm{Id}=(t^2+2t) b\otimes b\), so that (3.10) gives

$$\begin{aligned} W_\mathbb {C}(F_t)=\frac{1}{8} (t^2+2t)^2 \mathbb {C}b\otimes b\otimes b\otimes b. \end{aligned}$$
(3.12)

Since \(\mathbb {C}\) defines a quadratic form which is strictly positive on nonzero symmetric matrices, \(\mathbb {C}b\otimes b\otimes b\otimes b>0\). However, the polynomial \(p(t):=(t^2+2t)^2 = (t+2)^2 t^2\) is not convex. Therefore \(W_\mathbb {C}\) is not rank-one convex, and not polyconvex, for any sensible choice of the elastic coefficients \(\mathbb {C}\). One expects that this lack of convexity leads to numerical instabilities if this part of the space of strains (which is relatively far from the identity, as one can see from \(\det F_{-1/2}=1/2\)) is explored in a numerical simulation; apparently this was not the case in the cited papers.
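The lack of convexity of \(p\) can be checked by hand or with a few lines of code. The following minimal sketch evaluates the exact second derivative \(p''(t)=12t^2+24t+8\) at the point \(t=-1/2\) mentioned above and performs a midpoint test; the interval endpoints are arbitrary illustrative choices.

```python
def p(t):
    """Scalar profile p(t) = (t^2 + 2t)^2 appearing in (3.12)."""
    return (t * t + 2.0 * t) ** 2


def p_second_derivative(t):
    """Exact second derivative of p(t) = t^4 + 4 t^3 + 4 t^2."""
    return 12.0 * t * t + 24.0 * t + 8.0


t_probe = -0.5                      # the point mentioned in the text
det_at_probe = 1.0 + t_probe        # det(Id + t b b^T) = 1 + t for a unit vector b
curvature_at_probe = p_second_derivative(t_probe)

# Midpoint test around t_probe: a convex function would give a value <= 0.
left, right = -0.9, -0.1            # illustrative interval (assumption)
midpoint_excess = p(0.5 * (left + right)) - 0.5 * (p(left) + p(right))

print(det_at_probe, curvature_at_probe, midpoint_excess)
```

A negative curvature together with a positive midpoint excess confirms that \(p\) fails to be convex on this interval, where the determinant is still well away from zero.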

A polyconvex expression with cubic symmetry was proposed by Kambouchev et al. in [31, Eq. (9)]. In 3D, they define

$$\begin{aligned} W_\textrm{KFR}^{(3)}(F) := & -2\alpha _1\ln \det F +\alpha _2(\det F)^2 \\ & +\alpha _3\sum _{i=1}^3|Fe_i|^4 +\alpha _4 |F|^2 \end{aligned}$$
(3.13)

and show that if the four coefficients are nonnegative then \(W_\textrm{KFR}^{(3)}\) is polyconvex. Condition (3.4) implies \(-\alpha _1+\alpha _2+2\alpha _3+\alpha _4=0\). Comparing with the cubic linearization (3.8) they obtain \(\alpha _3=(\mathbb {C}'-\mathbb {C}_{44})/4\) and \(\alpha _4=(2\mathbb {C}_{44}-\mathbb {C}')/2\), which are nonnegative only if the anisotropy ratio fulfills

$$\begin{aligned} \frac{1}{2} \le A \le 1 \end{aligned}$$
(3.14)

(see [31, Eq. (14)]). This condition is fulfilled by many metals but not by the shape-memory alloys of interest here, which have larger values of A (see Table 1). In two dimensions, one would equivalently write

$$\begin{aligned} W_\textrm{KFR}^{(2)}(F) := & -2\alpha _1\ln \det F +\alpha _2(\det F)^2 \\ & +\alpha _3 \big(|Fe_1|^4+|Fe_2|^4\big) +\alpha _4 |F|^2 \end{aligned}$$
(3.15)

with similar results. We show that the relevant condition \(A\le 1\) is indeed necessary for polyconvexity of (3.15). It suffices to consider matrices of the form \(F_t=\textrm{Id}+t e_2\otimes e_1\), which give \(W_\textrm{KFR}^{(2)}(F_t)=\alpha _2+\alpha _3(1+(1+t^2)^2) +\alpha _4(2+t^2)\). For large t the leading-order term is \(\alpha _3t^4\), which is convex only if \(\alpha _3\ge 0\). Therefore the condition \(\alpha _3\ge 0\), which implies \(A\le 1\), is necessary for rank-one convexity and, hence, for polyconvexity.

A variant of this formula was used in [17], precisely,

$$\begin{aligned}W_{2020}(F) := & \,\, a_1 \, |F|^4 + a_2 \, (\det F)^2 - 2a_3 \, \log (\det F) \\ & + a_4 \, \big (|Fe_1|^4+|Fe_2|^4\big ).\end{aligned}$$
(3.16)

The change was that the first term is fourth-order and not quadratic, so that it matches the order of the second and the fourth one. The expression in (3.16) is obviously polyconvex if the four coefficients are nonnegative. Even more, one can show that it is polyconvex provided \(a_1\ge 0\), \(a_2\ge 0\), \(a_3\ge 0\), and \(a_4\ge -\frac{2}{3} a_1\). To see this, it suffices to first check (for example, computing the Hessian) that the function \(g:\mathbb {R}^2\rightarrow \mathbb {R}\), \(g(x,y):=x^4+y^4+6x^2y^2\), is convex, and then to observe that by the chain rule this implies convexity of \(F\mapsto g(|Fe_1|,|Fe_2|)=3|F|^4- 2|Fe_1|^4-2|Fe_2|^4\).
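The convexity check mentioned above can be written out explicitly; the following is a direct computation added for convenience, not a quotation from [17]:

$$\begin{aligned} D^2g(x,y)=12\begin{pmatrix} x^2+y^2 &{} 2xy\\ 2xy &{} x^2+y^2 \end{pmatrix}, \qquad \textrm{tr}\, D^2g=24(x^2+y^2)\ge 0, \qquad \det D^2g=144\,(x^2-y^2)^2\ge 0, \end{aligned}$$

so that \(D^2g\) is positive semidefinite. Since \(\partial _xg=4x^3+12xy^2\) and \(\partial _yg=4y^3+12x^2y\) are nonnegative on \([0,\infty )^2\), the function g is also nondecreasing in each argument there, which is what allows composition with the convex maps \(F\mapsto |Fe_i|\). The stated range of coefficients then follows from the identity

$$\begin{aligned} a_1|F|^4+a_4\big (|Fe_1|^4+|Fe_2|^4\big )=\Big (a_1+\frac{3}{2} a_4\Big )|F|^4-\frac{a_4}{2}\, g(|Fe_1|,|Fe_2|), \end{aligned}$$

in which both coefficients on the right-hand side are nonnegative whenever \(-\frac{2}{3} a_1\le a_4\le 0\).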

The difference from (3.15) to (3.16) however improves the range of admissible values of A. In fact, condition (3.4) implies \(4a_1+a_2-a_3+2a_4=0\). Linearizing (3.16), we obtain that (3.8) is equivalent to

$$\begin{aligned} \begin{pmatrix} \mathbb {C}_{44}\\ \mathbb {C}'\\ \mathbb {C}_{12} \end{pmatrix} = 4\begin{pmatrix} 2a_1+a_4\\ 2a_1+2a_4 \\ 2a_1+a_2 \end{pmatrix}, \end{aligned}$$
(3.17)

which readily shows that the largest value of A that can be attained in the polyconvex range is \(A=2\). This improvement over (3.14) is not sufficient for the present purposes. Indeed, in [17], using values for the elastic constants appropriate for NiAl listed in Table 1, the coefficients \(a_1=11.56\mathrm {\,GPa}\), \(a_2=-17.44\mathrm {\,GPa}\), \(a_3=10.04\mathrm {\,GPa}\), \(a_4=-9.38\mathrm {\,GPa}\) were obtained (up to an irrelevant global factor of 2). Clearly they are not in the range in which the expression above is guaranteed to be polyconvex. In the numerical computations presented in [17], as in the case of the papers based on (3.10), this did not cause any difficulty.
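As a quick plausibility check, one can insert the coefficients quoted above into (3.17) and into the stress-free condition. The sketch below does this with plain arithmetic; the function name is a hypothetical helper, and the resulting elastic constants carry the global factor of 2 mentioned in the text, which does not affect the ratio A.

```python
def elastic_constants_from_a(a1, a2, a4):
    """Evaluate (3.17): returns (C44, C', C12) for given coefficients of (3.16)."""
    c44 = 4.0 * (2.0 * a1 + a4)
    c_prime = 4.0 * (2.0 * a1 + 2.0 * a4)
    c12 = 4.0 * (2.0 * a1 + a2)
    return c44, c_prime, c12


# Coefficients quoted in the text for NiAl (in GPa, up to a global factor of 2).
a1, a2, a3, a4 = 11.56, -17.44, 10.04, -9.38

c44, c_prime, c12 = elastic_constants_from_a(a1, a2, a4)
anisotropy = c44 / c_prime                       # A as in (3.9)
stress_free = 4.0 * a1 + a2 - a3 + 2.0 * a4      # should vanish by (3.4)
polyconvexity_guaranteed = (a1 >= 0 and a2 >= 0 and a3 >= 0
                            and a4 >= -2.0 / 3.0 * a1)

print(anisotropy, stress_free, polyconvexity_guaranteed)
```

The anisotropy ratio obtained this way exceeds 2, consistent with the observation that these coefficients lie outside the range in which polyconvexity is guaranteed.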

In closing this review we show that the expression in (3.16) is not rank-one convex whenever \(a_4<-\frac{2}{3} a_1\), which is the same as \(A>2\). Consider for some \(\beta >0\) the rank-one line

$$\begin{aligned} F_t= \begin{pmatrix} 1/\beta &{} -1/\beta \\ 0 &{} \beta \end{pmatrix} \left( \textrm{Id}+ t \begin{pmatrix} 1\\ -1 \end{pmatrix}\otimes \begin{pmatrix} 1&1 \end{pmatrix}\right) = \begin{pmatrix} (1+2t)/\beta &{} (2t-1)/\beta \\ -\beta t &{} \beta (1-t) \end{pmatrix}. \end{aligned}$$
(3.18)

Clearly \(\det F_t=1\) for all t and \(\beta\), so that the two terms depending on the determinant are irrelevant, and \(W_{2020}(F_t)\) is a fourth-order polynomial in t. One computes

$$\begin{aligned} \left. \frac{d^2}{dt^2}W_{2020}(F_t)\right| _{t=0}= \frac{64a_1+96a_4}{\beta ^4}+48a_1+56a_4+(16a_1+12a_4)\beta ^4. \end{aligned}$$
(3.19)

If \(a_4<-\frac{2}{3}a_1\) then the first coefficient is negative, and choosing \(\beta\) sufficiently small leads to a negative second derivative in a rank-one direction.
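Formula (3.19) can be cross-checked numerically. The sketch below compares it with a central finite difference of \(t\mapsto W_{2020}(F_t)\) along the rank-one line (3.18), using the NiAl coefficients \(a_1\), \(a_4\) quoted above; the value \(\beta =0.5\) and the step size are illustrative assumptions, and the determinant terms are omitted since \(\det F_t=1\).

```python
def w_quartic(F, a1, a4):
    """a1 |F|^4 + a4 (|F e1|^4 + |F e2|^4); determinant terms of (3.16) omitted."""
    (f11, f12), (f21, f22) = F
    col1 = f11 * f11 + f21 * f21      # |F e1|^2
    col2 = f12 * f12 + f22 * f22      # |F e2|^2
    return a1 * (col1 + col2) ** 2 + a4 * (col1 ** 2 + col2 ** 2)


def rank_one_line(t, beta):
    """The matrix F_t from (3.18)."""
    return (((1.0 + 2.0 * t) / beta, (2.0 * t - 1.0) / beta),
            (-beta * t, beta * (1.0 - t)))


def second_derivative_fd(a1, a4, beta, step=1e-3):
    """Central finite difference of t -> W(F_t) at t = 0."""
    def w(t):
        return w_quartic(rank_one_line(t, beta), a1, a4)
    return (w(step) - 2.0 * w(0.0) + w(-step)) / step ** 2


def second_derivative_formula(a1, a4, beta):
    """Right-hand side of (3.19)."""
    return ((64.0 * a1 + 96.0 * a4) / beta ** 4
            + 48.0 * a1 + 56.0 * a4
            + (16.0 * a1 + 12.0 * a4) * beta ** 4)


a1, a4 = 11.56, -9.38     # GPa, values quoted in the text
beta = 0.5                # illustrative choice (assumption)
fd_value = second_derivative_fd(a1, a4, beta)
formula_value = second_derivative_formula(a1, a4, beta)
print(fd_value, formula_value)
```

Both values are negative and agree up to the discretization error of the finite difference, which illustrates the loss of rank-one convexity for these coefficients.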

Construction of \(W_A\). In [19] we developed a new expression that remains polyconvex for arbitrarily large values of the cubic anisotropy parameter A. It depends on four independent small parameters \(\rho _1\), \(\rho _2\), \(\rho _3\), \(\rho _4\in (0,1)\) via the functions \(h_\rho :[0,\infty )\rightarrow \mathbb {R}\),

$$\begin{aligned} h_\rho (x):={\left\{ \begin{array}{ll} 0, &{} \text { if } x<1-\rho ,\\ (x-(1-\rho ))^2, &{} \text { if } x\ge 1-\rho ; \end{array}\right. } \end{aligned}$$
(3.20)

\(g_\rho :[0,\infty )^2\rightarrow \mathbb {R}\),

$$\begin{aligned} g_\rho (x,y):={\left\{ \begin{array}{ll} (1-(1-\rho )^2) x^2, &{} \text { if } y< (1-\rho ) x,\\ x^2+y^2-2(1-\rho ) xy, &{} \text { if } (1-\rho ) x \le y \le \frac{1}{1-\rho } x,\\ (1-(1-\rho )^2) y^2, &{} \text { if } y>\frac{1}{1-\rho }x; \end{array}\right. } \end{aligned}$$
(3.21)

and \(f_\rho :\mathbb {R}\rightarrow [0,\infty ]\),

$$\begin{aligned} f_\rho (x):={\left\{ \begin{array}{ll} \infty , &{} \text { if } x\le 0,\\ (\log (\rho /x))^2, &{}\text { if } 0<x<\rho ,\\ 0, &{} \text { if } x\ge \rho . \end{array}\right. } \end{aligned}$$
(3.22)

One can easily check that all three functions are convex, continuously differentiable (whenever they are finite), lower semicontinuous, and that the first two are nondecreasing. Finally, for any \(\rho \in (0,1)^4\) and \(b\in (0,\infty )^4\) we set

$$\begin{aligned} \begin{aligned} W_A(F):=&b_1\left( \frac{1}{2} g_{\rho _1}\Bigl (\Bigl |\frac{Fe_1+Fe_2}{\sqrt{2}}\Bigr |,\Bigl |\frac{Fe_1-Fe_2}{\sqrt{2}}\Bigr |\Bigr )-\rho _1 \det F \right) \\&+b_2\left( 2\rho _2(1-\rho _2)+ h_{\rho _2}(|Fe_1|)+h_{\rho _2}(|Fe_2|)-2\rho _2 \det F\right) \\&+b_3 \left( 2\rho _3(1-\rho _3)+2 h_{\rho _3}(|F|/\sqrt{2})-2\rho _3 \det F\right) \\&+b_4f_{\rho _4}(\det F). \end{aligned} \end{aligned}$$
(3.23)

This function is polyconvex, has the desired invariance and minimizers. Computing the Taylor series to second order and comparing to (3.8) one can see that

$$\begin{aligned} \begin{pmatrix} \mathbb {C}_{44}\\ \mathbb {C}'\\ \mathbb {C}_{12} \end{pmatrix} = (\textrm{Id}+ M_\rho ) \begin{pmatrix} b_1\\ b_2\\ b_3 \end{pmatrix}, \text { where } M_\rho :=\begin{pmatrix} 0&{}2\rho _2 &{} 2\rho _3\\ \rho _1 &{} \rho _2 &{} 2\rho _3\\ -\rho _1 &{} -2\rho _2 &{} -3\rho _3 \end{pmatrix}. \end{aligned}$$
(3.24)

The matrix \(\textrm{Id}+ M_\rho\) is, for \(\rho\) small, invertible. Further, for any positive value of the three elastic constants one obtains that if the \(\rho\)’s are chosen sufficiently small then the resulting b’s are nonnegative, and \(W_A\) is polyconvex. The smallness of \(\rho\) depends, however, on \(\mathbb {C}\). Examples are given in Table 2 below.
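To make the construction concrete, the following minimal sketch implements (3.20)–(3.23) with the standard library only, solves the linear system (3.24) for \(b_1,b_2,b_3\), and compares \(W_A(\textrm{Id}+\epsilon )\) with the quadratic form (3.8) for a small strain. The elastic constants, the values of \(\rho\) and the choice \(b_4=1\) are hypothetical illustration values, not the entries of Table 1 or Table 2.

```python
import math


def h(rho, x):
    """h_rho from (3.20)."""
    return 0.0 if x < 1.0 - rho else (x - (1.0 - rho)) ** 2


def g(rho, x, y):
    """g_rho from (3.21)."""
    c = 1.0 - rho
    if y < c * x:
        return (1.0 - c * c) * x * x
    if y > x / c:
        return (1.0 - c * c) * y * y
    return x * x + y * y - 2.0 * c * x * y


def f(rho, x):
    """f_rho from (3.22)."""
    if x <= 0.0:
        return math.inf
    return math.log(rho / x) ** 2 if x < rho else 0.0


def w_a(F, b, rho):
    """W_A from (3.23) for F = ((F11, F12), (F21, F22))."""
    (f11, f12), (f21, f22) = F
    det = f11 * f22 - f12 * f21
    n1 = math.hypot(f11, f21)                                     # |F e1|
    n2 = math.hypot(f12, f22)                                     # |F e2|
    n_plus = math.hypot(f11 + f12, f21 + f22) / math.sqrt(2.0)    # |F(e1+e2)|/sqrt(2)
    n_minus = math.hypot(f11 - f12, f21 - f22) / math.sqrt(2.0)   # |F(e1-e2)|/sqrt(2)
    n_frob = math.hypot(n1, n2)                                   # |F|
    r1, r2, r3, r4 = rho
    return (b[0] * (0.5 * g(r1, n_plus, n_minus) - r1 * det)
            + b[1] * (2 * r2 * (1 - r2) + h(r2, n1) + h(r2, n2) - 2 * r2 * det)
            + b[2] * (2 * r3 * (1 - r3) + 2 * h(r3, n_frob / math.sqrt(2.0)) - 2 * r3 * det)
            + b[3] * f(r4, det))


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_b(c44, c_prime, c12, rho):
    """Solve (Id + M_rho) b = (C44, C', C12) from (3.24) by Cramer's rule."""
    r1, r2, r3 = rho[0], rho[1], rho[2]
    mat = [[1.0, 2 * r2, 2 * r3],
           [r1, 1.0 + r2, 2 * r3],
           [-r1, -2 * r2, 1.0 - 3 * r3]]
    rhs = [c44, c_prime, c12]
    d = det3(mat)
    sol = []
    for k in range(3):
        mk = [row[:] for row in mat]
        for i in range(3):
            mk[i][k] = rhs[i]
        sol.append(det3(mk) / d)
    return sol


rho = (0.1, 0.1, 0.1, 0.5)              # hypothetical parameters
c44, c_prime, c12 = 3.0, 1.0, 2.0        # hypothetical constants with A = 3
b = solve_b(c44, c_prime, c12, rho) + [1.0]   # b4 does not enter the linearization

s = 1e-4                                 # size of the test strain
e11, e22, e12 = 0.3 * s, -0.1 * s, 0.2 * s
energy_small_strain = w_a(((1.0 + e11, e12), (e12, 1.0 + e22)), b, rho)
quadratic_target = (0.5 * c12 * (e11 + e22) ** 2 + 2.0 * c44 * e12 ** 2
                    + c_prime * (e11 ** 2 + e22 ** 2))

angle = 0.3
rotation = ((math.cos(angle), -math.sin(angle)), (math.sin(angle), math.cos(angle)))
energy_rotation = w_a(rotation, b, rho)
print(b, energy_small_strain, quadratic_target, energy_rotation)
```

In this example the anisotropy ratio is 3, outside the range admissible for (3.15) and (3.16), while the computed coefficients b remain positive; the energy vanishes on rotations up to rounding.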

Table 2 Choices of admissible parameters entering the energy (3.23) for the materials of interest here

Shape Optimization, Numerical Implementation

Parametrization of Subdomains

In what follows we will decompose the energy in the two phase components with fixed topology using the domain decomposition approach developed in [17, 19]. To this end we consider a physical reference configuration \(\Omega = \Omega ^1\cup \Omega ^2\) where the actual domains of the martensitic variants \(\Omega ^1\), \(\Omega ^2\) are parametrized over subdomains \(\hat{\Omega }^1\), \(\hat{\Omega }^2\) of a computational reference configuration \(\hat{\Omega }\) via a parameter-dependent bijective transformation \(\psi [\alpha ]:\hat{\Omega }\rightarrow \Omega\) with \(\Omega ^i = \psi [\alpha ](\hat{\Omega }^i)\), where \(\alpha\) is a small set of parameters (Fig. 3).

Fig. 3

The computational domain \(\hat{\Omega }\) (left) is transformed by \(\psi [\alpha ]\) into the physical reference domain \(\Omega\) (middle), which is then deformed elastically by \(\phi\) to the deformed configuration \(\phi (\Omega )\) (right)

Given the two eigenstrains \(U^1\) and \(U^2\) [see (2.1)], the elastic energy decomposes correspondingly

$$\begin{aligned} {\mathcal E}[\phi ,\Omega ^1, \Omega ^2] :=\int _{\Omega ^1} W^1(D\phi ) \textrm{d}x + \int _{\Omega ^2} W^2(D\phi ) \textrm{d}x, \end{aligned}$$
(4.1)

with \(W^i\) defined by (3.1) and (3.23).

Given the parametrization \(\psi [\alpha ]\) of the physical domains the energy can be rewritten as

$$\begin{aligned} \hat{\mathcal {E}}[\alpha ,\hat{\phi }]&:=\mathcal E[\phi , \psi [\alpha ](\hat{\Omega }^1),\psi [\alpha ](\hat{\Omega }^2)] \nonumber \\&= \sum _{i=1,2} \int _{\psi [\alpha ](\hat{\Omega }^i)} W^i({D}\phi ) \textrm{d}x \nonumber \\&= \sum _{i=1,2} \int _{\hat{\Omega }^i} W^i \left( \hat{{D}}\hat{\phi }(\hat{x}) \left( \hat{{D}}\psi [\alpha ](\hat{x})\right) ^{-1} \right) \det \hat{{D}}\psi [\alpha ](\hat{x}) \textrm{d}\hat{x}, \end{aligned}$$
(4.2)

where \({\hat{x}}\) denotes the computational reference coordinates, related to the physical ones by the change of variables \(x = \psi [\alpha ](\hat{x})\), \(\hat{D}\) is the Jacobian with respect to \({\hat{x}}\), and \(\hat{\phi } = \phi \circ \psi [\alpha ]\) is the elastic deformation expressed in reference coordinates. To transform the gradient we have applied the chain rule \(\hat{{D}}\hat{\phi }(\hat{x}) = {D}\phi (\psi [\alpha ](\hat{x})) \hat{D}\psi [\alpha ](\hat{x})\).
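The change of variables in (4.2) can be illustrated in the simplest possible setting, where both \(\psi\) and \(\phi\) are affine and \(\hat{\Omega }\) is the unit square, so that all integrands are constant. In the sketch below the matrices and the energy density are hypothetical stand-ins chosen only for illustration; in particular `toy_density` is not the density (3.23).

```python
def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def inverse(a):
    d = det(a)
    return ((a[1][1] / d, -a[0][1] / d), (-a[1][0] / d, a[0][0] / d))


def toy_density(F):
    """Hypothetical stand-in for W^i: squared distance of F to the identity."""
    return sum((F[i][j] - (1.0 if i == j else 0.0)) ** 2
               for i in range(2) for j in range(2))


def polygon_area(points):
    """Shoelace formula for a simple polygon."""
    n = len(points)
    return 0.5 * abs(sum(points[i][0] * points[(i + 1) % n][1]
                         - points[(i + 1) % n][0] * points[i][1] for i in range(n)))


d_psi = ((1.5, 0.2), (0.0, 0.8))    # gradient of an affine psi (assumption)
d_phi = ((1.0, 0.1), (0.0, 1.0))    # gradient of an affine phi in physical coordinates
d_phi_hat = matmul(d_phi, d_psi)    # chain rule: D^ phi^ = D phi  D^ psi

# Physical side: constant density times the area of psi(unit square).
corners_hat = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
corners = [(d_psi[0][0] * x + d_psi[0][1] * y, d_psi[1][0] * x + d_psi[1][1] * y)
           for x, y in corners_hat]
energy_physical = toy_density(d_phi) * polygon_area(corners)

# Reference side: pulled-back integrand of (4.2) times the area of the unit square.
energy_reference = toy_density(matmul(d_phi_hat, inverse(d_psi))) * det(d_psi) * 1.0
print(energy_physical, energy_reference)
```

Both evaluations give the same number, as they must; for non-affine maps the same identity holds pointwise under the integral.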

We minimize \(\hat{\mathcal {E}}\) with respect to \(\alpha\) and with respect to \(\hat{\phi }\). Minimizing over \(\hat{\phi }\) for fixed \(\alpha\) and thus for fixed geometry reflects the solution of the elastic state equation in reference coordinates \({\hat{x}}\). Let us denote the minimizer by \(\hat{\phi }[\alpha ]\). Minimizing

$$\begin{aligned} \mathcal {J}[\alpha ] :=\hat{\mathcal {E}}[\alpha ,\hat{\phi }[\alpha ]] \end{aligned}$$
(4.3)

over \(\alpha\) then amounts to solving the actual free boundary problem. Thus, in the presented form the free boundary problem can be considered as a joint minimization with respect to \(\alpha\) and \(\hat{\phi }\). The solution of this elastic shape optimization problem is described below in section “Optimization”.
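The two-level structure of (4.3) can be illustrated with a scalar toy problem. In the sketch below the energy, the fixed-step gradient descent and all numerical parameters are hypothetical choices for illustration only and are not the method used for the actual computations. The inner level solves the state equation by Newton's method; because the state is a stationary point of the same energy, the derivative of the reduced functional reduces to the explicit derivative with respect to \(\alpha\).

```python
def energy(alpha, u):
    """Hypothetical toy energy, strictly convex in the state u for every alpha."""
    return 0.25 * u ** 4 + 0.5 * u ** 2 - alpha * u + 0.5 * (alpha - 1.0) ** 2


def solve_state(alpha, tol=1e-14, max_iter=50):
    """Inner level: Newton's method for d/du energy = u^3 + u - alpha = 0."""
    u = 0.0
    for _ in range(max_iter):
        residual = u ** 3 + u - alpha
        if abs(residual) < tol:
            break
        u -= residual / (3.0 * u ** 2 + 1.0)
    return u


def reduced_functional(alpha):
    """J[alpha] = energy(alpha, u[alpha]), the analogue of (4.3)."""
    return energy(alpha, solve_state(alpha))


def reduced_gradient(alpha):
    """d/d alpha of J: only the explicit alpha-derivative of the energy remains."""
    return (alpha - 1.0) - solve_state(alpha)


# Outer level: plain gradient descent with a fixed step (illustrative choice).
alpha = 0.0
for _ in range(200):
    alpha -= 0.5 * reduced_gradient(alpha)
u_opt = solve_state(alpha)

# Finite-difference check of the reduced gradient at an arbitrary point.
probe, step = 0.7, 1e-5
fd_gradient = (reduced_functional(probe + step) - reduced_functional(probe - step)) / (2.0 * step)
print(alpha, u_opt, fd_gradient, reduced_gradient(probe))
```

For this toy energy the joint minimizer can be computed by hand, \(\alpha =2\) and \(u=1\), which the iteration reproduces.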

The physical domain parametrization \(\psi [\alpha ]\) primarily parametrizes the geometry of the phase boundaries. The parametrization of the interior of the phases can be chosen freely as long as the mapping \(\psi [\alpha ]\) is bijective and for implementational purposes sufficiently regular. In what follows, we explicitly describe \(\psi [\alpha ]\) on \(\partial \hat{\Omega }^1\) which in particular also determines the transformation of \(\partial \hat{\Omega }^2\).

In the computational reference configuration \(\partial \hat{\Omega }^1\) is assumed to be polygonal. For each edge \({\hat{e}}\) from \(\hat{a}=(\hat{a}_1,\hat{a}_2)\) to \(\hat{b}=(\hat{b}_1,\hat{b}_2)\) we consider \(\psi [\alpha ]\vert _{{\hat{e}}}\) as a graph over \({\hat{x}}_1\) (which reflects that the needles are horizontally stretched).

Using quadratic polynomials for the parametrization of the upper and lower edge of the needle tip we obtain

$$\begin{aligned} \left. \psi [\alpha ]\right| _{{\hat{e}}} (\hat{x}_1,\hat{x}_2) = a + (b-a) \frac{\hat{x}_1-\hat{a}_1}{\hat{b}_1-\hat{a}_1} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} \kappa (\hat{x}_1-\hat{a}_1)(\hat{x}_1-\hat{b}_1)\,. \end{aligned}$$

The above parameters a and b are given functions of the needle length \(\ell\) and the shift parameter \(\Delta\). Besides \(\ell\) and \(\Delta\) the parameter vector \(\alpha\) includes the parameters \(\kappa\) for the upper and the lower edge.
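A direct transcription of the edge parametrization may help to fix ideas. In the following sketch the function name and all numerical values are hypothetical; in particular, the dependence of a and b on \(\ell\) and \(\Delta\) is not modelled, and the endpoints are simply prescribed.

```python
def edge_map(x1_hat, a_hat, b_hat, a, b, kappa):
    """Image of the point with first coordinate x1_hat on the reference edge a_hat -> b_hat.

    Affine interpolation between the physical endpoints a and b plus a quadratic
    bulge in the vertical direction that vanishes at both endpoints.
    """
    s = (x1_hat - a_hat[0]) / (b_hat[0] - a_hat[0])
    bulge = kappa * (x1_hat - a_hat[0]) * (x1_hat - b_hat[0])
    return (a[0] + (b[0] - a[0]) * s,
            a[1] + (b[1] - a[1]) * s + bulge)


# Hypothetical data for one edge of the needle boundary.
a_hat, b_hat = (0.0, 0.5), (1.0, 0.5)     # reference endpoints
a, b = (0.0, 0.5), (1.5, 0.55)            # physical endpoints
kappa = -0.2                              # curvature parameter of this edge

start = edge_map(a_hat[0], a_hat, b_hat, a, b, kappa)
end = edge_map(b_hat[0], a_hat, b_hat, a, b, kappa)
middle = edge_map(0.5, a_hat, b_hat, a, b, kappa)
print(start, end, middle)
```

The endpoints are mapped to a and b independently of \(\kappa\), so that adjacent edges remain connected while \(\kappa\) only bends the edge in between.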

To define \(\psi [\alpha ]\) not only on \(\partial \hat{\Omega }^1\) and \(\partial \hat{\Omega }\) but also in the interior of the phases (and thus on the whole domain \(\hat{\Omega }\)) we use a harmonic extension approach. We define \(\psi [\alpha ]: \hat{\Omega }\rightarrow \Omega\) by

$$\begin{aligned} \psi [\alpha ] = \mathop {\mathrm {arg\,min}}\limits _{\zeta \in \mathcal {D}[\alpha ]} \int _{\hat{\Omega }} \vert D \zeta \vert ^2 \textrm{d}x, \end{aligned}$$
(4.4)

where \(\mathcal {D}[\alpha ] = \big \{\zeta : \hat{\Omega }\rightarrow \Omega \,\big \vert \, \zeta \vert _{{\hat{e}}} = \psi [\alpha ]\vert _{{\hat{e}}} \text{ on all edges } {\hat{e}} \subset \partial \hat{\Omega }^1 \big \}\).

If \(\psi [\alpha ] - \textrm{Id}\) is small and sufficiently regular on \(\partial \hat{\Omega }^1\), the extended \(\psi [\alpha ]\) is ensured to be bijective.

Finite Element Discretization

We consider piecewise affine finite elements on the computational domain \(\hat{\Omega }\), where \(\partial \hat{\Omega }^1\) is composed of edges of the triangulation \(\mathcal {T}_h\) with vertices \({\hat{x}}_h\), the index h indicating the grid size. The corresponding (vector valued) finite element space is denoted by \(\mathcal {V}_h\).

The extension (4.4) of the phase boundary parametrization on \(\partial \hat{\Omega }^1\) then reads as follows:

$$\begin{aligned} \psi _h[\alpha ] = \mathop {\mathrm {arg\,min}}\limits _{\zeta _h \in \mathcal {D}_h[\alpha ]} \int _{\hat{\Omega }} \vert D \zeta _h \vert ^2 \textrm{d}x, \end{aligned}$$
(4.5)

where \(\mathcal {D}_h[\alpha ] = \{\zeta _h \in \mathcal {V}_h \,\vert \, \zeta _h ({\hat{x}}_h) = \psi [\alpha ] ({\hat{x}}_h) \text{ for } \text{ all } \text{ vertices } {\hat{x}}_h \text{ on } \partial \hat{\Omega }^1 \}\). The solution operator of (4.5) is represented by a fixed matrix mapping vertex values on vertices \({\hat{x}}_h\) on \(\partial \hat{\Omega }^1\) to values on all vertices of the triangulation \(\mathcal {T}_h\). This matrix can be straightforwardly computed based on the finite element stiffness matrix associated to Poisson’s problem.
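How such an extension matrix arises from the stiffness matrix can be seen in a one-dimensional analogue, which is used here only because it fits in a few lines; the two-dimensional case has the same block structure. With K the stiffness matrix, I the free vertices and B the constrained ones, the free values are \(-K_{II}^{-1}K_{IB}\) applied to the constrained values. All names in the sketch are hypothetical, and dense Gaussian elimination replaces the sparse solvers one would use in practice; for a vector-valued map the matrix is applied to each component.

```python
def stiffness_1d(n_vertices, h):
    """P1 stiffness matrix of -u'' on a uniform 1D mesh (dense, for illustration)."""
    K = [[0.0] * n_vertices for _ in range(n_vertices)]
    for e in range(n_vertices - 1):           # element between vertices e and e+1
        K[e][e] += 1.0 / h
        K[e + 1][e + 1] += 1.0 / h
        K[e][e + 1] -= 1.0 / h
        K[e + 1][e] -= 1.0 / h
    return K


def solve_dense(A, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def extension_matrix(K, boundary):
    """Matrix mapping values on the constrained vertices to values on all vertices."""
    n = len(K)
    interior = [i for i in range(n) if i not in boundary]
    K_ii = [[K[i][j] for j in interior] for i in interior]
    E = [[0.0] * len(boundary) for _ in range(n)]
    for col, j in enumerate(boundary):
        E[j][col] = 1.0                                       # constrained values are kept
        x = solve_dense(K_ii, [-K[i][j] for i in interior])   # K_II x = -K_Ij
        for i, value in zip(interior, x):
            E[i][col] = value
    return E


n_vertices = 6
K = stiffness_1d(n_vertices, h=1.0 / (n_vertices - 1))
E = extension_matrix(K, boundary=[0, n_vertices - 1])
boundary_values = [2.0, -1.0]
extended = [sum(E[i][c] * boundary_values[c] for c in range(2)) for i in range(n_vertices)]
print(extended)
```

In one dimension the discrete harmonic extension is the linear interpolant of the two prescribed values, which gives a simple correctness check. The matrix E depends only on the mesh and can therefore be computed once and reused for every \(\alpha\).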

This turns the Euler–Lagrange-equation of

$$\begin{aligned} \hat{\mathcal {E}}_h[\alpha ,\hat{\phi }_h] = \sum _{i=1,2} \int _{\hat{\Omega }^i} {{\,\textrm{W}\,}}^i \left( \hat{{D}}\hat{\phi }_h(\hat{x}) \left( \hat{{D}}\psi _h[\alpha ](\hat{x})\right) ^{-1} \right) \det \hat{{D}}\psi _h[\alpha ](\hat{x}) \textrm{d}\hat{x} \end{aligned}$$
(4.6)

into a standard conforming finite element discretization of the Euler–Lagrange-equation of (4.2) and requires the solution of a nonlinear system of equations in \(\mathcal {V}_h\), performed here via Newton’s method. For fixed \(\alpha\), the minimizer of \(\hat{\mathcal {E}}_h[\alpha ,\cdot ]\) is then denoted by \(\hat{\phi }_h[\alpha ]\) and an evaluation of \(\hat{\mathcal {E}}_h\) then immediately yields the value of the discrete shape functional \(\mathcal {J}_h[\alpha ] :=\hat{\mathcal {E}}_h[\alpha ,\hat{\phi }_h[\alpha ]]\).

Optimization

So far, the free boundary problem is described as an optimization problem of the parametrization \(\psi [\alpha ]\) of the phase boundary as a function of the parameter vector \(\alpha\) subject to the constraint that \(\hat{\phi }\) minimizes the elastic energy given in (4.2). As described above, the finite element discretization of this optimization problem turns into a minimization problem with nonlinear constraints. For the solution of the discrete state equation we use the finite element library FEniCS. For details we refer to [36,37,38,39,40,41,42,43,44].

To evaluate the discrete shape gradient \(\partial _\alpha \mathcal {J}_h[\alpha ]\) via the adjoint calculus we use the additional package Dolfin Adjoint [45,46,47]. As input this only requires the solution of the forward map \(\alpha \mapsto \hat{\phi }_h[\alpha ]\) via solving the state equation. In fact, Dolfin Adjoint tracks all steps of this computation and automatically configures the adjoint derivative. Based on this shape gradient, a limited-memory BFGS method with constraints [48] is used for the actual optimization (with SciPy minimization parameters \(ftol=10^{-16}\), \(gtol = 10^{-12}\) and \(maxiter=200\)).
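How these solver settings enter a SciPy call can be sketched as follows. This is a minimal illustration only: it assumes that the constrained limited-memory BFGS method corresponds to SciPy's `L-BFGS-B` with box constraints, and it replaces the shape functional and its adjoint-based gradient by a hypothetical quadratic `toy_functional` with made-up bounds.

```python
import numpy as np
from scipy.optimize import minimize

def toy_functional(alpha):
    """Hypothetical stand-in for the discrete shape functional.
    Returns the value and the gradient, as expected with jac=True."""
    target = np.array([0.5, -1.5])
    r = alpha - target
    return 0.5 * float(r @ r), r

result = minimize(
    toy_functional,
    x0=np.zeros(2),
    jac=True,                              # function returns (value, gradient)
    method="L-BFGS-B",
    bounds=[(-1.0, 1.0), (-1.0, 1.0)],     # assumed box constraints
    options={"ftol": 1e-16, "gtol": 1e-12, "maxiter": 200},
)
alpha_opt = result.x
```

In this toy problem the second component of the unconstrained minimizer lies outside the box, so the solver returns a point on the lower bound. The very small `ftol` effectively disables the stopping test on the relative decrease of the objective, so that termination is governed by the projected gradient tolerance or the iteration limit.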

Results

In this section, we present numerical results for the optimal shapes in different materials.

Figure 4 shows the needle geometries for the three different materials NiAl, CuAlNi and YBaCuO in the physical reference configuration. One clearly observes a significant influence of the material constants on the needle length. If one rescales the optimal needle geometries so that they have the same length, they become very similar. In the rest of this section we discuss the effect of various parameters on the length of the needle.

Fig. 4

Top: optimal needle geometries for NiAl, CuAlNi and YBaCuO (from top to bottom, scaled anisotropically with a factor of 0.25 in the x-direction), plotted in the physical reference configuration. The length of the needle tip is indicated by triangle pointers and a dotted line. The right boundary of the computational domain is marked by a small white gap with a horizontal continuation on the right. Bottom: fixing the tip of all needles at the same position and scaling them all to the same length, the different needle contours (NiAl solid red, CuAlNi dashed green, YBaCuO dotted blue) are compared

Influence of the order parameter \(\delta\). In Fig. 5, we show the dependence of the needle length on the order parameter \(\delta\) for the elastic parameters of NiAl. The needle length depends very strongly on \(\delta\) and diverges as \(\delta \rightarrow 0\), as was already discussed in [17, 18]. Indeed, this dependence of the needle length on the order parameter explains most of the variability observed in Fig. 4. The asymmetry reflected by the shift \(\Delta\) of the needle tip increases in absolute value with increasing \(\delta\).

Influence of material constants. The energy density depends on the three material constants \(\mathbb {C}_{11}\), \(\mathbb {C}_{12}\) and \(\mathbb {C}_{44}\). Neglecting the overall scaling, there are two remaining degrees of freedom. In what follows, we investigate the influence of the anisotropy ratio \(A=2\mathbb {C}_{44}/(\mathbb {C}_{11}-\mathbb {C}_{12})\) and the Poisson ratio \(\nu =\mathbb {C}_{12}/\mathbb {C}_{11}\) by varying the parameters A and \(\nu\) starting from the material constants \(\mathbb {C}_{11}=115.5\), \(\mathbb {C}_{12}= 45.5\) and \(\mathbb {C}_{44}=110\) of NiAl. Here, we choose fixed \(\rho _1=0.1\), \(\rho _2=0.2\), \(\rho _3=0.1\), \(\rho _4=0.5\), \(b_4=1\). In detail, for varying A we fix \(\mathbb {C}_{11}\) and \(\mathbb {C}_{12}\) and set \(\mathbb {C}_{44} = 0.5A (\mathbb {C}_{11}-\mathbb {C}_{12})\). For varying \(\nu\) and fixed A, we hold \(\mathbb {C}_{11}\) fixed and set \(\mathbb {C}_{12} = \nu \mathbb {C}_{11}\) and \(\mathbb {C}_{44}=0.5A(\mathbb {C}_{11}-\mathbb {C}_{12})\) with this modified \(\mathbb {C}_{12}\). The appropriate \(b_i\) are then calculated via Eq. (3.24). The results are shown in Fig. 6.
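The two parameter sweeps described above amount to simple arithmetic on the elastic constants, sketched below. The function names are hypothetical, and the computation of the coefficients \(b_i\) via Eq. (3.24) is not reproduced.

```python
def anisotropy_and_poisson(c11, c12, c44):
    """Return (A, nu) with A = 2*C44/(C11 - C12) and nu = C12/C11."""
    return 2.0 * c44 / (c11 - c12), c12 / c11

def c44_for_anisotropy(A, c11, c12):
    """Sweep in A: C11 and C12 are fixed, C44 = 0.5*A*(C11 - C12)."""
    return 0.5 * A * (c11 - c12)

def constants_for_poisson(nu, A, c11):
    """Sweep in nu at fixed A: C11 is fixed, C12 = nu*C11 and
    C44 = 0.5*A*(C11 - C12) with the modified C12."""
    c12 = nu * c11
    c44 = 0.5 * A * (c11 - c12)
    return c12, c44

# Reference values of NiAl as given in the text.
C11, C12, C44 = 115.5, 45.5, 110.0
A_ref, nu_ref = anisotropy_and_poisson(C11, C12, C44)

# Recover the NiAl constants from (A_ref, nu_ref) as a consistency check.
c12_back, c44_back = constants_for_poisson(nu_ref, A_ref, C11)
```

For the NiAl constants this gives \(A = 220/70 \approx 3.14\) and \(\nu = 45.5/115.5 \approx 0.39\) as the starting point of both sweeps.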

Fig. 5

Optimal shape length \(\ell\) and shift \(\Delta\) for different values of the order parameter \(\delta\)

Fig. 6

Optimal needle length \(\ell\) as a function of the anisotropy parameter A (left) and of the Poisson ratio \(\nu\) (right)

Influence of the volume fraction \(\theta\). Figure 7 shows that needle tips are shorter for larger values of the volume fraction \(\theta\), again using the elastic parameters of NiAl. Even though the underlying finite element mesh is updated for varying values of \(\theta\) to avoid overstretching of elements, the general profile of the dependence of \(\ell\) on \(\theta\) is clearly visible. The needles become longer as \(\theta\) decreases but, in contrast to the case of \(\delta\) discussed above, we do not expect the length of the needle to diverge as \(\theta \rightarrow 0\).

Fig. 7

Optimal needle length \(\ell\) for different values of the volume fraction \(\theta\)