1 Introduction

The primary goal of this note is the algebraic characterization of the systems \(\textbf{F}(x)=y\) of n polynomial equations in n real variables for which the solution exists, is unique, and varies differentiably with y.

Equivalently, we formulate algebraic necessary and sufficient conditions in order for the polynomial map \(F:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) underlying the system to admit a differentiable inverse. The last condition requires \(J_F(x)=\text {det} DF(x)\) to be nowhere zero, and so we are really searching for an algebraic characterization of invertible polynomial local diffeomorphisms.

A. Bialynicki-Birula and M. Rosenlicht proved in [1] the intriguing result that, just as in the trivial case of linear operators in finite dimensions (i.e. polynomial maps \({\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) of degree one), all injective polynomial self-maps of \({\mathbb {R}}^n\) are surjective. The complex analogue of this result is the celebrated Ax-Grothendieck theorem [2].

These developments raise the possibility that perhaps other elementary features of linear systems also carry over to systems of polynomials of arbitrary degrees.

With this in mind, and since a linear map \(L:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) is invertible if and only if \({\textrm{det}} L\ne 0\), it is natural to ask if, more generally, polynomial maps on \({\mathbb {R}}^n\) with non-vanishing Jacobian determinants (i.e. polynomial local diffeomorphisms) are invertible. This was answered in the negative by Pinchuk in [3] (see also [4, 5]).

An equivalent statement for the invertibility of L is the finite-dimensional version of the classical “Fredholm alternative”: either \(Lx=y\) has a unique solution for every \(y\in {\mathbb {R}}^n\), or the auxiliary \(n\times n\) system \(Lx=0\) has a non-trivial solution.

It turns out that, unlike the condition \({\textrm{det}} L\ne 0\) for invertibility, the Fredholm alternative does admit a version for arbitrary non-singular polynomial maps. The caveat is that the auxiliary system that controls invertibility is now of type \((2n)\times (2n)\):

Theorem 1.1

Let \(F:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) be a polynomial map with \(|J_F|>0\). Then either the system \(F(x)=y\) has a unique solution for every \(y\in {\mathbb {R}}^n\), or the auxiliary system

$$\begin{aligned} F(x)-&F(y)=0 \nonumber \\ [DF(x)^{-1}]^{*} x+&[DF (y)^{-1}]^{*} y=0\end{aligned}$$
(1.1)

has a non-trivial solution. Otherwise said, the local diffeomorphism F is invertible if and only if the only solution of the above system in \({\mathbb {R}}^n \times {\mathbb {R}}^n\) is \(x=y=0\).
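As a sanity check, and not as part of the argument, Theorem 1.1 can be tested by hand in two simple cases. The one-variable map \(F(x)=x+x^3\) is chosen here purely for illustration.

$$\begin{aligned}&F(x)=Lx,\ {\textrm{det}} L\ne 0: \quad Lx-Ly=0 \Rightarrow x=y, \quad [L^{-1}]^{*}x+[L^{-1}]^{*}y=2[L^{-1}]^{*}x=0 \Rightarrow x=y=0; \\&F(x)=x+x^3,\ n=1: \quad F'(x)=1+3x^2>0, \quad F(x)=F(y)\Rightarrow x=y, \quad \frac{x}{1+3x^2}+\frac{y}{1+3y^2}=\frac{2x}{1+3x^2}=0 \Rightarrow x=y=0. \end{aligned}$$

In both cases the auxiliary system has only the null solution, in agreement with the invertibility of F.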

Matrix inversion can be computationally onerous. In this regard, the version of Theorem 1.1 given below, obtained by setting \([DF(x)^{-1}]^{*} x=z=-[DF (y)^{-1}]^{*} y\), may be more useful:

Corollary 1.2

A polynomial local diffeomorphism \(F:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) is invertible if and only if \(x=y=z=0\) is the only solution of the homogeneous polynomial system

$$\begin{aligned}&F(x)-F(y)=0 \nonumber \\&DF(x)^{*} z-x=0 \nonumber \\&DF(y)^{*}z+y=0. \end{aligned}$$
(1.2)
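The passage from the second equation of (1.1) to an inverse-free system can be spelled out as follows. This is a sketch of the substitution, assuming only that \(DF(x)\) and \(DF(y)\) are invertible, with \(z\in {\mathbb {R}}^n\) an auxiliary unknown.

$$\begin{aligned}&z:=[DF(x)^{-1}]^{*}x \iff DF(x)^{*}z=x, \\&[DF(x)^{-1}]^{*} x+[DF (y)^{-1}]^{*} y=0 \iff z=-[DF(y)^{-1}]^{*}y \iff DF(y)^{*}z=-y. \end{aligned}$$

No inverse matrices remain, and \(x=y=0\) forces \(z=0\).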

Remark 1.3

  1. (a)

    Note that (1.1) is equivalent to

    $$\begin{aligned}&F(x)-F(y)=0\nonumber \\&x+[DF (y)^{-1}DF(x)]^{*} y=0.\end{aligned}$$
    (1.3)

    It follows from the theorem that this system has a non-trivial solution in all counterexamples to the strong real Jacobian conjecture (for instance, the ones in [3,4,5]).

  2. (b)

    Why do systems come into play? Naively, the injectivity question can be understood conceptually as a matter of uniqueness in infinitely many problems: one is tasked with showing that for y fixed, and yet arbitrary, \(F(x)=F(y)\) has precisely one solution. Theorem 1.1 introduces a device that reduces the injectivity issue to the examination of uniqueness in only one problem, rather than infinitely many. The trade-off is that one has to work in a space having twice the dimension of the original one.

  3. (c)

    Unlike sufficiency, to be established in Sect. 2, the necessity half of the theorem is utterly trivial: injectivity implies that (1.3) reduces to \(x-y=0, \; x+y=0\).
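The equivalence of (1.1) and (1.3) noted in Remark 1.3 (a) amounts to multiplying the second equation of (1.1) on the left by the invertible matrix \(DF(x)^{*}\). A short verification, using only \((AB)^{*}=B^{*}A^{*}\):

$$\begin{aligned}&DF(x)^{*}\big ([DF(x)^{-1}]^{*} x+[DF (y)^{-1}]^{*} y\big )= x+DF(x)^{*}[DF (y)^{-1}]^{*} y \\&\quad = x+[DF (y)^{-1}DF(x)]^{*} y. \end{aligned}$$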

The main application of Theorem 1.1 is to the study of the Jacobian conjecture in algebraic geometry [6,7,8]. The latter can be formulated, equivalently, over \({\mathbb {C}}\) or \({\mathbb {R}}\):

(JC) \(\forall n \in {\mathbb {N}}\), if \(G:{\mathbb {C}}^n \rightarrow {\mathbb {C}}^n\) is a polynomial map with \({\textrm{det}} DG= 1\), then G is invertible.

(RJC) \(\forall n \in {\mathbb {N}}\), if \(F:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) is a polynomial map with \({\textrm{det}} DF= 1\), then F is invertible.

To be clear, the complex Jacobian conjecture in dimension n implies the real version in dimension n (by complexification), whereas the real version in dimension 2n implies the complex one in dimension n (by realification).
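The two reductions can be sketched as follows. The symbols \(F_{{\mathbb {C}}}\) (the extension of a real polynomial map F to \({\mathbb {C}}^n\)) and \(G_{{\mathbb {R}}}\) (a complex polynomial map G viewed as a real map of \({\mathbb {R}}^{2n}\)) are notation introduced only for this illustration.

$$\begin{aligned}&\text {Complexification:}\quad {\textrm{det}} DF= 1 \text { on } {\mathbb {R}}^n \;\Rightarrow \; {\textrm{det}} DF_{{\mathbb {C}}}= 1 \text { on } {\mathbb {C}}^n, \quad F=F_{{\mathbb {C}}}\vert {\mathbb {R}}^n; \\&\text {Realification:}\quad {\textrm{det}} D(G_{{\mathbb {R}}})=\vert {\textrm{det}} DG\vert ^2=1. \end{aligned}$$

In the first case, invertibility of \(F_{{\mathbb {C}}}\) makes F injective, hence surjective by [1]. In the second, invertibility of \(G_{{\mathbb {R}}}\) is the same as bijectivity of G.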

We also point out that (RJC) fails if \({\textrm{det}} DF= 1\) is replaced by \({\textrm{det}} DF\ge 1\) [4].

The conjecture below is about Systems:

Conjecture

(SJC) \(\forall n \in {\mathbb {N}}\), if \(F: {\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) is a polynomial map with \({\textrm{det}} DF= 1\), then \((x,y)=(0,0)\) is the only solution in \({\mathbb {R}}^n \times {\mathbb {R}}^n\) of the polynomial system (1.1).

From Theorem 1.1 and \({\mathrm{(JC)}} \Longleftrightarrow {\mathrm{(RJC)}}\), one obtains:

Theorem 1.4

The Jacobian conjecture holds if and only if (SJC) is true.

(SJC) deals with a fairly explicit object, namely a polynomial system. This stands in sharp contrast with the abstract criteria in [7, Thm. (2.1)], stating that any of the assertions below is equivalent to version (JC) of the Jacobian conjecture:

  1. (i)

    \({\mathbb {C}}(X)\) is Galois over \({\mathbb {C}}(F)\).

  2. (ii)

    \({\mathbb {C}}[X]\) is a projective \({\mathbb {C}}[F]\)-module.

  3. (iii)

    The integral closure of \({\mathbb {C}}[F]\) in \({\mathbb {C}}[X]\) is unramified over \({\mathbb {C}}[F]\).

The idea for the proof of Theorem 1.1 comes from differential geometry. It is often the case that the solution of a geometric minimization problem yields a canonical representative of some class. For instance, there is a unique closed geodesic that minimizes length in each non-trivial free homotopy class of loops in a compact Riemannian manifold of negative curvature. An illustrative example is the “waist” of a truncated catenoid.

In our setting, the strategy is a simple one. Given a non-injective local diffeomorphism \(F:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\), we look for an “optimal” pair \((x,y)\) off the diagonal D of \({\mathbb {R}}^n \times {\mathbb {R}}^n\) that realizes lack of injectivity. Specifically, we take the infimum of the quantity \(|x|^2+ |y|^2\) over those points \((x,y)\in ({\mathbb {R}}^n \times {\mathbb {R}}^n)-D\) for which \(F(x)=F(y)\), and then analyze a minimizer whose existence follows from general compactness arguments and the inverse function theorem.
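One way to see why a system such as (1.1) should appear, offered here as a heuristic and not as the argument used in Sect. 2, is through Lagrange multipliers. Assume a minimizer \((x,y)\) off the diagonal exists; since the complement of D is open, it is a local constrained minimum. The multiplier \(\lambda \in {\mathbb {R}}^n\) and the map G are notation introduced only for this sketch.

$$\begin{aligned}&\text {minimize } \tfrac{1}{2}(|x|^2+|y|^2) \text { subject to } G(x,y):=F(x)-F(y)=0, \\&DG(x,y)=\big (DF(x), \; -DF(y)\big ) \text { is onto, since } DF(x) \text { is invertible}, \\&(x,y)=DG(x,y)^{*}\lambda \iff x=DF(x)^{*}\lambda , \;\; y=-DF(y)^{*}\lambda , \\&\lambda =[DF(x)^{-1}]^{*}x=-[DF(y)^{-1}]^{*}y \;\Longrightarrow \; [DF(x)^{-1}]^{*} x+[DF (y)^{-1}]^{*} y=0. \end{aligned}$$

Together with the constraint \(F(x)-F(y)=0\), the last relation is exactly the auxiliary system of Theorem 1.1.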

This note is similar in spirit to [9], a work that is also informed by geometry and ODEs, two subjects to which Jorge Sotomayor made significant contributions. Both papers extend to the non-linear realm particular aspects of linear algebra.

2 A variational approach to injectivity

As injective polynomial maps \({\mathbb {R}}^n\rightarrow \mathbb R^n\) are surjective [1], Theorem 1.1 follows from

Theorem 2.1

A local diffeomorphism \(F:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) of class \(C^1\) is injective if and only if \(x=y=0\) is the only solution of

$$\begin{aligned} F(x)-&F(y)=0 \nonumber \\ [DF(x)^{-1}]^{*} x+&[DF (y)^{-1}]^{*} y=0.\end{aligned}$$
(2.1)

It was observed already that if F is injective the system \(\Sigma _F\) above reduces to \(x-y=0, \; 2[DF(x)^{-1}]^{*} x=0\). For the converse, assume by contradiction that \(\Sigma _F\) has only the null solution but F is not injective. Select \(x_0, y_0 \in {\mathbb {R}}^n\), \(x_0\ne y_0\), such that \(F(x_0)=F(y_0)\). Let \(J\subset (0, \infty )\) be defined by the condition that a positive number r belongs to J if and only if there exist \(x, y\in {\mathbb {R}}^n\) such that \(x\ne y, \; F(x)=F(y), \; r= \sqrt{|x|^2+|y|^2}.\) In particular, \(r_0=\sqrt{|x_0|^2 +|y_0|^2} \in J\).

We let \(\alpha\) be the infimum of the (supposedly non-empty) set J and proceed to show that neither of the alternatives \(\alpha =0\), \(\alpha >0\), can occur. Choose a sequence \(r_j \in J\), \(r_j\le r_0\), with \(\lim _{j\rightarrow \infty } r_j=\alpha\), with corresponding \(x_j, y_j \in {\mathbb {R}}^n\) such that

$$\begin{aligned} x_j\ne y_j, \; \; F(x_j)=F(y_j), \;\;r_j=\sqrt{|x_j|^2+|y_j|^2}\le r_0.\end{aligned}$$
(2.2)

Passing to subsequences, if necessary, we may assume that \((x_j)\) and \((y_j)\), which are bounded in view of (2.2), converge to points \({\underline{x}}, {\underline{y}}\) in \({\mathbb {R}}^n\). By continuity, (2.2) implies

$$\begin{aligned} F({\underline{x}})=F({\underline{y}}), \; \alpha = \sqrt{|{\underline{x}}|^2+|{\underline{y}}|^2}. \end{aligned}$$
(2.3)

Next, we will see that the remaining piece of information from (2.2) is also preserved in the limit, namely that \({\underline{x}} \ne {\underline{y}}\). If not, since F is non-singular, by the inverse function theorem we can choose a neighborhood U of the common value \({\underline{x}}={\underline{y}}\) for which \(F\vert U\) is injective. As \(\lim x_j={\underline{x}}={\underline{y}}=\lim y_j,\) we can pick j sufficiently large so that the distinct points \(x_j, \; y_j\) belong to U. But since \(F\vert U\) is injective one must have \(F(x_j)\ne F(y_j)\), a contradiction to (2.2).

Thus, one obtains an enhanced version of (2.3):

$$\begin{aligned}&{\underline{x}} \ne {\underline{y}}, \; F({\underline{x}})=F({\underline{y}}), \;\; \alpha = \sqrt{|{\underline{x}}|^2+|{\underline{y}}|^2}. \end{aligned}$$
(2.4)

An immediate consequence of (2.4) is that the alternative \(\alpha =0\) cannot occur. Assume therefore that \(\alpha >0\) and consider

$$\begin{aligned}&\xi =(\xi _1, \xi _2): {\mathbb {R}}^n \times {\mathbb {R}}^n \rightarrow {\mathbb {R}}^{n} \times {\mathbb {R}}^n, \nonumber \\&\xi _1(x, y) = -\big (x+[DF(y)^{-1}DF(x)]^{*} y\big ), \nonumber \\&\xi _2 (x,y)=DF(y)^{-1}DF(x) \xi _1(x,y).\end{aligned}$$
(2.5)

As the map F is of class \(C^1\), \(\xi\) can be regarded as a continuous vector field on \({\mathbb {R}}^{2n}={\mathbb {R}}^n \times {\mathbb {R}}^n\), and as such it has local trajectories by Peano’s theorem on existence of solutions of systems of ordinary differential equations with continuous coefficients. Let therefore \(\phi :[0, \epsilon _1)\rightarrow {\mathbb {R}}^n \times {\mathbb {R}}^n\), \(\phi (t)=(x(t), y(t))\), be a solution, for some \(\epsilon _1>0\), of the problem

$$\begin{aligned}&\frac{dx}{dt}=\xi _1(x,y), \nonumber \\&x(0)={\underline{x}} \nonumber \\&\frac{dy}{dt}=\xi _2(x,y) \nonumber \\&y(0)={\underline{y}}. \end{aligned}$$
(2.6)

One has \(\frac{d}{dt}\big (F(x(t))-F(y(t))\big )= DF(x)\frac{dx}{dt}-DF(y)\frac{dy}{dt}= DF(x)\xi _1-DF(y)\xi _2=0,\) and so, for \(t\in (0, \epsilon _1)\),

$$\begin{aligned} F(x(t))-F(y(t))= F(x(0))-F(y(0))= F({\underline{x}})-F({\underline{y}})=0. \end{aligned}$$
(2.7)

From (2.5) and (2.6),

$$\begin{aligned}&\frac{d}{dt}\frac{1}{2}(|x|^2+|y|^2)= \langle x, \frac{dx}{dt} \rangle + \langle y, \frac{dy}{dt} \rangle = \nonumber \\&\langle x, \frac{dx}{dt} \rangle +\langle y, DF(y)^{-1} DF(x)\frac{dx}{dt} \rangle =\nonumber \\&\langle x+[DF(y)^{-1} DF(x)]^{*}y, \frac{dx}{dt} \rangle = \nonumber \\&- \vert \xi _1(x,y) \vert ^2 \le 0. \end{aligned}$$
(2.8)

We claim that the previous inequality is strict at \(({\underline{x}},{\underline{y}})\), i.e.

$$\begin{aligned} \frac{d}{dt}(|x|^2+|y|^2)\big \vert _{t=0}<0. \end{aligned}$$
(2.9)

Indeed, by (2.8) if (2.9) fails one must have \(\xi _1({\underline{x}}, {\underline{y}})= 0\) and, by (2.5),

$$\begin{aligned}&{\underline{x}}+[DF({\underline{y}})^{-1}DF({\underline{x}})]^{*} {\underline{y}} =0. \end{aligned}$$
(2.10)

Multiplying (2.10) by \([DF({\underline{x}})^{-1}]^{*}\) one obtains \([DF({\underline{x}})^{-1}]^{*} {\underline{x}}+[DF ({\underline{y}})^{-1}]^{*} {\underline{y}}=0\). From the second relation in (2.4) one then sees that \(({\underline{x}}, {\underline{y}})\) is a solution of (2.1) and so, by the main hypothesis of the theorem, \({\underline{x}}={\underline{y}}=0\). But then (2.4) implies \(\alpha =0\), a case that had been discarded already, and therefore (2.9) must hold.

Next, using (2.4) and (2.9) one can find \(\epsilon _2 \in (0, \epsilon _1)\) such that, for all \(t\in (0, \epsilon _2)\),

$$\begin{aligned}&|x(t)|^2+|y(t)|^2 < |x(0)|^2+|y(0)|^2= |{\underline{x}}|^2+|{\underline{y}}|^2=\alpha ^2. \end{aligned}$$
(2.11)

It follows from (2.7), the first relation in (2.4), and the continuity of \(\phi\), that if \(\epsilon _3 \in (0, \epsilon _2)\) is sufficiently small then \(F(x(t))=F(y(t))\) and \(x(t)\ne y(t)\) for all \(t\in (0, \epsilon _3)\).

The last relations imply, for \(t\in (0, \epsilon _3)\), and after taking square roots in (2.11), that

$$\begin{aligned}&\sqrt{ |x(t)|^2+|y(t)|^2}\in J \\&\sqrt{ |x(t)|^2+|y(t)|^2}<\alpha , \end{aligned}$$

contradicting the definition of \(\alpha\). It follows that \(J=\emptyset\), contrary to \(r_0\in J\); hence no pair \(x_0\ne y_0\) with \(F(x_0)=F(y_0)\) exists, and so F is injective.
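The descent flow (2.5)-(2.6) can be illustrated numerically. The sketch below is an assumption-laden toy, not part of the proof: it uses the planar map \((u,v)\mapsto (e^u\cos v, e^u\sin v)\), which is not polynomial but is a non-injective \(C^1\) local diffeomorphism, so that Theorem 2.1 applies. The helper names, the explicit Euler scheme, the step size and the starting pair are all choices made here for illustration.

```python
import math

def F(p):
    """Planar map (u, v) -> (e^u cos v, e^u sin v): a non-injective local diffeomorphism."""
    u, v = p
    return (math.exp(u) * math.cos(v), math.exp(u) * math.sin(v))

def DF(p):
    """Jacobian matrix of F at p, stored as a tuple of rows."""
    u, v = p
    e, c, s = math.exp(u), math.cos(v), math.sin(v)
    return ((e * c, -e * s), (e * s, e * c))

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def matvec(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(2)) for i in range(2))

def transpose(m):
    return tuple(tuple(m[j][i] for j in range(2)) for i in range(2))

def xi(x, y):
    """Vector field (2.5): xi1 = -(x + A^* y), xi2 = A xi1, with A = DF(y)^{-1} DF(x)."""
    A = matmul(inv2(DF(y)), DF(x))
    At_y = matvec(transpose(A), y)
    xi1 = tuple(-(x[i] + At_y[i]) for i in range(2))
    xi2 = matvec(A, xi1)
    return xi1, xi2

def descend(x, y, step=0.01, n_steps=2000):
    """Explicit Euler integration of the initial value problem (2.6)."""
    for _ in range(n_steps):
        xi1, xi2 = xi(x, y)
        x = tuple(x[i] + step * xi1[i] for i in range(2))
        y = tuple(y[i] + step * xi2[i] for i in range(2))
    return x, y

# Start from a pair off the diagonal with F(x0) = F(y0).
x0 = (1.0, 0.5)
y0 = (1.0, 0.5 + 2.0 * math.pi)
x_end, y_end = descend(x0, y0)
print(x_end, y_end)
```

For this particular map the matrix \(DF(y)^{-1}DF(x)\) is the identity whenever \(F(x)=F(y)\), so the flow reduces to \(x'=y'=-(x+y)\). The iterates then approach \(x=(0,-\pi )\), \(y=(0,\pi )\), a non-trivial solution of (2.1), as the theorem predicts for a non-injective map.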

3 An explicit algebraic criterion in (JC)

In this section we show that Theorem 1.1 has a counterpart for local biholomorphisms (see Remark 1.3 a) and Theorems 3.1, 3.2 below), leading to a direct criterion for (JC) to hold, without having to go through its real version (RJC). Here, \(A^{*}\) stands for the adjoint of the complex matrix A, i.e. its conjugate-transpose.

Theorem 3.1

A local biholomorphism \(F:{\mathbb {C}}^n \rightarrow {\mathbb {C}}^n\) is injective if and only if \(x=y=0\) is the only solution in \({\mathbb {C}}^n\times {\mathbb {C}}^n\) of the system

$$\begin{aligned}&F(x)-F(y)=0 \nonumber \\&x+[DF (y)^{-1}DF (x)]^{*}y=0. \end{aligned}$$
(3.1)
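As an illustration with a non-polynomial map (an example chosen here, not taken from the references), let \(n=1\) and \(F(x)=e^x\), a non-injective local biholomorphism of \({\mathbb {C}}\). In one variable the adjoint is simply complex conjugation, so (3.1) can be solved by hand:

$$\begin{aligned}&F(x)=F(y) \iff x-y=2\pi i k, \; k\in {\mathbb {Z}}, \\&[F'(y)^{-1}F'(x)]^{*}=\overline{e^{x-y}}=1 \text { on this set, so the second equation reads } x+y=0, \\&(x,y)=(\pi i k, \, -\pi i k), \; k\ne 0, \text { are non-trivial solutions of (3.1)}. \end{aligned}$$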

From [2] one obtains

Theorem 3.2

A polynomial map \(F:{\mathbb {C}}^n \rightarrow {\mathbb {C}}^n\), \(\det DF=1\), is invertible if and only if \(x=y=0\) is the only solution in \({\mathbb {C}}^n\times {\mathbb {C}}^n\) of system (3.1).

Remark 3.3

Unlike the real case, in Theorem 3.2 the system is no longer polynomial in the (complex) coordinates of x and y, since it clearly involves their conjugates as well. Notice that Corollary 1.2 also admits a complex version.

The proof of Theorem 3.1 follows closely that of Theorem 1.1, but for the benefit of the reader primarily interested in the version (JC) of the Jacobian conjecture, and not in the invertibility of local diffeomorphisms, we go over the arguments again, in a slightly modified way.

If F is injective, (3.1) reduces to \(x-y=0, x+y=0\) and the conclusion follows. Assume now, by contradiction, that the only solution of (3.1) is the null one but F is not injective, say \(F(x_0)=F(y_0)\), \(x_0\ne y_0\).

Choose \(R>\sqrt{ |x_0|^2+|y_0|^2}\), let \(B_R\) be the closed ball in \({\mathbb {C}}^n\times {\mathbb {C}}^n\) with center (0, 0) and radius R, and take D to be the diagonal of \({\mathbb {C}}^n\times \mathbb C^n\). Consider

$$\begin{aligned} {\mathcal {C}}=\{(x,y)\in B_R-D \;\vert \; F(x)=F(y)\}, \end{aligned}$$

observing that this set is non-empty since it contains \((x_0, y_0)\).

Next, we show that \({\mathcal {C}}\) is closed, hence compact. To this end, take a sequence \((x_n, y_n)\) in \({\mathcal {C}}\) that converges in \({\mathbb {C}}^n\times {\mathbb {C}}^n\) to \((a,b)\). Evidently, \((a,b)\in B_R\), \(F(a)=F(b)\), and so it remains to check that \(a\ne b\). If not, the inverse function theorem applied at \(a=b\) implies \(x_n=y_n\) for sufficiently large n, contradicting \({\mathcal {C}} \cap D=\emptyset\).

Let \(({\underline{x}}, {\underline{y}})\) be a point of absolute minimum for the function \(h(x,y)=\sqrt{|x|^2+|y|^2}\) on \({\mathcal {C}}\). Observe that \(({\underline{x}}, {\underline{y}})\) lies in the interior of \(B_R\), since \(h({\underline{x}}, {\underline{y}})\le h(x_0, y_0)<R=h\vert \partial B_R\).

Consider the local solutions, in the interior of \(B_R\), of the initial value problem corresponding to (2.6):

$$\begin{aligned}&\frac{dx}{dt}=-\big (x+[DF(y)^{-1}DF(x)]^{*} y\big ) \\&x(0)={\underline{x}} \\ {}&\frac{dy}{dt}=-DF(y)^{-1}DF(x)\big (x+[DF(y)^{-1}DF(x)]^{*} y\big ) \\ {}&y(0)={\underline{y}}. \end{aligned}$$

Since \(y'= DF(y)^{-1}DF(x) x'\), the derivative of \(F(x(t))-F(y(t))\) is zero and so

$$\begin{aligned} F(x(t))-F(y(t))= F(x(0))-F(y(0))= F({\underline{x}})-F({\underline{y}})=0. \end{aligned}$$

The computation analogous to (2.8), of the variation of h along the local solution, now needs to take real parts into account:

$$\begin{aligned}&\frac{d}{dt}\frac{1}{2}(|x|^2+|y|^2)= {\textrm{Re}} \langle x, \frac{dx}{dt} \rangle + {\textrm{Re}} \langle y, \frac{dy}{dt} \rangle \\&\quad ={\textrm{Re}} \langle x, \frac{dx}{dt} \rangle +{\textrm{Re}}\langle y, DF(y)^{-1} DF(x)\frac{dx}{dt} \rangle \\&\quad = {\textrm{Re}}\langle x+[DF(y)^{-1} DF(x)]^{*}y, \frac{dx}{dt} \rangle \\&\quad =- \big \vert x+[DF(y)^{-1} DF(x)]^{*}y \big \vert ^2 \le 0. \end{aligned}$$

From this point on the argument proceeds as in Sect. 2. If the above inequality were strict at \(t=0\), for t close to zero the quantity \(|x(t)|^2+|y(t)|^2\) would be strictly smaller than \(|{\underline{x}}|^2+|{\underline{y}}|^2\), a contradiction to the fact that \(({\underline{x}}, {\underline{y}})\) is a point of global minimum for h.

Hence, the derivative in (2.8) is zero at \(t=0\), leading to \({\underline{x}}+[DF({\underline{y}})^{-1} DF({\underline{x}})]^{*}{\underline{y}}=0\). By the main hypothesis of the theorem this implies \(({\underline{x}}, {\underline{y}})=(0,0)\), contradicting \({\mathcal {C}}\cap D=\emptyset\). Thus, there is no \((x_0, y_0)\) off the diagonal for which \(F(x_0)=F(y_0)\).

4 Remarks on polynomial homeomorphisms

Since we dealt with the issue of differentiable dependence on y of the solutions x of a polynomial system \(F(x)=y\), it is natural to look also into continuous dependence.

By the invariance of domain theorem (i.e., a continuous injective map \(F:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) is open, hence \({\mathbb {R}}^n {\mathop {\rightarrow }\limits ^{F}} F({\mathbb {R}}^n)\) is a homeomorphism), existence and uniqueness of solutions of \(F(x)=y\) already guarantee continuous dependence. Thus, the main problem in this circle of ideas is to characterize algebraically those polynomial maps \(F:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) that are homeomorphisms (instead of diffeomorphisms, as was done in the present note).

Simply put, one can leave topology aside and try to characterize algebraically the polynomial maps on \({\mathbb {R}}^n\) that are invertible.

An obvious necessary condition, arising from the fact that a homeomorphism either preserves or reverses orientation, is that the Jacobian determinant should be everywhere non-negative or non-positive, i.e. it does not change sign. Likewise, by the aforementioned invariance of domain theorem, another necessary condition is that F be an open map. In the smooth (resp. polynomial) case, openness is equivalent to discreteness (resp. finiteness) of fibers, plus the requirement that the Jacobian determinant does not change sign [10,11,12].
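Two one-variable maps, chosen here only as illustrations, show what these necessary conditions allow and exclude:

$$\begin{aligned}&F(x)=x^3: \quad J_F(x)=3x^2\ge 0, \quad F^{-1}(y)=\root 3 \of {y} \; \text {is continuous but not differentiable at } y=0; \\&F(x)=x^2: \quad J_F(x)=2x \text { changes sign at } 0, \quad F((-1,1))=[0,1) \text { is not open}. \end{aligned}$$

The first map is a polynomial homeomorphism that is not a diffeomorphism; the second fails both necessary conditions and is not injective.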

It is conceivable that besides these two conditions one needs just a single extra one in order to ensure invertibility, involving the solutions of a suitable polynomial system. However, as matters stand, it is not clear how to proceed because the passage from \(J_F>0\) to \(J_F\ge 0\) introduces several technical difficulties in our variational approach.