Optimal measure transportation with respect to non-traditional costs

Artstein-Avidan, S.; Sadovsky, S.; Wyczesany, K.

doi:10.1007/s00526-022-02362-w

Optimal measure transportation with respect to non-traditional costs

Published: 06 December 2022

Volume 62, article number 35, (2023)
Cite this article

Calculus of Variations and Partial Differential Equations Aims and scope Submit manuscript

292 Accesses
4 Citations
Explore all metrics

Abstract

We study optimal mass transport problems between two measures with respect to a non-traditional cost function, i.e. a cost c which can attain the value $+\infty $. We define the notion of c-compatibility and strong c-compatibility of two measures, and prove that if there is a finite-cost plan between the measures then the measures must be c-compatible, and if in addition the two measures are strongly c-compatible, then there is an optimal plan concentrated on a c-subgradient of a c-class function. This function is the so-called potential of the plan. We give two proofs of this theorem, under slightly different assumptions. In the first we utilize the notion of c-path-boundedness, showing that strong c-compatibility implies a strong connectivity result for a directed graph associated with an optimal map. Strong connectivity of the graph implies that the c-cyclic monotonicity of the support set (which follows from classical reasoning) guarantees its c-path-boundedness, implying, in turn, the existence of a potential. We also give a constructive proof, in the case when one of the measures is discrete. This approach adopts a new notion of ‘Hall polytopes’, which we introduce and study in depth, to which we apply a version of Brouwer’s fixed point theorem to prove the existence of a potential in this case.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

EUR 32.99 /Month

Get 10 units per month
Download Article/Chapter or Ebook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Subscribe now

Buy Now

Price includes VAT (Germany)

Instant access to the full article PDF.

Institutional subscriptions

Existence, duality, and cyclical monotonicity for weak transport costs

Article Open access 02 November 2019

Semi-discrete optimal transport: a solution procedure for the unsquared Euclidean distance case

Article Open access 12 February 2020

Dual potentials for capacity constrained optimal transport

Article 09 November 2014

Notes

All considered measures are Borel measures on Polish spaces, which are complete, separable metric spaces equipped with their Borel $\sigma $-algebra (see also the beginning of Sect. 2).
When referring to a function on $X \times Y$ as “measurable” we assume it is both measurable with respect to the product $\sigma $-algebra and its fibers $f(\cdot ,y)$ and $f(x,\cdot )$ are measurable functions on X and Y respectively, for any $x\in X$ and $y\in Y$.

References

Ambrosio, L., Pratelli, A.: Existence and stability results in the ${L}^1$ theory of optimal transportation. Optimal transportation and applications, pp. 123–160. Springer, Berlin, Heidelberg (2003)
MATH Google Scholar
Artstein-Avidan, S., Barel, H., Rubinstein, Y., Sadovsky, S., Wyczesany, K.: Transportation induced by the polarity transform, in preparation
Artstein-Avidan, S., Milman, V.: Hidden structures in the class of convex functions and a new duality transform. J. Eur. Math. Soc. 13(4), 975–1004 (2011)
Article MathSciNet MATH Google Scholar
Artstein-Avidan, S., Rubinstein, Y.A.: Differential analysis of polarity: Polar Hamilton-Jacobi, conservation laws, and Monge Ampère equations. J. d’Anal. Math. 132(1), 133–156 (2017)
Article MATH Google Scholar
Artstein-Avidan, S., Sadovsky, S., Wyczesany, K.: A Rockafellar-type theorem for non-traditional costs. Adv. Math. 395, 108157 (2022)
Article MathSciNet MATH Google Scholar
Ball, K.: An elementary introduction to monotone transportation. In: Geometric aspects of functional analysis, Lecture Notes in Mathematics, pp. 41–52. Springer, Berlin, Heidelberg (2004)
Google Scholar
Barel, H.: Optimal transportation problem for polar cost, Master’s thesis, Tel Aviv University (2019)
Beiglböck, M., Goldstern, M., Maresch, G., Schachermayer, W.: Optimal and better transport plans. J. Funct. Anal. 256(6), 1907–1927 (2009)
Article MathSciNet MATH Google Scholar
Bianchini, S., Caravenna, L.: On optimality of c-cyclically monotone transference plans. Comptes Rendus Math. 348(11–12), 613–618 (2010)
Article MathSciNet MATH Google Scholar
Brenier, Y.: Polar factorization and monotone rearrangement of vector-valued functions. Commun. Pure Appl. Math. 44(4), 375–417 (1991)
Article MathSciNet MATH Google Scholar
Caffarelli, L.A.: The regularity of map**s with a convex potential. J. Am. Math. Soc. 5(1), 99–104 (1992)
Article MathSciNet MATH Google Scholar
Gangbo, W., McCann, R.J.: The geometry of optimal transportation. Acta Math. 177(2), 113–161 (1996)
Article MathSciNet MATH Google Scholar
Kantorovich, L.V.: On the transfer of masses. Dokl. Acad. Sci. SSSR 37, 227–229 (1942)
Google Scholar
Kantorovich, L.V.: On a problem of Monge. Uspekhi Mat. Nauk. 3, 225–226 (1948)
Google Scholar
McCann, R.J.: A convexity theory for interacting gases and equilibrium crystals, Ph.D. thesis, Princeton University (1994)
Rochet, J.C.: A necessary and sufficient condition for rationalizability in a quasi-linear context. J. Math. Econ. 16(2), 191–200 (1987)
Article MATH Google Scholar
Rockafellar, R.T.: Characterization of the subdifferentials of convex functions. Pac. J. Math. 17(3), 497–510 (1966)
Article MathSciNet MATH Google Scholar
Rüschendorf, L.: On c-optimal random variables. Stat. Probab. Lett. 27(3), 267–270 (1996)
Article MathSciNet MATH Google Scholar
Smith, C., Knott, M.: On Hoeffding-Fréchet bounds and cyclic monotone relations. J. Multivar. Anal. 40(2), 328–334 (1992)
Article MATH Google Scholar
Strassen, V.: The existence of probability measures with given marginals. Ann. Math. Stat. 36(2), 423–439 (1965)
Article MathSciNet MATH Google Scholar
Trudinger, N.S., Wang, X.: On strict convexity and continuous differentiability of potential functions in optimal transportation. Arch. Ration. Mech. Anal. 192(3), 403–418 (2009)
Article MathSciNet MATH Google Scholar
Villani, C.: Topics in optimal transportation, vol. 58. Graduate Studies in Mathematics American Mathematical Society, Ann Arbor (2003)
MATH Google Scholar
Villani, C.: Optimal transport: old and new, vol. 338. Springer Science & Business Media, Springer, New York, (2008)
MATH Google Scholar
Wyczesany, K.: Topics in high-dimensional geometry and optimal transport, Ph.D. thesis, University of Cambridge (2020)

Download references

Acknowledgements

The authors were partially supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 770127). The first named author was partially supported by ISF Grant No. 784/20. The second named author is grateful to the Azrieli foundation for the award of an Azrieli fellowship. We thank the anonymous referee whose comments helped improve and clarify this manuscript.

Author information

Authors and Affiliations

School of Mathematical Sciences, Tel Aviv University, 69978, Tel Aviv, Israel
S. Artstein-Avidan, S. Sadovsky & K. Wyczesany

Authors

S. Artstein-Avidan
View author publications
You can also search for this author in PubMed Google Scholar
S. Sadovsky
View author publications
You can also search for this author in PubMed Google Scholar
K. Wyczesany
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to S. Sadovsky.

Additional information

Communicated by A. Mondino.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: c-subgradients and polar subgradients

Since c-subgradients play such an important role in this theory, we gather here some relevant information regarding them but which we did not include in the main text so as not to disturb its flow.

Let us recall that given a function $\varphi $ in the c-class, its c-subgradient is defined by

$$\begin{aligned} \partial ^c\varphi = \left\{ (x,y):\, \varphi (x)+\varphi ^c(y)=c(x,y) \,\text { and } \, c(x,y)<\infty \right\} . \end{aligned}$$

Denoting by $\partial ^c\varphi (x)$ the set of points $y\in Y$ for which $(x,y)\in \partial ^c \varphi $, we have by the definition that

$$\begin{aligned} x\in \partial ^c\varphi ^c(y)\;\;\Leftrightarrow \;\;y\in \partial ^c\varphi (x). \end{aligned}$$

Notice that $y\in \partial ^c\varphi (x)$ if and only if the function $c(\cdot ,y)-\varphi ^c(y)$ is above $\varphi $ and coincides with it at x. This provides the first simple but useful way to think about c-subgradients, summarized in Lemma A.1. Given a function $\varphi $ in the c-class, it is the image, under the c-transform, of another c-class function $\psi = \varphi ^c$ and therefore, it can be written as an infimum over basic functions as follows:

$$\begin{aligned} \varphi (x) = \inf _y \left( c(x,y) - \varphi ^c(y)\right) . \end{aligned}$$

All the functions on the right hand side lie above $\varphi $. If any one of the basic functions (indexed by y) on the right hand side is equal to $\varphi $ at the point x, then the pair (x, y) belongs to $\partial ^c \varphi $, and $y \in \partial ^c \varphi (x)$.

Lemma A.1

Let $\varphi $ be a c-class function, and $x\in X$ and assume that $\varphi (x)<\infty $. Then $y_0\in \partial ^c \varphi (x)$ if and only if $c(x,y_0)<\infty $ and the function $\ell (z)=c(z,y_0)-c(x,y_0)+\varphi (x)$ satisfies

$$\begin{aligned} \ell (z)\ge \varphi (z) \ \text { for all } \ z\in X. \end{aligned}$$

Proof

By the definition we have that $y_0\in \partial ^c \varphi (x)$ if and only if

$$\begin{aligned} \varphi (x)+ \varphi ^c(y_0)=c(x,y_0)<\infty . \end{aligned}$$

Using the definition of the c-transform we see that

$$\begin{aligned} \varphi (x) = c(x,y_0)-\varphi ^c (y_0)= \sup _z(c(x,y_0)-c(z,y_0)+\varphi (z)), \end{aligned}$$

which holds if and only if for all z we have $c(z,y_0)-c(x,y_0)+\varphi (x) \ge \varphi (z)$. $\square $

It is useful to understand the structure of the c-subgradient of the basic functions. In parallel to the classical case, where the linear functions have constant subgradient, we show that under mild assumptions the same is true for c-subgradients of basic functions. This was, of course, our motivation for using the specific candidates for the potential functions in Sect. 5.

Lemma A.2

Let X, Y be Polish spaces and let $c:X\times Y \rightarrow (-\infty , \infty ]$ be a measurable cost function. Consider a basic function $\varphi (x) = c(x,y_0)+ t$ for some $y_0 \in Y$. If $c(x,y_0)<\infty $, then $y_0\in \partial ^c\varphi (x)$. If, in addition, for any $y_1\ne y_0$ we have that $\inf _z\left( c(z,y_1) - c(z,y_0)\right) $ is not attained at x (for example, if the infimum is $-\infty $, or bounded but not attained at all) then $\{y_0 \}= \partial ^c\varphi (x)$.

Proof

Indeed, let $\varphi $ be as in the statement. From the definition it follows that $y\in \partial ^c\varphi (x)$ if and only if $c(x,y)<\infty $ and

$$\begin{aligned} c(x,y)-\varphi (x)=\varphi ^c(y) =\inf _{z} ( c(z,y)-\varphi (z)), \end{aligned}$$

which can be reformulated as

$$\begin{aligned} c(x,y)-\varphi (x) \le c(z,y_1)-\varphi (z) \ \ \ \text { for all } z\in X. \end{aligned}$$

Plugging in the definition of $\varphi $ we get

$$\begin{aligned} c(x,y)-c(x,y_0)\le c(z,y)-c(z,y_0) \ \ \ \text { for all } z\in X. \end{aligned}$$

We see that $y=y_0$ always satisfies the equality, so that $y_0\in \partial ^c \varphi (x)$. Clearly for $y_1\ne y_0$, such an inequality means precisely that the infimum is attained at x. $\square $

An important and motivating first example is the one coming from the clasical cost function $c(x,y) = -\langle x,y \rangle $.

Example A.3

For the cost function $c(x,y)=-\langle x,y \rangle $, whose transport plans and maps coincide with those associated to the quadratic cost, the c-subgradient coincides, up to a minus sign, with the well known subgradient. More formally, a function $\varphi $ is in the c-class if and only if $-\varphi \in \textrm{Cvx}({\mathbb {R}}^n)$, namely is convex and lower semi-continuous. Denoting $\psi = -\varphi $ and using the definition of the c-transform we see that $(x,y)\in \partial ^c \varphi $ if and only if for all $z\in X$ we have

$$\begin{aligned} \psi (z) - \psi (x) = \varphi (x)-\varphi (z)\ge c(x,y)-c(z,y). \end{aligned}$$

Plugging in the cost we indeed get that $y\in \partial ^c \varphi (x)$ if for all z it holds that $ \psi (x) + \langle z-x,y \rangle \le \psi (z)$, namely $y\in \partial \psi (x)$.

The second motivating example, which is our main point of interest, is that of the polar cost $p:{\mathbb {R}}^n \times {\mathbb {R}}^n \rightarrow (-\infty , \infty ]$, which we once again recall

$$\begin{aligned} p(x,y) =- \ln (\langle x,y \rangle -1) = {\left\{ \begin{array}{ll} - \ln (\langle x,y \rangle -1), &{} \ \ \text {if } \langle x,y \rangle >1 \\ +\infty , &{} \ \ \text {otherwise.} \end{array}\right. } \end{aligned}$$

It was shown in [7] that for the polar cost the p-class consists of all functions of the form $-\ln (\varphi )$, where $\varphi $ is a geometric convex function, that is, a lower semi-continuous non-negative convex function with $\varphi (0)=0$. We denote this class by $\textrm{Cvx}_0({\mathbb {R}}^n)$. The associated cost transform is linked with the ${{\mathcal {A}}}$-transform defined in [3] and given by

$$\begin{aligned} {{\mathcal {A}}}\varphi (y)= \sup _{\{x:\, \langle x,y\rangle >1\}} \frac{\langle x,y\rangle -1}{\varphi (x)}. \end{aligned}$$

(16)

More precisely, one may easily verify that $- \ln ({{\mathcal {A}}}\varphi )=(-\ln (\varphi ))^p$. Further, the p-subgradient of the function $-\ln (\varphi )$ can be rewritten as the polar subgradient $\partial ^\circ $, introduced in [4], of the function $\varphi \in \textrm{Cvx}_0({\mathbb {R}}^n)$. Indeed, we have that

$$\begin{aligned} \partial ^p(-\ln (\varphi ))= \partial ^\circ \varphi = \left\{ (x,y): \varphi (x) {{\mathcal {A}}}\varphi (y) = \langle x,y\rangle -1>0 \right\} . \end{aligned}$$

(17)

This convenient form is a reason for us to sometimes consider a “multiplicative” setting, where the basic functions are of the form

$$\begin{aligned} \varphi _{u,t}(x) = t (\langle x,u \rangle -1)_+ = t \max (\langle x,u \rangle -1,0). \end{aligned}$$

The next lemma, which is a version of [4, Lemma 3.3], describes the connection between the polar subgradient and the classical subgradient. We will also use the following notation for the zero set $Z_\varphi = \{ x: \varphi (x) = 0\}$ and $\textrm{dom}( \varphi ) = \{x: \varphi (x) <\infty \}$ for the domain where $\varphi $ is finite.

Lemma A.4

Let $\varphi \in \textrm{Cvx}_0 ({\mathbb {R}}^n)$ and let $x\in { \mathrm dom}(\varphi ){\setminus } Z_\varphi $. Then there is a bijection between the set $\{ z\in \partial \varphi (x): \, \langle x,z\rangle \ne \varphi (x)\}$ and the set $\partial ^\circ \varphi (x)$ given by $z \mapsto y=\frac{z}{ \langle x,z \rangle - \varphi (x)}$. (Note that these sets might be empty.)

Proof

We first show that the map** has the desired range. Let $z\in \partial \varphi (x)$ with $\langle x,z\rangle \ne \varphi (x)$, which means that for every w we have $\langle w,z \rangle -\varphi (w)\le \langle x,z \rangle -\varphi (x)$. In particular, $\langle x,z\rangle -\varphi (x)> 0$. Hence, letting $y=\frac{z}{ \langle x,z \rangle - \varphi (x)}$ we have that $\langle x,y\rangle >1$.

To show that $y \in \partial ^\circ \varphi (x)$, it remains to show that $ \varphi (x){{\mathcal {A}}}\varphi (y) = \langle x,y \rangle -1$. According to the definition of ${{\mathcal {A}}}$, this holds if for every w with $\langle w,y \rangle >1$ and $\varphi (w)>0$, we have

$$\begin{aligned} \frac{\langle w,y \rangle -1}{\varphi (w)}\le \frac{\langle x,y \rangle -1}{\varphi (x)}. \end{aligned}$$

Plugging in the formula for y and rearranging gives

$$\begin{aligned} \frac{\langle w,\frac{z}{ \langle x,z \rangle - \varphi (x)} \rangle -1}{\varphi (w)}\le \frac{\langle x,\frac{z}{ \langle x,z \rangle - \varphi (x)} \rangle -1}{\varphi (x)} = {\frac{1}{ \langle x,z \rangle - \varphi (x)}}. \end{aligned}$$

Using that $\langle x,z \rangle -\varphi (x)> 0$, the above inequality is equivalent to our initial assumption $\langle w,z \rangle -\varphi (w)\le \langle x,z \rangle -\varphi (x)$.

Having established that the map** is indeed into $\partial ^{\circ }\varphi (x)$, we proceed to show that it is surjective. Given $y \in \partial ^\circ \varphi (x)$ it follows from the definition that $\langle x,y \rangle >1$. Consider

$$\begin{aligned} z = \frac{ y \varphi (x)}{\langle y,x \rangle -1}, \end{aligned}$$

(18)

which is well defined. Taking scalar product with x we see that $\langle z,x \rangle = \frac{\langle y,x \rangle \varphi (x)}{\langle y,x \rangle -1}>0$, which can be rearranged to $\langle y,x \rangle (\langle z,x \rangle -\varphi (x)) = \langle z,x \rangle >0$ so, in particular, $\langle z,x \rangle \ne \varphi (x)$. Solving for $\langle x,y \rangle $ and plugging back into the Eq. (18) we get that $y=\frac{z}{ \langle x,z \rangle - \varphi (x)}$. To prove surjectivity, it remain to show that $z\in \partial \varphi (x)$. To this end, as before, we use that if $y\in \partial ^\circ \varphi (x)$ then for any w with $\langle w,y \rangle >1$ and $\varphi (w) >0$ we have $ \frac{\langle w,y \rangle -1}{\varphi (w)}\le \frac{\langle x,y \rangle -1}{\varphi (x)}$. Plugging in the formula for y and rearranging, we get that

$$\begin{aligned} \varphi (x) + \langle w-x,z \rangle \le \varphi (w) \end{aligned}$$

holds for any w such that $\langle w,z \rangle >\langle x,z \rangle -\varphi (x)$ and $\varphi (w)>0$. In the case when w is such that $\langle w,z \rangle \le \langle x,z \rangle -\varphi (x)$, this actually means that $ \varphi (x) + \langle w-x,z \rangle \le 0$ and since the geometric convex functions are non-negative the desired inequality trivially follows. Finally, we consider the case when $w\in Z_\varphi $, i.e. when $\varphi (w)=0$. Then, plugging in z as defined in (18), we have that the inequality defining the subgradient of $\varphi $ at x becomes simply

$$\begin{aligned} \langle w,y\rangle \le 1. \end{aligned}$$

That is, we need to show that $\partial ^\circ \varphi (x)$ is contained in the polar set of $Z_\varphi $. Indeed, $y\in \partial ^\circ \varphi (x)$ implies, in particular, that $y \in \textrm{dom}\,({{\mathcal {A}}}\varphi )$ (since the value of ${{\mathcal {A}}}\varphi (y) = \frac{\langle x,y \rangle -1}{\varphi (x)} <\infty $) and it follows from the definition of ${{\mathcal {A}}}$ that $\textrm{dom}\,({{\mathcal {A}}}\varphi )\subset Z_\varphi ^\circ $, which completes the proof of surjectivity.

To show injectivity, assume that for $z,w\in \partial \varphi (x)$ we have that $ y=\frac{z}{ \langle x,z \rangle - \varphi (x)} = \frac{w}{ \langle x,w \rangle - \varphi (x)}$. This means that these vectors are parallel. Taking scalar product of each of them with x and simplifying we get that $\langle x,z \rangle = \langle x,w \rangle $, and plugging this back into the equality we get $z = w$, as claimed. $\square $

We end this appendix with one explicit example of a function and its p-subgradient. More examples and applications can be found in [7, 24] and in the forthcoming [2].

Example A.5

Let $\varphi (x) = \Vert x\Vert ^2/2$, in which case ${{{\mathcal {A}}}}\varphi (y)= \Vert y\Vert ^2/2$ and the supremum in the definition of ${{\mathcal {A}}}\varphi $ is satisfied for $x = 2y/\Vert y\Vert ^2$. Hence, $\partial ^\circ \varphi (x) = \frac{2x}{\Vert x\Vert ^2}$. Note that the map** $x\mapsto \partial ^\circ \varphi (x)$ in this case is a (rescaled) spherical inversion.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Artstein-Avidan, S., Sadovsky, S. & Wyczesany, K. Optimal measure transportation with respect to non-traditional costs. Calc. Var. 62, 35 (2023). https://doi.org/10.1007/s00526-022-02362-w

Download citation

Received: 27 July 2021
Accepted: 01 October 2022
Published: 06 December 2022
DOI: https://doi.org/10.1007/s00526-022-02362-w

Mathematics Subject Classification

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

EUR 32.99 /Month

Get 10 units per month
Download Article/Chapter or Ebook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Subscribe now

Buy Now

Price includes VAT (Germany)

Instant access to the full article PDF.

Institutional subscriptions

Optimal measure transportation with respect to non-traditional costs

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Existence, duality, and cyclical monotonicity for weak transport costs

Semi-discrete optimal transport: a solution procedure for the unsquared Euclidean distance case

Dual potentials for capacity constrained optimal transport

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix A: c-subgradients and polar subgradients

Lemma A.1

Proof

Lemma A.2

Proof

Example A.3

Lemma A.4

Proof

Example A.5

Rights and permissions

About this article

Cite this article

Mathematics Subject Classification

Subscribe and save

Buy Now

Navigation

Optimal measure transportation with respect to non-traditional costs

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Existence, duality, and cyclical monotonicity for weak transport costs

Semi-discrete optimal transport: a solution procedure for the unsquared Euclidean distance case

Dual potentials for capacity constrained optimal transport

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix A: c-subgradients and polar subgradients

Appendix A: c-subgradients and polar subgradients

Lemma A.1

Proof

Lemma A.2

Proof

Example A.3

Lemma A.4

Proof

Example A.5

Rights and permissions

About this article

Cite this article

Share this article

Mathematics Subject Classification

Subscribe and save

Buy Now

Search

Navigation