Computational Optimal Transport

  • Living reference work entry
Encyclopedia of Optimization

Abstract

The optimal transport (OT) problem is a classical optimization problem that takes the form of a linear program. Machine learning applications pose new computational challenges for its solution. In particular, the OT problem defines a distance between real-world objects such as images, videos, and texts, which are modeled as probability distributions. In this setting, the dimension of the resulting optimization problem is too large for classical approaches such as the network simplex or interior-point methods. This challenge was overcome by introducing entropic regularization and solving the regularized problem with the efficient Sinkhorn algorithm. A flexible alternative is the accelerated primal–dual gradient method, which can use any strongly convex regularizer. This entry discusses these algorithms together with related problems, such as approximating the Wasserstein barycenter, and efficient algorithms for solving them, including decentralized distributed algorithms.
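As a rough illustration of the entropic approach mentioned above, the following is a minimal sketch of Sinkhorn iterations for two discrete distributions. It is not taken from the entry itself: the function name sinkhorn, the regularization parameter gamma, the stopping tolerance, and the toy data are all assumptions made for this example. The sketch alternately rescales the rows and columns of the kernel exp(-C / gamma) until the transport plan approximately matches both marginals.

```python
import numpy as np

def sinkhorn(a, b, C, gamma=0.2, n_iters=5000, tol=1e-9):
    """Sketch of Sinkhorn iterations for entropy-regularized OT.

    a, b  : source and target histograms (positive, each summing to 1)
    C     : cost matrix of shape (len(a), len(b))
    gamma : entropic regularization strength (hypothetical default)
    """
    K = np.exp(-C / gamma)      # Gibbs kernel of the cost matrix
    u = np.ones_like(a)         # row scaling vector
    v = np.ones_like(b)         # column scaling vector
    for _ in range(n_iters):
        u = a / (K @ v)         # rescale so that row sums match a
        v = b / (K.T @ u)       # rescale so that column sums match b
        # After the column update only the row marginal can be off.
        row_error = np.abs(u * (K @ v) - a).sum()
        if row_error < tol:
            break
    # Transport plan diag(u) K diag(v)
    return u[:, None] * K * v[None, :]

# Toy example: two histograms on five points of [0, 1], squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2
a = np.full(5, 0.2)
b = np.array([0.1, 0.1, 0.2, 0.3, 0.3])

P = sinkhorn(a, b, C)
cost = float((P * C).sum())     # transport cost of the regularized plan
```

A smaller gamma brings the regularized plan closer to a solution of the original linear program, but typically requires more iterations and, for very small values, a log-domain implementation to avoid numerical underflow in the kernel.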

Acknowledgements

The first section of the research is supported by the Ministry of Science and Higher Education of the Russian Federation (Goszadaniye) 075-00337-20-03, project No. 0714-2020-0005.

Author information

Corresponding author

Correspondence to Nazarii Tupitsa.

Copyright information

© 2023 Springer Nature Switzerland AG

About this entry

Cite this entry

Tupitsa, N., Dvurechensky, P., Dvinskikh, D., Gasnikov, A. (2023). Computational Optimal Transport. In: Pardalos, P.M., Prokopyev, O.A. (eds) Encyclopedia of Optimization. Springer, Cham. https://doi.org/10.1007/978-3-030-54621-2_861-1

  • DOI: https://doi.org/10.1007/978-3-030-54621-2_861-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-54621-2

  • Online ISBN: 978-3-030-54621-2

  • eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering
