Stress Functions for Unsupervised Dimensionality Reduction

Chapter in: Nonlinear Dimensionality Reduction Techniques

Abstract

Dimensionality Reduction (DR) represents a set of points \(\{\xi_i\}\) in a high-dimensional metric data space \(\mathcal {D}\) by associated points \(\{x_i\}\) in a low-dimensional embedding space \(\mathcal {E}\). This representation defines a mapping \(\Phi : \mathcal {D} \longrightarrow \mathcal {E}\) such that \(\Phi(\xi_i) = x_i\) for all i. The mapping must preserve the structure of the data as much as possible.

Behold yon miserable creature. That Point is a Being like ourselves, but confined to the non-dimensional Gulf. He is himself his own World, his own Universe; of any other than himself he can form no conception; he knows not Length, nor Breadth, nor Height, for he has had no experience of them; he has no cognizance even of the number Two; nor has he a thought of Plurality, for he is himself his One and All, being really Nothing.

Flatland: A Romance of Many Dimensions, Edwin A. Abbott
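
In practice, the mapping \(\Phi\) is obtained by minimising a stress function that penalises discrepancies between pairwise distances measured in \(\mathcal {D}\) and in \(\mathcal {E}\). As a minimal, hypothetical sketch of this idea (not the chapter's own implementation), the Python snippet below evaluates a Kruskal-style raw stress for a toy embedding; the function name raw_stress and the toy data are illustrative assumptions.

    import numpy as np
    from scipy.spatial.distance import pdist

    def raw_stress(xi, x):
        """Kruskal-style raw stress: sum of squared differences between
        pairwise distances in the data space D and in the embedding space E."""
        d_high = pdist(xi)  # pairwise distances between the points xi_i in D
        d_low = pdist(x)    # pairwise distances between the points x_i in E
        return np.sum((d_high - d_low) ** 2)

    # Toy usage: 50 points in a 3-D data space, naively embedded in 2-D
    # by discarding the third coordinate (illustrative only).
    rng = np.random.default_rng(0)
    xi = rng.normal(size=(50, 3))
    x = xi[:, :2]
    print(raw_stress(xi, x))  # lower stress means better distance preservation

Metric MDS and Sammon's mapping minimise normalised or weighted variants of this quantity with respect to the embedded coordinates \(x_i\).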

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this chapter

Lespinats, S., Colange, B., Dutykh, D. (2022). Stress Functions for Unsupervised Dimensionality Reduction. In: Nonlinear Dimensionality Reduction Techniques. Springer, Cham. https://doi.org/10.1007/978-3-030-81026-9_5
