Nonparametric Bayesian Deep Visualization

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13713)


Abstract

Visualization methods such as t-SNE [1] have aided knowledge discovery from high-dimensional data; however, their performance may degrade when the intrinsic structure of the observations lies in a low-dimensional space, and they cannot estimate the clusters that often help to reveal the internal structure of a dataset. One solution is to visualize the latent coordinates and clusters estimated by a neural clustering model. However, such models require long computational time, since they have numerous weights to train, and the layer widths, the number of latent dimensions, and the number of clusters must all be tuned to model the latent space appropriately. Moreover, the estimated coordinates may not be suitable for visualization, because the clustering model and the visualization method are applied independently. We utilize the neural network Gaussian process (NNGP) [2], which is equivalent to a neural network whose weights are marginalized out, eliminating the need to optimize weights and layer widths. Additionally, to determine the latent dimensionality and the number of clusters without tuning, we propose a latent variable model that combines the NNGP with automatic relevance determination [3], which extracts the necessary dimensions of the latent space, and an infinite Gaussian mixture model [4], which infers the number of clusters. We integrate this model with a visualization method into nonparametric Bayesian deep visualization (NPDV), which learns latent and visual coordinates jointly so that the latent coordinates are optimal for visualization. Experimental results on image and document datasets show that NPDV achieves superior accuracy to existing methods and requires less training time than a neural clustering model because of its lower tuning cost. Furthermore, NPDV can reveal plausible latent clusters without labels.
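The NNGP [2] that the abstract builds on admits a closed-form covariance for fully connected ReLU networks, computed by marginalizing the weights layer by layer via the arc-cosine recursion of Cho and Saul [28]. The sketch below is a minimal NumPy illustration of that recursion only, not the paper's full NPDV model; the hyperparameters `sigma_w2`, `sigma_b2`, and `depth` are illustrative choices, not values from the paper.

```python
import numpy as np

def nngp_relu_kernel(X, depth=3, sigma_w2=1.0, sigma_b2=0.1):
    """NNGP covariance of a depth-`depth` fully connected ReLU network
    with weight variance sigma_w2 and bias variance sigma_b2."""
    # Base case: covariance induced by the (linear) input layer.
    K = sigma_b2 + sigma_w2 * (X @ X.T) / X.shape[1]
    for _ in range(depth):
        diag = np.sqrt(np.diag(K))
        norm = np.outer(diag, diag)                  # sqrt(K_ii * K_jj)
        cos_t = np.clip(K / norm, -1.0, 1.0)         # guard arccos domain
        theta = np.arccos(cos_t)
        # Closed-form Gaussian expectation E[relu(u) relu(v)]
        # for (u, v) with the previous layer's covariance.
        J = (np.sin(theta) + (np.pi - theta) * cos_t) / (2.0 * np.pi)
        K = sigma_b2 + sigma_w2 * norm * J
    return K
```

Once such a kernel matrix is available, the latent coordinates can be treated like inputs under any other GP prior, which is what removes the need to train weights or choose layer widths; the kernel and inference actually used by NPDV are given in the paper.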


Notes

  1. Rotatable plots are provided as an HTML file in the supplemental material.

  2. All appendices are provided in the Supplemental Materials.

  3. The network architecture is the same as that of the data generation network.

  4. http://qwone.com/~jason/20Newsgroups/.

  5. http://korpus.uib.no/icame/manuals/BROWN/INDEX.HTML.

References

  1. van der Maaten, L., Hinton, G.: Visualizing data using \(t\)-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008)

  2. Lee, J., Bahri, Y., Novak, R., Schoenholz, S.S., Pennington, J., Sohl-Dickstein, J.: Deep neural networks as Gaussian processes. In: International Conference on Learning Representations (2018)

  3. MacKay, D.J.C.: Bayesian non-linear modeling for the prediction competition. ASHRAE Trans. 100(2), 1053–1062 (1994)

  4. Rasmussen, C.E.: The infinite Gaussian mixture model. In: Proceedings of the 12th International Conference on Neural Information Processing Systems, pp. 554–560 (1999)

  5. Kruskal, J.B.: Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29, 1–27 (1964)

  6. Tenenbaum, J.B., de Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)

  7. Hinton, G.E., Roweis, S.: Stochastic neighbor embedding. In: Advances in Neural Information Processing Systems 15 (2002)

  8. Tang, J., Liu, J., Zhang, M., Mei, Q.: Visualizing large-scale and high-dimensional data. In: Proceedings of the 25th International Conference on World Wide Web, pp. 287–297 (2016)

  9. Artemenkov, A., Panov, M.: NCVis: noise contrastive approach for scalable visualization. In: Proceedings of The Web Conference 2020, pp. 2941–2947 (2020)

  10. McInnes, L., Healy, J., Saul, N., Großberger, L.: UMAP: uniform manifold approximation and projection. J. Open Source Softw. 3(29), 861 (2018)

  11. Hadsell, R., Chopra, S., LeCun, Y.: Dimensionality reduction by learning an invariant mapping. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1735–1742 (2006)

  12. van der Maaten, L., Weinberger, K.: Stochastic triplet embedding. In: Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, pp. 1–6 (2012)

  13. Wilber, M.J., Kwak, I.S., Kriegman, D.J., Belongie, S.: Learning concept embeddings with combined human-machine expertise. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 981–989 (2015)

  14. Amid, E., Warmuth, M.K.: TriMap: large-scale dimensionality reduction using triplets. arXiv preprint arXiv:1910.00204 (2019)

  15. Wang, Y., Huang, H., Rudin, C., Shaposhnik, Y.: Understanding how dimension reduction tools work: an empirical approach to deciphering t-SNE, UMAP, TriMap, and PaCMAP for data visualization. J. Mach. Learn. Res. 22(201), 1–73 (2021)

  16. Wallach, I., Lilien, R.: The protein-small-molecule database, a non-redundant structural resource for the analysis of protein-ligand binding. Bioinformatics 25(5), 615–620 (2009)

  17. Hamel, P., Eck, D.: Learning features from music audio with deep belief networks. In: Proceedings of the International Society for Music Information Retrieval Conference, pp. 339–344 (2010)

  18. Geng, X., Zhan, D.-C., Zhou, Z.-H.: Supervised nonlinear dimensionality reduction for visualization and classification. IEEE Trans. Syst. Man Cybern. 35(6), 1098–1107 (2005)

  19. Venna, J., Peltonen, J., Nybo, K., Aidos, H., Kaski, S.: Information retrieval perspective to nonlinear dimensionality reduction for data visualization. J. Mach. Learn. Res. 11, 451–490 (2010)

  20. Zheng, J., Zhang, H.H., Cattani, C., Wang, W.: Dimensionality reduction by supervised neighbor embedding using Laplacian search. Biomed. Sig. Process. Model. Complexity Living Syst. 2014, 594379 (2014)

  21. Xie, J., Girshick, R., Farhadi, A.: Unsupervised deep embedding for clustering analysis. In: Proceedings of the 33rd International Conference on Machine Learning, vol. 48, pp. 478–487 (2016)

  22. Fard, M.M., Thonet, T., Gaussier, E.: Deep \(k\)-means: jointly clustering with \(k\)-means and learning representations. arXiv preprint arXiv:1806.10069 (2018)

  23. Yang, X., Yan, Y., Huang, K., Zhang, R.: VSB-DVM: an end-to-end Bayesian nonparametric generalization of deep variational mixture model. In: 2019 IEEE International Conference on Data Mining (2019)

  24. Iwata, T., Duvenaud, D., Ghahramani, Z.: Warped mixtures for nonparametric cluster shapes. In: Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, pp. 311–320 (2013)

  25. Lawrence, N.D.: Probabilistic non-linear principal component analysis with Gaussian process latent variable models. J. Mach. Learn. Res. 6, 1783–1816 (2005)

  26. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. The MIT Press (2006)

  27. Ferguson, T.S.: A Bayesian analysis of some nonparametric problems. Ann. Stat. 1(2), 209–230 (1973)

  28. Cho, Y., Saul, L.K.: Kernel methods for deep learning. Adv. Neural Inf. Process. Syst. 22, 342–350 (2009)

  29. Sethuraman, J.: A constructive definition of Dirichlet priors. Stat. Sin. 4(2), 639–650 (1994)

  30. Zhu, J., Chen, N., Xing, E.P.: Bayesian inference with posterior regularization and applications to infinite latent SVMs. J. Mach. Learn. Res. 15(1), 1799–1847 (2014)

  31. Zellner, A.: Optimal information processing and Bayes’ theorem. Am. Stat. 42(4), 278–280 (1988)

  32. Bishop, C.M.: Pattern Recognition and Machine Learning, 1st edn. Springer (2006)

  33. Blei, D.M., Jordan, M.I.: Variational inference for Dirichlet process mixtures. Bayesian Anal. 1(1), 121–144 (2006)

  34. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: Proceedings of the 2nd International Conference on Learning Representations (2014)

  35. Titsias, M.K., Lawrence, N.D.: Bayesian Gaussian process latent variable model. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, vol. 9, pp. 844–851 (2010)

  36. Akiba, T., Sano, S., Yanase, T., Ohta, T., Koyama, M.: Optuna: a next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2623–2631 (2019)

  37. Arora, S., Liang, Y., Ma, T.: A simple but tough-to-beat baseline for sentence embeddings. In: International Conference on Learning Representations (2017)


Author information


Corresponding author

Correspondence to Haruya Ishizuka.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 4960 KB)


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Ishizuka, H., Mochihashi, D. (2023). Nonparametric Bayesian Deep Visualization. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science(), vol 13713. Springer, Cham. https://doi.org/10.1007/978-3-031-26387-3_8


  • DOI: https://doi.org/10.1007/978-3-031-26387-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26386-6

  • Online ISBN: 978-3-031-26387-3

  • eBook Packages: Computer Science (R0)
