Fully Variational Noise-Contrastive Estimation

  • Conference paper

Image Analysis (SCIA 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13886)


Abstract

Using the theory of proper scoring rules, we design a family of noise-contrastive estimation (NCE) methods that are tractable for latent variable models. Both terms in the underlying NCE loss, the one using data samples and the one using noise samples, can be lower-bounded as in variational Bayes; we therefore call this family of losses fully variational noise-contrastive estimation. Variational autoencoders are one instance of this family and can therefore also be understood as separating real data from synthetic samples using an appropriate classification loss. We further discuss other instances of this family of fully variational NCE objectives and point out differences in their empirical behavior.
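
To fix ideas, here is a minimal sketch of the two standard ingredients the abstract combines; the notation (\(\nu\), \(p_n\), \(q_\phi\)) is ours and not taken from the paper. The classical NCE objective contrasts data against samples from a noise distribution \(p_n\) with noise-to-data ratio \(\nu\),

\[\mathcal{L}_{\mathrm{NCE}}(\theta) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log \frac{p_\theta(x)}{p_\theta(x) + \nu\, p_n(x)}\right] + \nu\, \mathbb{E}_{x \sim p_n}\left[\log \frac{\nu\, p_n(x)}{p_\theta(x) + \nu\, p_n(x)}\right],\]

while for a latent variable model \(p_\theta(x) = \sum_z p_\theta(x, z)\) the evidence lower bound of variational Bayes reads

\[\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x, z) - \log q_\phi(z \mid x)\right].\]

The data term of \(\mathcal{L}_{\mathrm{NCE}}\) is increasing in \(p_\theta(x)\), so substituting the evidence lower bound there immediately yields a lower bound on that term; the noise term is decreasing in \(p_\theta(x)\), so lower-bounding it variationally as well is the nontrivial step, and how the paper achieves this for the whole family of losses is not reproduced in this sketch.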


Notes

  1. We omit the possibility of using general Bernoulli RVs for notational simplicity.

  2. For brevity we use sums to refer to marginalization of RVs, but these sums should always be understood as the appropriate Lebesgue integrals.

  3. If we use unnormalized models \(p_\theta^0\), then Eq. 37 is bounded by \(Z(\theta)\).

  4. This is only necessary for continuous latent variables, as pmfs are always in [0, 1].


Acknowledgement

This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.

Author information

Correspondence to Christopher Zach.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zach, C. (2023). Fully Variational Noise-Contrastive Estimation. In: Gade, R., Felsberg, M., Kämäräinen, JK. (eds) Image Analysis. SCIA 2023. Lecture Notes in Computer Science, vol 13886. Springer, Cham. https://doi.org/10.1007/978-3-031-31438-4_12

  • DOI: https://doi.org/10.1007/978-3-031-31438-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31437-7

  • Online ISBN: 978-3-031-31438-4

  • eBook Packages: Computer Science, Computer Science (R0)
