Deep Image Models

Chapter in Computer Vision (Springer, Cham)

Abstract

In this chapter, we present the deep FRAME model, or deep energy-based model, as a recursive multi-layer generalization of the original FRAME model. We also present the generator model, which can be considered a nonlinear multi-layer generalization of the factor analysis model. Such multi-layer models capture the fact that visual patterns and concepts appear at multiple layers of abstraction.
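
To make the two models concrete: the deep FRAME / energy-based model scores an image x with a ConvNet f_theta and defines p(x) proportional to exp(f_theta(x)), typically sampled by Langevin dynamics, while the generator model maps a latent vector z ~ N(0, I) through a ConvNet, x = g_theta(z) + eps, in direct analogy to factor analysis x = Wz + eps. The sketch below illustrates both pictures in PyTorch; the names f_theta and g_theta, the network shapes, and the sampler settings are illustrative assumptions, not the chapter's implementation.

    # Minimal sketch (assumed PyTorch; architectures and step sizes are
    # illustrative, not the chapter's implementation).
    import torch
    import torch.nn as nn

    # Deep FRAME / energy-based model: p(x) proportional to exp(f_theta(x)),
    # with f_theta a ConvNet mapping a 3x32x32 image to a scalar score.
    f_theta = nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        nn.Flatten(), nn.Linear(32 * 32 * 32, 1),
    )

    def langevin_sample(x, n_steps=50, step_size=0.01):
        # Langevin dynamics: x <- x + (s^2/2) * grad_x f_theta(x) + s * noise.
        for _ in range(n_steps):
            x = x.detach().requires_grad_(True)
            grad = torch.autograd.grad(f_theta(x).sum(), x)[0]
            x = x + 0.5 * step_size**2 * grad + step_size * torch.randn_like(x)
        return x.detach()

    # Generator model: nonlinear factor analysis, x = g_theta(z) + eps,
    # with latent factors z ~ N(0, I) decoded to an image by a ConvNet.
    g_theta = nn.Sequential(
        nn.Linear(64, 32 * 8 * 8), nn.ReLU(),
        nn.Unflatten(1, (32, 8, 8)),
        nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1),
    )

    z = torch.randn(4, 64)                                  # z ~ N(0, I)
    x_gen = g_theta(z) + 0.1 * torch.randn(4, 3, 32, 32)    # x = g(z) + eps
    x_ebm = langevin_sample(torch.randn(4, 3, 32, 32))      # sample from EBM

In the literature these two models are usually fit in complementary ways: the energy-based model by contrasting observed images with its own Langevin samples, and the generator by inferring z for each image (for instance by posterior Langevin dynamics or an inference network) and back-propagating through g_theta.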


Copyright information

© 2023 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Zhu, S.-C., & Wu, Y. N. (2023). Deep Image Models. In: Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-030-96530-3_11

  • DOI: https://doi.org/10.1007/978-3-030-96530-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-96529-7

  • Online ISBN: 978-3-030-96530-3
