Deep Image Models

Chapter in Computer Vision (Springer, Cham)

Abstract

In this chapter, we present the deep FRAME model, or deep energy-based model, as a recursive multi-layer generalization of the original FRAME model. We also present the generator model, which can be considered a nonlinear multi-layer generalization of the factor analysis model. Such multi-layer models capture the fact that visual patterns and concepts appear at multiple layers of abstraction.
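
To make the two models concrete: the deep FRAME / energy-based model scores an image x with a ConvNet f_theta and defines p(x) proportional to exp(f_theta(x)), typically sampled by Langevin dynamics, while the generator model maps a latent vector z ~ N(0, I) through a ConvNet, x = g_theta(z) + eps, in direct analogy to factor analysis x = Wz + eps. The sketch below illustrates both pictures in PyTorch; the names f_theta and g_theta, the network shapes, and the sampler settings are illustrative assumptions, not the chapter's implementation.

    # Minimal sketch (assumed PyTorch; architectures and step sizes are
    # illustrative, not the chapter's implementation).
    import torch
    import torch.nn as nn

    # Deep FRAME / energy-based model: p(x) proportional to exp(f_theta(x)),
    # with f_theta a ConvNet mapping a 3x32x32 image to a scalar score.
    f_theta = nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        nn.Flatten(), nn.Linear(32 * 32 * 32, 1),
    )

    def langevin_sample(x, n_steps=50, step_size=0.01):
        # Langevin dynamics: x <- x + (s^2/2) * grad_x f_theta(x) + s * noise.
        for _ in range(n_steps):
            x = x.detach().requires_grad_(True)
            grad = torch.autograd.grad(f_theta(x).sum(), x)[0]
            x = x + 0.5 * step_size**2 * grad + step_size * torch.randn_like(x)
        return x.detach()

    # Generator model: nonlinear factor analysis, x = g_theta(z) + eps,
    # with latent factors z ~ N(0, I) decoded to an image by a ConvNet.
    g_theta = nn.Sequential(
        nn.Linear(64, 32 * 8 * 8), nn.ReLU(),
        nn.Unflatten(1, (32, 8, 8)),
        nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1),
    )

    z = torch.randn(4, 64)                                  # z ~ N(0, I)
    x_gen = g_theta(z) + 0.1 * torch.randn(4, 3, 32, 32)    # x = g(z) + eps
    x_ebm = langevin_sample(torch.randn(4, 3, 32, 32))      # sample from EBM

In the literature these two models are usually fit in complementary ways: the energy-based model by contrasting observed images with its own Langevin samples, and the generator by inferring z for each image (for instance by posterior Langevin dynamics or an inference network) and back-propagating through g_theta.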


Copyright information

© 2023 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Zhu, S.-C., & Wu, Y. N. (2023). Deep Image Models. In: Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-030-96530-3_11

  • DOI: https://doi.org/10.1007/978-3-030-96530-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-96529-7

  • Online ISBN: 978-3-030-96530-3
