Abstract
In this chapter, we present the deep FRAME model, or deep energy-based model, as a recursive multi-layer generalization of the original FRAME model. We also present the generator model, which can be considered a nonlinear multi-layer generalization of the factor analysis model. Such multi-layer models capture the fact that visual patterns and concepts appear at multiple levels of abstraction.
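To make the two model families concrete, the following is a minimal toy sketch, not the chapter's actual formulation: for the energy-based side, a Langevin sampling step for a density p(x) ∝ exp(−E(x)), with a stand-in quadratic energy in place of a learned multi-layer network; for the generator side, factor analysis x = Wz + ε with the linear map replaced by a small nonlinear network g(z) with random, untrained weights. All function names and dimensions here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Energy-based model (deep FRAME flavor) ---
# The model defines p(x) proportional to exp(-E(x)). A learned model would
# use a multi-layer network for E; we use a toy quadratic energy here.
def energy(x):
    return 0.5 * np.sum(x ** 2)

def grad_energy(x):
    return x  # gradient of the quadratic toy energy

def langevin_step(x, step=0.01):
    """One Langevin update: x <- x - (step/2) * dE/dx + sqrt(step) * noise."""
    noise = rng.standard_normal(x.shape)
    return x - 0.5 * step * grad_energy(x) + np.sqrt(step) * noise

# --- Generator model (nonlinear factor analysis flavor) ---
# Factor analysis: x = W z + eps. Replacing the linear map W z with a
# nonlinear map g(z) yields the generator model; g here is a tiny
# two-layer ReLU network with random weights, purely for illustration.
def generator(z, W1, W2):
    h = np.maximum(0.0, W1 @ z)  # hidden layer with ReLU nonlinearity
    return W2 @ h                # observed signal (observation noise omitted)

z = rng.standard_normal(4)            # low-dimensional latent factors
W1 = rng.standard_normal((8, 4))
W2 = rng.standard_normal((16, 8))
x = generator(z, W1, W2)
print(x.shape)  # (16,)
```

Note the structural contrast the abstract draws: the energy-based model specifies an unnormalized density over x directly (sampled here by Langevin dynamics), while the generator model produces x by pushing a low-dimensional latent vector through a nonlinear map.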
Copyright information
© 2023 Springer Nature Switzerland AG
Cite this chapter
Zhu, SC., Wu, Y.N. (2023). Deep Image Models. In: Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-030-96530-3_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96529-7
Online ISBN: 978-3-030-96530-3
eBook Packages: Computer Science, Computer Science (R0)