Abstract
Scene recognition systems that handle a large number of scene categories currently lack knowledge of the perceptual ontology of those categories and would benefit significantly from a perceptually meaningful scene representation. In this work we perform a large-scale human study to create “SceneNet”, an online ontology database for scene understanding that organizes scene categories according to their perceptual relationships. This perceptual ontology suggests that perceptual relationships do not always conform to the semantic structure between categories, and it entails a lower-dimensional perceptual space with a “perceptually meaningful” Euclidean distance, in which each embedded category is represented by a single prototype. Using the SceneNet ontology and database, we derive a computational scheme for learning a non-linear mapping of scene images into the perceptual space, where each scene image is closer to its category prototype than to any other prototype by a large margin. We then demonstrate how this approach facilitates improvements in large-scale scene categorization over state-of-the-art methods and existing semantic ontologies, and how it reveals novel perceptual findings about the discriminative power of visual attributes and the typicality of scenes.
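The idea of a perceptual space with one prototype per category can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes classical multidimensional scaling as the embedding step, uses a hypothetical toy dissimilarity matrix in place of the human-study data, and assumes that images have already been mapped into the space, so the learned non-linear, large-margin mapping itself is not shown. The function names `classical_mds` and `nearest_prototype` are illustrative only.

```python
import numpy as np


def classical_mds(dissimilarity, n_dims):
    """Embed items so that Euclidean distances approximate the dissimilarities."""
    d2 = np.asarray(dissimilarity, dtype=float) ** 2
    n = d2.shape[0]
    # Double-center the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # Keep the n_dims largest eigenpairs; clip tiny negative eigenvalues to zero.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


def nearest_prototype(points, prototypes):
    """Return, for each embedded point, the index of the closest prototype."""
    diffs = points[:, None, :] - prototypes[None, :, :]
    return np.argmin(np.linalg.norm(diffs, axis=2), axis=1)


# Hypothetical dissimilarities among three scene categories (not SceneNet data).
toy_dissimilarity = np.array([
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 3.0],
    [4.0, 3.0, 0.0],
])

# One prototype per category in a 2-D stand-in for the perceptual space.
prototypes = classical_mds(toy_dissimilarity, n_dims=2)

# Stand-ins for already-mapped images: prototypes shifted by a small offset.
embedded_images = prototypes + 0.1
labels = nearest_prototype(embedded_images, prototypes)
```

In this sketch categorization reduces to a nearest-prototype lookup under Euclidean distance, which is why the geometry of the embedding matters: categories that people judge as similar end up with nearby prototypes.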
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kadar, I., Ben-Shahar, O. (2015). SceneNet: A Perceptual Ontology for Scene Understanding. In: Agapito, L., Bronstein, M., Rother, C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8926. Springer, Cham. https://doi.org/10.1007/978-3-319-16181-5_27
DOI: https://doi.org/10.1007/978-3-319-16181-5_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16180-8
Online ISBN: 978-3-319-16181-5
eBook Packages: Computer Science, Computer Science (R0)