Action Recognition Using Super Sparse Coding Vector with Spatio-temporal Awareness

Yang, Xiaodong; Tian, YingLi

doi:10.1007/978-3-319-10605-2_47

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8690)

Included in the following conference series:

European Conference on Computer Vision

Abstract

This paper presents a novel framework for human action recognition based on sparse coding. We introduce an effective coding scheme that aggregates low-level descriptors into a super descriptor vector (SDV). To incorporate spatio-temporal information, we propose the super location vector (SLV), which models the space-time locations of local interest points far more compactly than spatio-temporal pyramid representations. SDV and SLV are finally combined into the super sparse coding vector (SSCV), which jointly models motion, appearance, and location cues. This representation is computationally efficient and yields superior performance with linear classifiers. In extensive experiments, our approach significantly outperforms state-of-the-art results on two public benchmark datasets, HMDB51 and YouTube.
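
The abstract names the pipeline (sparse coding, aggregation into SDV, a location vector SLV, concatenation into SSCV) but does not give its formulas. The following is only an illustrative sketch of that general idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the ISTA solver, the coefficient-weighted residual aggregation, the use of the mean location as anchor for every atom, the signed square root and L2 normalization, and all function names (`sparse_codes`, `weighted_residuals`, `sscv_sketch`).

```python
import numpy as np

def sparse_codes(X, D, lam=0.1, n_iter=200):
    """ISTA for min_A 0.5*||X - A @ D||_F^2 + lam*||A||_1 (stand-in solver).

    X: (n, d) local descriptors; D: (k, d) dictionary, one atom per row.
    """
    # Step size is 1 / Lipschitz constant of the gradient (spectral norm of D D^T).
    step = 1.0 / max(np.linalg.norm(D @ D.T, 2), 1e-12)
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        Z = A - step * ((A @ D - X) @ D.T)                        # gradient step
        A = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)  # soft threshold
    return A

def weighted_residuals(W, P, C):
    """Per atom k: sum_i W[i, k] * (P[i] - C[k]) / sum_i W[i, k], flattened."""
    mass = W.sum(axis=0)                 # total weight assigned to each atom
    R = W.T @ P - mass[:, None] * C      # weighted sum of residuals, shape (k, dim)
    return (R / np.maximum(mass, 1e-12)[:, None]).ravel()

def sscv_sketch(X, L, D):
    """Concatenate a descriptor part and a location part (both hypothetical forms).

    X: (n, d) descriptors; L: (n, 3) normalized (x, y, t) locations; D: (k, d).
    """
    W = np.abs(sparse_codes(X, D))                       # non-negative weights
    sdv = weighted_residuals(W, X, D)                    # descriptor residuals
    anchors = np.tile(L.mean(axis=0), (D.shape[0], 1))   # assumed location anchors
    slv = weighted_residuals(W, L, anchors)              # location residuals
    v = np.concatenate([sdv, slv])
    v = np.sign(v) * np.sqrt(np.abs(v))                  # signed square root (assumed)
    return v / max(np.linalg.norm(v), 1e-12)             # L2 normalization (assumed)
```

With `k` atoms, `d`-dimensional descriptors and 3-dimensional locations, this sketch returns a vector of length `k * (d + 3)`, which could then be passed to a linear classifier as the abstract suggests.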

Author information

Authors and Affiliations

Department of Electrical Engineering, City College, City University of New York, USA
Xiaodong Yang & YingLi Tian

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
KU Leuven, ESAT - PSI, iMinds, Kasteelpark Arenberg, 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

Cite this paper

Yang, X., Tian, Y. (2014). Action Recognition Using Super Sparse Coding Vector with Spatio-temporal Awareness. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8690. Springer, Cham. https://doi.org/10.1007/978-3-319-10605-2_47

DOI: https://doi.org/10.1007/978-3-319-10605-2_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10604-5
Online ISBN: 978-3-319-10605-2
eBook Packages: Computer Science, Computer Science (R0)
