Abstract
The collection of video data for action recognition is highly susceptible to measurement bias: the equipment used, the camera angle, and the environmental conditions all strongly affect the distribution of the collected dataset. Training a classifier that generalizes to new data therefore becomes a very hard problem, since it is impossible to gather sufficiently general training sets. Recent approaches in the literature attempt to solve this problem by augmenting a given training set with synthetic data, so as to better represent the global distribution of the covariates. However, these approaches are limited because they rely on hand-crafted data synthesizers, which are typically hard to implement and problem-specific. In this work, we propose a different approach to these issues, which relies on the combination of two techniques, pose extraction and domain adaptation, as a means to improve the generalization capabilities of classifiers. We show that adapted skeletal representations can be retrieved automatically in a semi-supervised setting, and that these help classifiers generalize to new forms of measurement bias. We empirically validate our approach for generalizing across different camera angles.
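To make the adversarial domain-adaptation idea concrete, the following is a minimal NumPy sketch of domain-adversarial training with gradient reversal on toy feature vectors standing in for extracted skeletal poses. The linear models, dimensions, and hyperparameters (`lam`, `lr`) are illustrative assumptions, not the architecture used in the paper: a shared feature extractor minimizes the action-classification loss on the labelled source domain while maximizing a domain discriminator's loss, so it is pushed toward representations that are invariant to the domain shift (e.g. a camera-angle change).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "skeletal" feature vectors: a labelled source domain and an
# unlabelled target domain whose mean is shifted, mimicking a
# camera-angle measurement bias.
Xs = rng.normal(0.0, 1.0, (64, 8))
ys = (Xs[:, 0] > 0).astype(float)         # binary "action" label
Xt = rng.normal(0.5, 1.0, (64, 8))        # target domain: no action labels

W = rng.normal(0, 0.1, (8, 4))   # shared linear feature extractor
c = rng.normal(0, 0.1, 4)        # action-classifier head
d = rng.normal(0, 0.1, 4)        # domain-discriminator head
lam, lr, eps = 0.1, 0.05, 1e-7   # adversarial weight, step size, log guard

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

history = []
for _ in range(200):
    Fs, Ft = Xs @ W, Xt @ W
    ps = sigmoid(Fs @ c)                       # task prediction (source only)
    qs, qt = sigmoid(Fs @ d), sigmoid(Ft @ d)  # domain prediction (0=src, 1=tgt)

    # Binary cross-entropy of the action classifier on the source domain.
    history.append(-np.mean(ys * np.log(ps + eps)
                            + (1 - ys) * np.log(1 - ps + eps)))

    # Gradients of the two losses w.r.t. the pre-sigmoid logits.
    g_task = (ps - ys) / len(ys)
    g_dom_s, g_dom_t = qs / len(qs), (qt - 1) / len(qt)

    # Each head descends its own loss.
    c -= lr * Fs.T @ g_task
    d -= lr * (Fs.T @ g_dom_s + Ft.T @ g_dom_t)

    # Gradient reversal: the shared extractor descends the task loss but
    # ASCENDS the domain loss (note the minus sign on gW_dom), so W learns
    # features the discriminator cannot separate.
    gW_task = Xs.T @ np.outer(g_task, c)
    gW_dom = Xs.T @ np.outer(g_dom_s, d) + Xt.T @ np.outer(g_dom_t, d)
    W -= lr * (gW_task - lam * gW_dom)

print(f"task loss: {history[0]:.3f} -> {history[-1]:.3f}")
```

In a full system the linear maps would be replaced by deep networks over pose sequences, and `lam` would typically be annealed during training; the sketch only illustrates the sign flip that makes the feature extractor and the domain discriminator adversaries.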
Acknowledgements
This project has received funding from the Hellenic Foundation for Research and Innovation (HFRI) and the General Secretariat for Research and Technology (GSRT), under Grant Agreement No. 273 (Funding Decision: GGET122785/I2/19-07-2018). We also acknowledge support of this work by the Project SYNTELESIS “Innovative Technologies and Applications based on the Internet of Things (IoT) and the Cloud Computing” (MIS 5002521) which is implemented under the “Action for the Strategic Development on the Research and Technological Sector”, funded by the Operational Programme “Competitiveness, Entrepreneurship and Innovation” (NSRF 2014–2020) and co-financed by Greece and the European Union (European Regional Development Fund).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Pikramenos, G., Mathe, E., Vali, E. et al. An adversarial semi-supervised approach for action recognition from pose information. Neural Comput & Applic 32, 17181–17195 (2020). https://doi.org/10.1007/s00521-020-05162-5