
A survey on intelligent human action recognition techniques

Published in: Multimedia Tools and Applications

Abstract

Human Action Recognition (HAR) is an essential research area in computer vision because it enables automated video monitoring. It has numerous applications, including robotics, video surveillance, health care, elderly monitoring, crowd-behavior analysis, and the detection of aberrant activity. This survey offers the reader an up-to-date overview of the intelligent human activity recognition literature and of recent advances in the field. It reviews state-of-the-art recognition techniques, the challenges associated with identifying human activity, and the publicly available datasets, drawing on an in-depth study of works published from 2010 to 2022 that focus on intelligent techniques. All steps of human action recognition are described together with their techniques, covering datasets for Human Action Recognition, handcrafted-feature methods, Machine Learning (ML), Deep Learning (DL), hybrid deep learning, and the limitations of the area. A comparative analysis between ML and DL approaches shows their relative effectiveness in action recognition; consistent with previous research, deep learning surpasses standard machine learning for recognizing human activities. The survey also examines unexplored aspects of human action recognition that could be exploited to build systems that remain resilient in the presence of these issues, highlights the most pressing problems and research directions, describes all relevant datasets in detail, and shares our opinions and suggestions for future research. Compared with past surveys, this study offers a more systematic description of Human Action Recognition methods with respect to comparability, open problems, and the most recent evaluation techniques.
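Among the handcrafted-feature methods surveyed here are temporal templates such as the Motion History Image (MHI), in which recently moving pixels are bright and older motion decays toward zero. The following is a minimal NumPy sketch, not any specific author's implementation; the synthetic frames, `threshold`, and `tau` values are illustrative assumptions chosen only to make the idea concrete.

```python
import numpy as np

def motion_history_image(frames, threshold=0.1, tau=10):
    """Compute a Motion History Image (MHI): pixels that moved in the
    most recent frame pair get the maximum value tau, and previously
    set pixels decay linearly by 1 per frame. Result is scaled to [0, 1]."""
    mhi = np.zeros(frames[0].shape, dtype=float)
    for prev, curr in zip(frames, frames[1:]):
        moving = np.abs(curr - prev) > threshold   # per-pixel motion mask
        mhi = np.where(moving, tau, np.maximum(mhi - 1.0, 0.0))
    return mhi / tau

# Synthetic clip: a bright 2x2 block sliding rightward across 8x8 frames.
frames = []
for t in range(5):
    f = np.zeros((8, 8))
    f[3:5, t:t + 2] = 1.0
    frames.append(f)

mhi = motion_history_image(frames)
# Columns touched most recently (further right) end up brighter,
# encoding both where and when motion occurred in a single image.
```

In the classic pipeline, such a template is then summarized (e.g., by moment descriptors) and fed to a conventional classifier, which is the handcrafted-feature/ML route that the deep-learning methods in this survey are compared against.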



Data availability

N.A.

Code availability

N.A.

References

  1. Ke SR, Thuc HLU, Lee YJ, Hwang JN, Yoo JH, Choi KH (2013) A review on video-based human activity recognition. Computers 2(2): 88–131. MDPI AG. https://doi.org/10.3390/computers2020088

  2. Gupta N, Gupta SK, Pathak RK et al (2022) Human activity recognition in artificial intelligence framework: a narrative review. Artif Intell Rev 55:4755–4808. https://doi.org/10.1007/s10462-021-10116-x

    Article  Google Scholar 

  3. Laptev I, Lindeberg T (2004) Local descriptors for spatio-temporal recognition. In: International workshop on spatial coherence for visual motion analysis

    Google Scholar 

  4. Gorelick L, BlankM SE, Irani M, Basri R (2005) Actions as space-time shapes. In: The tenth IEEE international conference on computer vision (ICCV’05)

    Google Scholar 

  5. Rodriguez MD, Ahmed J, Shah M (2008) Action of MACH a spatio-temporal maximum average correlation height filter for action recognition. In: 26th IEEE conference on computer vision and pattern recognition, CVPR, pp 1–8

    Google Scholar 

  6. Kuehne H, Jhuang H, Garrote E, Poggio T, Serre T (n.d.) HMDB: a large video database for human motion recognition. In: International conference on computer vision, Barcelona, pp 2556–2563. https://doi.org/10.1109/ICCV.2011.6126543

  7. Reddy KK, Shah M (2012) Recognizing 50 human action categories of web videos. Machine Vision and Applications Journal (MVAP)

  8. Soomro K, Zamir AR, Mubarak Shah (2012) UCF101: A dataset of 101 human action classes from videos in the wild, CRCV-TR-12-01

  9. Weinland D, Boyer E, Ronfard R (2007) Action recognition from arbitrary views using 3D exemplars. In: IEEE11th international conference on computer vision, Rio de Janeiro

    Google Scholar 

  10. Kwapisz JR, Weiss GM, Moore SA (2011) Activity recognition using cell phone accelerometers. ACM SIGKDD Explorations Newsl 12(2):74–82. https://doi.org/10.1145/1964897.1964918

    Article  Google Scholar 

  11. Chen C, Jafari R, Kehtarnavaz N (2015) UTD-MHAD: a multimodal dataset for human action recognition utilizing a depth camera and a wearable inertial sensor. In: Proceedings of IEEE international conference on image processing. Canada

    Google Scholar 

  12. Heilbron FC, Escorcia V, Ghanem B, Niebles JC (n.d) ActivityNet: a large-scale video benchmark for human activity understanding. In: IEEE conference on computer vision and pattern recognition (CVPR), Boston M A

  13. Wang J, Nie X, **a Y, Wu Y, Zhu SC (2014) Cross-view action modeling, learning and recognition. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 2649–2656. https://doi.org/10.1109/CVPR.2014.339

    Chapter  Google Scholar 

  14. Rahmani H, Mahmood A, Huynh DQ, Mian A (2014) HOPC: histogram of oriented principal components of 3D pointclouds for action recognition. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8690 LNCS(PART 2):742–757. https://doi.org/10.1007/978-3-319-10605-2_48/COVER

    Article  Google Scholar 

  15. Shahroudy A, Liu J, Ng T-T, Wang G (n.d.) NTU RGB+D: a large-scale dataset for 3D human activity analysis. In: IEEE conference on computer vision and pattern recognition (CVPR)

  16. Jalal A, Kamal S, Kim D (n.d.) A depth video sensor-based life-logging human activity recognition system for elderly care in smart indoor environments. Sensors 14(7):11735–11759

  17. Liu J, Shahroudy A, Perez M, Wang G, Duan L-Y, Kot AC (n.d.) NTU RGB+D 120: a large-scale benchmark for 3D human activity understanding. In: IEEE transactions on pattern analysis and machine intelligence (TPAMI)

  18. Kay W et al (2017) The Kinetics Human Action Video Dataset. [Online]. Available: http://arxiv.org/abs/1705.06950

  19. Li A, Thotakuri M, Ross DA, Carreira J, Vostrikov A, Zisserman A (2020) The AVA-Kinetics Localized Human Actions Video Dataset, [Online]. Available: http://arxiv.org/abs/2005.00214

  20. Damen D, Doughty H, Farinella GM et al (2022) Rescaling egocentric vision: collection, pipeline and challenges for EPIC-KITCHENS-100. Int J Comput Vis 130:33–55. https://doi.org/10.1007/s11263-021-01531-2

    Article  Google Scholar 

  21. Carreira J, Noland E, Banki-Horvath A, Hillier C, Zisserman A (2018) A Short Note about Kinetics-600, [Online]. Available: http://arxiv.org/abs/1808.01340

  22. Carreira J, Noland E, Hillier C, Zisserman A (2019) A Short Note on the Kinetics-700 Human Action Dataset, [Online]. Available: http://arxiv.org/abs/1907.06987

  23. Monfort M et al (2018) Moments in Time Dataset: one million videos for event understanding, [Online]. Available: http://arxiv.org/abs/1801.03150

  24. Niebles JC, Wang H, Fei-Fei L (n.d.) Unsupervised learning of human action categories using spatio-temporal words. Int J Comput Vis 79:299–318

  25. Calderara S, Cucchiara R, Prati A (n.d.) Action signature: a novel holistic representation for action recognition. In: Proc. IEEE 5th international conference on advanced video and signal-based surveillance, pp 121–128

  26. Kalal Z, Mikolajczyk K, Matas J (2012) Tracking-learning-detection. IEEE Trans Pattern Anal Mach Intell 34(7):1409–1422. https://doi.org/10.1109/TPAMI.2011.239

    Article  Google Scholar 

  27. Iosifidis A, Tefas A, Pitas I (2012) Neural representation and learning for multi-view human action recognition. In: The 2012 international joint conference on neural networks (IJCNN), Brisbane, pp 1–6. https://doi.org/10.1109/IJCNN.2012.6252675

  28. Lu Y et al (2012) A human action recognition method based on Tchebichef moment invariants and temporal templates. In: 2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics, 2:76–79

    Chapter  Google Scholar 

  29. Ji X, Liu H (2010) Advances in view-invariant human motion analysis: a review. In: IEEE transactions on systems, man, and cybernetics, Part C (applications and reviews), 40(1):13–24. https://doi.org/10.1109/TSMCC.2009.2027608

    Chapter  Google Scholar 

  30. Estevam V, Pedrini H, Menotti D (2021) Zero-shot action recognition in videos: a survey. Neurocomputing 439:59–175. https://doi.org/10.1016/j.neucom.2021.01.036

    Article  Google Scholar 

  31. Pareek P, Thakkar A (n.d.) A survey on video-based human action recognition: recent updates, datasets, challenges, and applications. Artif Intell Rev 54:2259–2322

  32. Dang LM, Min K, Wang H, Piran MJ, Lee CH, Moon HJ (2020) Sensor-based and vision-based human activity recognition: a comprehensive survey. Pattern Recogn 108(107561):31–3203

    Google Scholar 

  33. Beddiar DR, Nini B, Sabokrou M et al (2020) Vision-based human activity recognition: a survey. Multimed Tools Appl 79:30509–30555. https://doi.org/10.1007/s11042-020-09004-3

    Article  Google Scholar 

  34. Zhang H-B, Zhang Y-X, Zhong B, Lei Q, Yang L, Du J-X, Chen D-S (2019) A comprehensive survey of vision-based human action recognition methods. Sensors 19:1005. https://doi.org/10.3390/s19051005

    Article  Google Scholar 

  35. Herath S, Harandi M, Porikli F (2017) Going deeper into action recognition: a survey. Image and Vision Computing 60:4–21. https://doi.org/10.1016/j.imavis.2017.01.010

    Article  Google Scholar 

  36. Singh PK, Kundu S, Adhikary T, Sarkar R, Bhattacharjee D (2021) Progress of Human Action Recognition Research in the Last Ten Years: A Comprehensive Survey. Arch Comput Methods Eng 29:4:2309–2349. https://doi.org/10.1007/S11831-021-09681-9

  37. Jobanputra H, Bavishi J, Doshi N (2019) Human activity recognition: a survey. Procedia Comput Sci 155:698–703. https://doi.org/10.1016/j.procs.2019.08.100

  38. Kong Y, Yun Raymond F (2018) Human action recognition and prediction: a survey. Int J Comput Vis 130:1366–1401

    Article  Google Scholar 

  39. Guangchun C, Yiwen W, Abdullah S, Kamesh N, Bill B (2015) Advances in human action recognition: A survey

  40. Vishwakarma S, Agrawal A (n.d.) A survey on activity recognition and behavior understanding in video surveillance. Vis Comput 29(10):983–1009

  41. Aggarwal JK, Ryoo MS (2011) Human activity analysis. ACM Computing Surveys (CSUR) 43:1–43

    Article  Google Scholar 

  42. Bobick AF, Davis JW (n.d.) The recognition of human movement using temporal templates. IEEE Trans Pattern Anal Mach Intell 23(3):257–267

  43. Sheikh Y, Sheikh M, Shah M (n.d.) Exploring the space of a human action. In: Tenth IEEE Int Conf on Computer Vision, pp 144–149

  44. Amor BB, Su J, Srivastava A (n.d.) Action recognition using rate-invariant analysis of skeletal shape trajectories. Trans Pattern Anal Mach Intell 38:1–13

  45. Wang H, Kläser A, Schmid C, Liu C (n.d.) Action recognition by dense trajectories. CVPR 3169–3176

  46. Laptev I, Lindeberg T (n.d.) Space-time interest points. In: Proc. 9th IEEE Int. Conf. On computer vision, pp 432–439

  47. Dollar P, Rabaud V, Cottrell G, Belongie S (2005) Behavior recognition via sparse Spatio-temporal features. In: IEEE international workshop on visual surveillance and performance evaluation of tracking and surveillance

    Google Scholar 

  48. Bregonzio M, Gong S, **ang T (2009) Recognising action as clouds of space-time interest points. In: 2009 IEEE conference on computer vision and pattern recognition, Miami, pp 1948–1955. https://doi.org/10.1109/CVPR.2009.5206779

  49. Thi TH, Zhang J, Cheng L, Wang L, Satoh S (n.d.) Human action recognition and localization in video using structured learning of local space-time features. IEEE International Conference on Advanced Video and Signal Based Surveillance, pp 204–211

  50. Sadek S, Al-Hamadi A, Michaelis B, Sayed U (n.d.) An action recognition scheme using fuzzy log-polar histogram and temporal self-similarity. EURASIP J Adv Signal Process

  51. Chaudhry R, Ravichandran A, Hager G, Vidal R (n.d.) Histograms of oriented optical flow and Binet-Cauchy kernels on nonlinear dynamical systems for the recognition of human actions. In: IEEE computer Soc. Conf. Computer. Vis. Pattern recognition work. CVPR work. IEEE, pp 1932–1939

  52. Yuan C, Li X, Hu W, Ling H, Maybank S (n.d.) 3D R transform on spatio-temporal interest points for action recognition. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 724–730

  53. Sahoo SP, Silambarasi R, Ari S (n.d.) Fusion of histogram-based features for human action recognition. In: 5th international conference on advanced computing & communication systems, pp 1012–1016

  54. Gupta S, Mazumdar S, Student M (2013) Sobel edge detection algorithm.

  55. Teoh SH, Ibrahim H (n.d) Median filtering frameworks for reducing impulse noise from grayscale digital images: a literature survey. Int J Future Comput Commun 1:323–326

  56. Le QV, Zou WY, Yeung SY, Ng AY (n.d.) Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 3361–3368

  57. Darrell T, Pentland A (n.d.) Space-time gestures. In: Proc. IEEE computer society Conf. On computer vision and pattern recognition, pp 335–340

  58. Jiang H, Drew MS, Li ZN (n.d.) Successive convex matching for action detection. In: IEEE computer society Conf. On computer vision and pattern recognition, pp 1646–1653

  59. Oliver NM, Rosario B, Pentland AP (n.d.) A Bayesian computer vision system for modelling human interactions. IEEE Trans Pattern Anal Mach Intell 22(8):831–843

  60. Shi Q, Cheng L, Wang L, Smola A (n.d.) Human action segmentation and recognition using discriminative semi-Markov models. Int J Comput Vis 93:22–32

  61. Oliver N, Horvitz E, Garg A (n.d) Layered representations for human activity recognition. In: Proc. 4th IEEE Int. Conf. On multimodal interfaces, pp 3–8

  62. Zhang D, Gatica-Perez D, Bengio S, McCowan I (n.d.) Modelling individual and group actions in meetings with layered hmms. IEEE Trans Multimed 8(3):509–520

  63. Nguyen NT, Phung DQ, Venkatesh S, Bui H (n.d.) Learning and detecting activities from movement trajectories using the hierarchical hidden Markov model, IEEE computer society Conf on computer vision and pattern recognition, pp 955–960

  64. Shi Y, Huang Y, Minnen D, Bobick A, Essa I (n.d.) Propagation networks for recognition of partially ordered sequential action. In: Proc. of IEEE computer society Conf. On computer vision and pattern recognition, pp 862–869

  65. Iosifidis A, Tefas A, Pitas I (n.d.) Action-based person identification using fuzzy representation and discriminant learning. IEEE Trans Inf Forensics Secur 7:530–542

  66. Xu W, Miao Z, Zhang X, Tian Y (n.d.) Learning a hierarchical spatio-temporal model for human activity recognition. In: International conference on acoustics, speech and signal processing (ICASSP). IEEE, New Orleans, pp 1607–1611

  67. Kitani KM, Sato Y, Sugimoto A (2007) Recovering the basic structure of human activities from a video-based symbol string. In: 2007 IEEE workshop on motion and video computing (WMVC'07), Austin, p 9. https://doi.org/10.1109/WMVC.2007.34

  68. Ivanov Y, Bobick A (n.d.) Recognition of visual activities and interactions by stochastic parsing. IEEE Trans Pattern Anal Mach Intell 22:852–872

  69. Moore D, Essa I (n.d.) Recognizing multitasked activities from video using stochastic context-free grammar. AAAI National Conference on Artificial Intelligence, pp 770–776

  70. Minnen D, Essa I, Starner T (n.d.) Expectation grammars: leveraging high-level expectations for activity recognition. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 626–632

  71. Joo SW, Chellappa R (n.d.) Attribute grammar-based event recognition and anomaly detection. IEEE Conference on Computer Vision and Pattern Recognition Workshop, pp 107–114

  72. Siskind JM (n.d.) Grounding the lexical semantics of verbs in visual perception using force dynamics and event logic. J Artif Intell Res 15:31–90

  73. Gupta A, Srinivasan P, Shi J, Davis L (n.d.) Understanding videos, constructing plots learning a visually grounded storyline model from annotated videos. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 2012–2019

  74. Ijsselmuiden J, Stiefelhagen R (n.d.) Towards high-level human activity recognition through computer vision and temporal logic. In: The 33rd annual German conference on advances in artificial intelligence, pp 426–435

  75. Khare M, Jeon M (2022) Multi-resolution approach to human activity recognition in video sequence based on combination of complex wavelet transform, Local Binary Pattern and Zernike moment. Multimed Tools Appl 81(24):34863–34892. https://doi.org/10.1007/S11042-021-11828-6/FIGURES/10

    Article  Google Scholar 

  76. Li C, Huang Q, Li X, Wu Q (2021) Human action recognition based on multi-scale feature maps from depth video sequences. Multimed Tools Appl 80(21–23):32111–32130. https://doi.org/10.1007/S11042-021-11193-4/TABLES/8

    Article  Google Scholar 

  77. Ikizler N, Duygulu PD (n.d.) Histogram of oriented rectangles: a new pose descriptor for human action recognition. Image Vis Comput 27(10):1515–1526. https://doi.org/10.1016/j.imavis.2009.02.002

  78. Kellokumpu V, Zhao G, Pietikäinen M (n.d.) Recognition of human actions using texture descriptors. Mach Vis Appl 22:767–780

  79. Kliper-Gross O, Gurovich Y, Hassner T, Wolf L (n.d.) Motion interchange patterns for action recognition in unconstrained videos. In: European conference on computer vision. Springer, Berlin/Heidelberg, pp 256–269

  80. Jiang YG, Dai Q, Xue X, Liu W, Ngo CW (n.d.) Trajectory-based modeling of human actions with motion reference points. In: European conference on computer vision. Springer, Berlin/Heidelberg, pp 425–438

  81. Wang C, Wang Y, Yuille AL (n.d.) An approach to pose-based action recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. Portland, OR, USA, pp 915–922

  82. Zanfir M, Leordeanu M, Sminchisescu C (n.d.) The moving pose: an efficient 3d kinematics descriptor for low-latency action recognition and detection. In: Proceedings of the IEEE international conference on computer vision. Sydney, Australia, pp 2752–2759

  83. Chaaraoui AA, Climent-Pérez P, Flórez-Revuelta F (n.d.) Silhouette-based human action recognition using sequences of key poses. Pattern Recogn Lett 34:1799–1807

  84. Rahman SA, Song I, Leung MK, Lee I, Lee K (n.d.) Fast action recognition using negative space features. Expert Syst Appl 41:574–587

  85. Junejo IN, Junejo KN, Al Aghbari Z (n.d) Silhouette-based human action recognition using SAX-shapes. Vis Comput 30:259–269

  86. Vishwakarma DK, Kapoor R, Dhiman A (n.d.) A proposed unified framework for the recognition of human activity by exploiting the characteristics of action dynamics. Robot Auton Syst 77:25–38

  87. Jalal A, Kim YH, Kim YJ, Kamal S, Kim D (n.d.) Robust human activity recognition from depth video using spatiotemporal multi-fused features. Pattern Recogn 61:295–308

  88. Patrona F, Chatzitofis A, Zarpalas D, Daras P (2018) Motion analysis: action detection, recognition and evaluation based on motion capture data. Pattern Recogn 76:612–622

  89. Zhang C, Xu Y, Xu Z et al (2022) Hybrid handcrafted and learned feature framework for human action recognition. Appl Intell 52:12771–12787. https://doi.org/10.1007/s10489-021-03068-w

    Article  Google Scholar 

  90. Bengio Y (n.d) Learning deep architectures for AI. Found Trends Mach Learn 2:1–127

  91. Ji S, Xu W, Yang M, Yu K (n.d.) 3D convolutional neural networks for human action recognition. IEEE Trans Pattern Anal Mach Intell 35(1):221–231

  92. Weimer D, Scholz-Reiter B, Shpitalni M (n.d.) Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection. CIRP Ann Manuf Technol 65(1):417–420

  93. Le QV (n.d.) Building high-level features using large scale unsupervised learning. In: 2013 IEEE Int. Conf. On acoustics, speech and signal processing (ICASSP)

  94. Huang Y, Lai S-H, Tai S-H (n.d.) Human action recognition based on temporal pose CNN and multidimensional fusion. In: Proceedings of the European conference on computer vision (ECCV)

  95. Min S, Lee B, Yoon S (2017) Deep learning in bioinformatics. Brief Bioinform 18(5):851–869. https://doi.org/10.1093/bib/bbw068

    Article  Google Scholar 

  96. Krizhevsky A, Sutskever I, Hinton GE (n.d.) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, Lake Tahoe, Nevada, pp 1097–1105

  97. Karpathy A, Toderici G, Shetty S, Leung T, Sukthankar R, Li FF (n.d.) Large-scale video classification with convolutional neural networks, Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit, pp 1725–1732

  98. Ravi D, Wong C, Lo B, Yang GZ (n.d.) Deep learning for human action recognition: a resource efficient implementation on low-power devices. In: BSN 2016—13th annual body sensor networks conference, pp 71–76

  99. Marjaneh S, Hassan F (2017) Single image action recognition by predicting space-time saliency

  100. Banerjee A, Singh PK, Sarkar R (n.d.) Fuzzy integral based CNN classifier fusion for 3D skeleton action recognition. IEEE Trans Circ Syst Video Technol 31(6):2206–2216

  101. Ng A (n.d.) Sparse autoencoder. CS294A Lect Note 72:1–19

  102. Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol P-A (n.d.) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11:3371–3408

  103. Hasan M, Roy-Chowdhury AK (n.d.) A continuous learning framework for activity recognition using deep hybrid feature models. IEEE Trans Multimed 17:11

  104. Wang X, Gao L, Song J, Zhen X, Sebe N, Shen HT (n.d.) Deep appearance and motion learning for egocentric activity recognition. Neurocomputing 275:438–447

  105. Gao X, Luo H, Wang Q, Zhao F, Ye L, Zhang Y (2019) A human activity recognition algorithm based on stacking Denoising autoencoder and LightGBM. Sensors. 19(4):947. https://doi.org/10.3390/s19040947

    Article  Google Scholar 

  106. Du Y, Wang W, Wang L (n.d.) Hierarchical recurrent neural network for skeleton-based action recognition. Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 1110–1118

  107. Graves A (n.d.) Generating sequences with recurrent neural networks.

  108. Salehinejad H, Sankar S, Barfett J, Colak E, Valaee S (n.d.) Recent advances in recurrent neural networks.

  109. Qi M, Wang Y, Qin J, Li A, Luo J, Gool L (n.d.) stagNet: an attentive semantic RNN for group action and individual action recognition. IEEE Trans Circ Syst Video Technol 30:1

  110. Liu J, Shahroudy A, Xu D, Wang G (n.d.) Spatio-temporal LSTM with trust gates for 3D human action recognition. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), vol 9907. LNCS, pp 816–833

  111. Cho K et al (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 conference on empirical methods in natural language processing, pp 1724–1734

  112. Goodfellow I et al (n.d.) Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp 2672–2680

  113. Huang GB, Lee H, Learned-Miller E (n.d.) Learning hierarchical representations for face verifcation with convolutional deep belief networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR’ 12), pp 2518–2525

  114. Radford A, Metz L, Chintala S (n.d.) Unsupervised representation learning with deep convolutional generative adversarial networks.

  115. Zadeh MZ, Babu AR, Jaiswal A, Makedon F (n.d.) Self-supervised human activity recognition by augmenting generative adversarial networks, p 11755

  116. Li R, Pan J, Li Z, Tang J (n.d.) Single image Dehazing via conditional generative adversarial network.

  117. Yang Y, Hou C, Lang Y, Guan D, Huang D, Xu J (n.d.) Open-set human activity recognition based on micro-Doppler signatures. Pattern Recogn 85:60–69

  118. Gammulle H, Denman S, Sridharan S, Fookes C (2019) Multi-level sequence GAN for group activity recognition. In: Jawahar C, Li H, Mori G, Schindler K (eds) Computer vision – ACCV 2018. Lecture notes in computer science(), vol 11361. Springer, Cham. https://doi.org/10.1007/978-3-030-20887-5_21

  119. Ahsan U, Sun C, Essa I (n.d.) DiscrimNet: semi-supervised action recognition from videos using generative adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops. Women in computer vision (WiCV’17)

  120. Donahue J et al (n.d.) Long-term recurrent convolutional networks for visual recognition and description. CVPR

  121. Kar A, Rai N, Sikka K, Sharma G (n.d.) Adascan: adaptive scan pooling in deep convolutional neural networks for human action recognition in videos. CVPR

  122. Jaouedi N, Boujnah N, Bouhlel MS (n.d.) A new hybrid deep learning model for human action recognition. J King Saud Univ - Comput Inf Sci 32

  123. Gowda SN (2017) Human activity recognition using combinatorial deep belief networks. In: IEEE computer society conference on computer vision and pattern recognition workshops, pp 1589–1594. https://doi.org/10.1109/CVPRW.2017.203

    Chapter  Google Scholar 

  124. Wu Z, Wang X, Jiang Y-G, Ye H, Xue X (2015) Modeling spatial-temporal clues in a hybrid deep learning framework for video classification. In: Proceedings of the 23rd ACM international conference on multimedia (MM '15). Association for Computing Machinery, New York, pp 461–470. https://doi.org/10.1145/2733373.2806222

    Chapter  Google Scholar 

  125. Lv M, Xu W, Chen T (n.d.) A hybrid deep convolutional and recurrent neural network for complex activity recognition using multimodal sensors. Neurocomputing 362

  126. Ij**a EP, Mohan CK (n.d.) Hybrid deep neural network model for human action recognition. Appl. Soft Comput 46:936–952

  127. Al-Azzawi NA (n.d.) Human action recognition based on hybrid deep learning model and Shearlet transform. In: 2020 12th international conference on information technology and electrical engineering (ICITEE, Yogyakarta), pp 152–155

  128. Yadav SK, Tiwari K, Pandey HM, Akbar SA (2022) Skeleton-based human activity recognition using ConvLSTM and guided feature learning. Soft comput 26(2):877–890. https://doi.org/10.1007/S00500-021-06238-7/FIGURES/11

    Article  Google Scholar 

  129. Wensel J, Ullah H, Member S, Munir A, Member S (2022) ViT-ReT: Vision and Recurrent Transformer Neural Networks for Human Activity Recognition in Videos. Accessed: May 11, 2023. [Online]. Available: https://arxiv.org/abs/2208.07929v2

  130. Challa SK, Kumar A, Semwal VB (2022) A multibranch CNN-BiLSTM model for human activity recognition using wearable sensor data. Vis Comput 38(12):4095–4109. https://doi.org/10.1007/S00371-021-02283-3/TABLES/7

    Article  Google Scholar 

  131. Jiang N, Quan W, Geng Q, Shi Z, Xu P (2023) Exploiting 3D human recovery for action recognition with Spatio-temporal bifurcation fusion. In: ICASSP 2023–2023 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 1–5. https://doi.org/10.1109/ICASSP49357.2023.10096404

    Chapter  Google Scholar 

  132. Merlo E, Lagomarsino M, Lamon E, Ajoudani A (2023) Automatic Interaction and Activity Recognition from Videos of Human Manual Demonstrations with Application to Anomaly Detection. Accessed: May 12, 2023. [Online]. Available: http://arxiv.org/abs/2304.09789

  133. Usmani A, Siddiqui N, Islam S (2023) Skeleton joint trajectories based human activity recognition using deep RNN. Multimed Tools Applic 2023:1–25. https://doi.org/10.1007/S11042-023-15024-6

    Article  Google Scholar 

  134. Yin M, He S, Soomro TA, Yuan H (2023) Efficient skeleton-based action recognition via multi-stream depthwise separable convolutional neural network. Expert Syst Appl 226:120080. https://doi.org/10.1016/J.ESWA.2023.120080

    Article  Google Scholar 

  135. Barkoky A, Charkari NM (2022) Complex Network-based features extraction in RGB-D human action recognition. J Vis Commun Image Represent 82:103371. https://doi.org/10.1016/J.JVCIR.2021.103371

    Article  Google Scholar 

  136. Deng L (n.d.) A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Trans Signal Inf Process 3:2

  137. Dosovitskiy A, Fischer P, Springenberg JT (n.d.) Discriminative unsupervised feature learning with exemplar convolutional neural networks. IEEE Trans Pattern Anal Mach Intell 38(9):1734–1747

  138. Núñez JC, Cabido R, Pantrigo JJ, Montemayor AS, Vélez JF (2018) Convolutional neural networks and long short-term memory for skeleton-based human activity and hand gesture recognition. Pattern Recogn 76

  139. Dobhal T, Shitole V, Thomas G, Navada G (n.d.) Human activity recognition using binary motion image and deep learning. Procedia Comput Sci 58:178–185

  140. Khelalef A, Ababsa F, Benoudjit N (2019) An efficient human activity recognition technique based on deep learning. Pattern Recognit Image Anal 29:702–715

  141. Si C, Chen W, Wang W, Wang L, Tan T (n.d.) An attention enhanced graph convolutional LSTM network for skeleton-based action recognition, Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1227–1236

  142. Majd M, Safabakhsh R (2020) Correlational convolutional LSTM for human action recognition. Neurocomputing 396:224–229. https://doi.org/10.1016/j.neucom.2018.10.095

    Article  Google Scholar 

  143. Dai C, Liu X, Lai J (n.d.) Human action recognition using two-stream attention-based LSTM networks. Appl Soft Comput

  144. Simonyan K, Zisserman A (n.d.) Two-stream convolutional networks for action recognition in videos. In: Advances in neural information processing systems, pp. 568–576

  145. Ullah A, Muhammad K, Ser JD, Baik SW, Albuquerque VHC (n.d.) Activity Recognition Using Temporal Optical Flow Convolutional Features and Multilayer LSTM. IEEE Trans Ind Electr 66(12):9692–9702

  146. Hinton GE, Osindero S, Teh Y-W (n.d.) A fast-learning algorithm for deep belief nets. Neural Comput 18:1527–1554

  147. Uddin MZ (n.d.) Facial expression recognition utilizing local direction-based robust features and deep belief network. IEEE Access 5:4525–4536

  148. Sheeba PT, SSM, Rani SD (n.d.) Fuzzy Based Deep Belief Network for Activity Recognition. In: Proceedings of International Conference on Recent Trends in Computing, Communication & Networking Technologies (ICRTCCNT)

  149. Lee H, Grosse R, Ranganath R, Ng AY (n.d.) Unsupervised learning of hierarchical representations with convolutional deep belief networks. Commun ACM 54(10):95–103

  150. Li X et al (n.d.) Region-based Activity Recognition Using Conditional GAN. In: Proceedings of the 25th ACM international conference on Multimedia Association for Computing Machinery, New York, NY, USA, pp. 1059–1067

  151. Savadi Hosseini M, Ghaderi F (n.d.) A Hybrid Deep Learning Architecture Using 3D CNNs and GRUs for Human Action Recognition. Int J Eng 33(5):959–965

  152. Wang L, Qiao Y (n.d.) Tang X Action recognition with trajectory-pooled deep-convolutional descriptors. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4305–4314

  153. Ullah A, Muhammad K, Haq IU, Baik SW (2019) Action recognition using optimized deep autoencoder and CNN for surveillance data streams of non-stationary environments. Future Gener Comput Syst 96:386–397

  154. Shi Y, Tian Y, Wang Y, Huang T (n.d.) Sequential deep trajectory descriptor for action recognition with three-stream cnn. IEEE Trans Multimed 19(7):1510–1520

  155. Liu M, Liu H, Chen C (n.d.) Enhanced skeleton visualization for view invariant human action recognition. Pattern Recognit 68:346–362

  156. Li C, Wang P, Wang S, Hou Y, Li W (n.d.) Skeleton-based action recognition using LSTM and CNN. In: IEEE international conference on multimedia and expo workshops (ICMEW). IEEE, pp. 585–590

  157. Das S, Chaudhary A, Bremond F, Thonnat M (n.d.) Where to focus on for human action recognition? In: IEEE winter conference on applications of computer vision (WACV). IEEE, pp. 71–80

  158. Ijjina EP, Chalavadi KM (2017) Human action recognition in RGB-D videos using motion sequence information and deep learning. Pattern Recognit 72:504–516

  159. Verma P, Sah A, Srivastava R (n.d.) Deep learning-based multi-modal approach using RGB and skeleton sequences for human activity recognition. Multimed Syst 26:671–685

  160. Tanberk S, Kilimci ZH, Tükel DB, Uysal M, Akyokuş S (n.d.) A Hybrid Deep Model Using Deep Learning and Dense Optical Flow Approaches for Human Activity Recognition. IEEE Access 8:19799–19809

  161. Singh T, Vishwakarma DK (n.d.) A deeply coupled ConvNet for human activity recognition using dynamic and RGB images. Neural Comput Applic 33:469–485

  162. Mukherjee D, Mondal R, Singh PK (n.d.) EnsemConvNet: a deep learning approach for human activity recognition using smartphone sensors for healthcare applications. Multimed Tools Appl 79:31663–31690

  163. Tasnim N, Islam MK, Baek J-H (2021) Deep Learning Based Human Activity Recognition Using Spatio-Temporal Image Formation of Skeleton Joints. Appl Sci 11(6):2675


  164. Bilal M, Maqsood M, Yasmin S (n.d.) A transfer learning-based efficient spatiotemporal human action recognition framework for long and overlapping action classes. J Supercomput 78:2873–2908

  165. Muhammad K et al (n.d.) Human action recognition using attention-based LSTM network with dilated CNN features. Future Gener Comput Syst 125:820–830

  166. Andrade-Ambriz YA, Ledesma S, Ibarra-Manzano M-A, Oros-Flores MI, Almanza-Ojeda D-L (2022) Human activity recognition using temporal convolutional neural network architecture. Expert Syst Appl 191:116287

  167. Ullah A, Muhammad K, Ding W, Palade V, Haq IU, Baik SW (2021) Efficient activity recognition using lightweight CNN and DS-GRU network for surveillance applications. Appl Soft Comput 103:107102. https://doi.org/10.1016/J.ASOC.2021.107102


  168. Yadav SK, Luthra A, Tiwari K, Pandey HM, Akbar SA (2022) ARFDNet: An efficient activity recognition & fall detection system using latent feature pooling. Knowl Based Syst 239:107948. https://doi.org/10.1016/J.KNOSYS.2021.107948


  169. Basak H, Kundu R, Singh PK, Ijaz MF, Woźniak M, Sarkar R (2022) A union of deep learning and swarm-based optimization for 3D human action recognition. Sci Rep 12(1). https://doi.org/10.1038/s41598-022-09293-8

  170. Putra PU, Shima K, Shimatani K (2022) A deep neural network model for multi-view human activity recognition. PLoS One 17(1):e0262181

  171. Sánchez-Caballero A et al (2022) 3DFCNN: real-time action recognition using 3D deep neural networks with raw depth information. Multimed Tools Appl 81(17):24119–24143. https://doi.org/10.1007/S11042-022-12091-Z/TABLES/7


  172. Nasir IM, Raza M, Ulyah SM, Shah JH, Fitriyani NL, Syafrudin M (2023) ENGA: Elastic Net-Based Genetic Algorithm for human action recognition. Expert Syst Appl 227:120311. https://doi.org/10.1016/J.ESWA.2023.120311


  173. Nikpour B, Armanfard N (2023) Spatio-temporal hard attention learning for skeleton-based activity recognition. Pattern Recognit 139:109428. https://doi.org/10.1016/J.PATCOG.2023.109428


  174. Al-Faris M, Chiverton J, Ndzi D, Ahmed AI (2020) A review on computer vision-based methods for human action recognition. J Imaging 6(6):46


Funding

Not applicable.

Author information


Corresponding author

Correspondence to Rahul Kumar.

Ethics declarations

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Conflicts of interest/Competing interests

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions


About this article


Cite this article

Kumar, R., Kumar, S. A survey on intelligent human action recognition techniques. Multimed Tools Appl 83, 52653–52709 (2024). https://doi.org/10.1007/s11042-023-17529-6

