A Systematic Survey on Human Behavior Recognition Methods

Yuan, Meixue; Wei, Shouke; Zhao, Jindong; Sun, Ming

doi:10.1007/s42979-021-00932-x

Survey Article
Published: 23 October 2021

Volume 3, article number 6, (2022)

SN Computer Science

Meixue Yuan¹, Shouke Wei¹,² (ORCID: orcid.org/0000-0002-4665-5366), Jindong Zhao¹ & Ming Sun¹

Abstract

Human behavior is an essential component of social interaction, and identifying and analyzing human behaviors is of great significance in a variety of fields. Owing to the rapid development of computer vision and machine learning technology, intelligent machines have started to replace human beings in observing, perceiving, and analyzing the explosively growing volume of image and video data. Human behavior recognition based on computer vision and machine learning is one of these tasks, and it has become a particularly active research topic in many fields, such as intelligent monitoring, human–computer interaction, smart homes, virtual reality, and medical diagnosis. In this study, we systematically survey the popular methods, algorithms, models, and well-known action datasets used in human behavior analysis over the past two decades. In addition, we discuss the advantages and disadvantages of the methods and present promising directions for future research. The results of this survey reveal that the paradigms of human behavior analysis are shifting from traditional RGB to RGB-D, from deep learning to more intelligent and automated deep reinforcement learning, and from fixed camera devices to portable devices and channel state information (CSI). Paradigms based on automated deep reinforcement learning, portable devices, and CSI are likely to become hot topics for future research on human behavior analysis.

Acknowledgements

This work was supported by the Natural Science Foundation of Shandong Province, China (No. ZR2020MF148).

Author information

Authors and Affiliations

School of Computer and Control Engineering, Yantai University, Yantai, China
Meixue Yuan, Shouke Wei, Jindong Zhao & Ming Sun
Deepsim Intelligence Technology Co., 32633 Simon Ave, Abbotsford, BC V2T 0G9, Canada
Shouke Wei

Corresponding author

Correspondence to Shouke Wei.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yuan, M., Wei, S., Zhao, J. et al. A Systematic Survey on Human Behavior Recognition Methods. SN COMPUT. SCI. 3, 6 (2022). https://doi.org/10.1007/s42979-021-00932-x

Received: 11 July 2021
Accepted: 08 October 2021
Published: 23 October 2021
DOI: https://doi.org/10.1007/s42979-021-00932-x
