Spatiotemporal smoothing aggregation enhanced multi-scale residual deep graph convolutional networks for skeleton-based gait recognition

Applied Intelligence

Abstract

Gait recognition has broad application potential, notably because it can identify a person without physical contact. Skeleton-based recognition is preferred because self-occlusion and environmental factors degrade silhouette-based methods. To exploit the discriminative properties of both long-term and short-term temporal cues, we propose spatiotemporal smoothing aggregation enhanced multiscale residual deep graph convolutional networks. The method considers both long and short gait feature time series, enabling the learning of discriminative multiscale representations. In the baseline network, features at three scales are extracted sequentially, followed by a reverse process that extracts and fuses the multiscale features. This design significantly strengthens the ability of graph convolution to model the contextual knowledge of human poses. We also investigate multiscale gait feature aggregation, which significantly mitigates oversmoothing: a spatiotemporal smoothing aggregation module with an embedded attention mechanism hierarchically aggregates and enhances multiscale key joint features, alleviating oversmoothing in deep graph convolutional networks. The method was tested on the CASIA-B dataset from the Institute of Automation, Chinese Academy of Sciences, achieving an average accuracy of 78.2%, the second highest among skeleton-based gait recognition models at the time of writing, and attained rank-1 accuracies of 14.7 and 8.19 on the Gait Recognition in the Wild (GREW) and Gait3D datasets, respectively.
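The oversmoothing effect mentioned in the abstract can be illustrated with a small, generic sketch. It is not the module proposed in the paper, whose details are not given here. It assumes a hypothetical five-joint chain skeleton, a row-normalised adjacency matrix with self-loops, and a simple initial-residual connection weighted by an assumed coefficient `alpha`. It compares how far apart the joint features remain after many propagation layers with and without the residual term.

```python
# Generic illustration of oversmoothing in deep graph propagation.
# Assumptions (not from the paper): 5-joint chain graph, scalar joint features,
# row-normalised adjacency with self-loops, residual weight alpha = 0.2.

def row_normalized_adjacency(num_nodes, edges):
    """Return D^-1 (A + I) as a list of rows for an undirected graph."""
    adj = [[1.0 if i == j else 0.0 for j in range(num_nodes)]
           for i in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = 1.0
        adj[j][i] = 1.0
    return [[value / sum(row) for value in row] for row in adj]

def propagate(matrix, features):
    """One propagation step: each joint averages itself and its neighbours."""
    return [sum(w * x for w, x in zip(row, features)) for row in matrix]

def spread(features):
    """Gap between the largest and smallest joint feature."""
    return max(features) - min(features)

edges = [(0, 1), (1, 2), (2, 3), (3, 4)]   # hypothetical chain of joints
P = row_normalized_adjacency(5, edges)
h0 = [0.0, 1.0, 2.0, 3.0, 4.0]             # distinct initial joint features
alpha = 0.2                                # assumed residual weight
num_layers = 100

plain = list(h0)
residual = list(h0)
for _ in range(num_layers):
    # Plain stacking: features are averaged again at every layer.
    plain = propagate(P, plain)
    # Initial-residual variant: mix the input features back in at every layer.
    smoothed = propagate(P, residual)
    residual = [(1 - alpha) * s + alpha * x for s, x in zip(smoothed, h0)]

print("spread without residual:", spread(plain))
print("spread with residual:   ", spread(residual))
```

In this toy setting the spread of the plainly stacked features collapses toward zero, so the joints become indistinguishable, while the residual variant keeps the joints clearly separated. Residual and multiscale connections in deep graph convolutional networks are commonly motivated by this behaviour.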

Data availability and access

The research in this paper is based on three publicly available datasets: the CASIA-B Gait Database (http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp), Gait Recognition in the Wild (GREW) (https://www.grew-benchmark.org/), and the Gait3D dataset (https://gait3d.github.io/). Use of each dataset requires permission from the data owner.

Code availability

Due to legal considerations, we are unable to open-source the code for this study at this moment. Your understanding and support are greatly appreciated.

Funding

This study was funded by Shenzhen Startup Funding (No. QD2023014C) and the National Natural Science Foundation of China (No. 61906074).

Author information

Contributions

Guanghai Chen developed and implemented the algorithms and models used in the study and wrote the first draft. Xin Chen was responsible for the project design and overall plan of the study. Chengzhi Zheng was responsible for data organization. Junshu Wang was responsible for reviewing the manuscript and suggesting important changes to its content. Xinchao Liu was responsible for the visualization and graphing of the experimental results. Yuxing Han is the corresponding author of this study. The first and second authors contributed equally to this work and should be considered co-first authors.

Corresponding author

Correspondence to Yuxing Han.

Ethics declarations

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Ethical and informed consent for data used

Not applicable. The work in this paper has no ethical or moral implications such as human or animal experimentation. The work presented in this article is entirely original and has not been published in any other journal. This journal is the first and only journal to which the paper has been submitted. There are no violations of academic ethics. The right to use the data in this study has been approved by the data owner.

Consent to participate

All participants in this study were informed and consented to participate in the study.

Consent for publication

Participants in this study gave their consent for the results to be used for publication, presentation, or sharing.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, G., Chen, X., Zheng, C. et al. Spatiotemporal smoothing aggregation enhanced multi-scale residual deep graph convolutional networks for skeleton-based gait recognition. Appl Intell 54, 6154–6174 (2024). https://doi.org/10.1007/s10489-024-05422-0

  • DOI: https://doi.org/10.1007/s10489-024-05422-0
