Multi-person pose estimation based on graph grouping optimization

Multimedia Tools and Applications

Abstract

Multi-person pose estimation has become an increasingly popular topic with advances in computer vision and human-machine interaction tasks, and it can further improve the understanding of human poses and activities. Current mainstream multi-person pose estimation methods are generally divided into two categories: top-down and bottom-up. Top-down methods achieve better accuracy by reducing the problem to single-person pose estimation, but this strategy greatly increases the time complexity as a trade-off. Bottom-up methods directly locate all the keypoints in the image, which is potentially more effective and can run in real time. However, most current bottom-up methods separate the detection and grouping of keypoints into two independent steps, which hinders both the overall performance and the computational efficiency of the algorithms. To address this issue, our study proposes an end-to-end bottom-up framework for multi-person pose estimation. Using HRNet as the backbone, we add a deconvolution module to obtain high-resolution feature maps in the keypoint proposal stage. A graph neural network is used in the grouping stage and is integrated into the backbone so that the whole framework can be trained end to end. With the keypoint candidates as nodes, two discriminators supervise the grouping process. Lastly, a graph-based pose optimization algorithm refines the results. Experiments on the COCO and CrowdPose datasets show that our method achieves better accuracy and greatly reduces the computation time.
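To make the idea of grouping keypoint candidates as graph nodes concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes a hypothetical distance-based edge score (`toy_affinity`) in place of the learned graph neural network and discriminators described in the abstract, and it groups candidates by taking connected components over edges whose score passes a threshold. All names, the scale, and the threshold are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
import math


@dataclass(frozen=True)
class Keypoint:
    joint: str  # joint type, e.g. "nose"
    x: float
    y: float


def toy_affinity(a: Keypoint, b: Keypoint, scale: float = 50.0) -> float:
    """Hypothetical stand-in for a learned edge score (assumption).

    Returns a value in [0, 1] that decays with pixel distance.
    """
    if a.joint == b.joint:
        # Two candidates of the same joint type are never linked directly.
        return 0.0
    return math.exp(-math.hypot(a.x - b.x, a.y - b.y) / scale)


def group_keypoints(candidates, affinity=toy_affinity, threshold=0.5):
    """Group candidates into connected components of the thresholded graph."""
    parent = list(range(len(candidates)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each candidate is a node; an edge exists when the score passes the threshold.
    for i, j in combinations(range(len(candidates)), 2):
        if affinity(candidates[i], candidates[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, kp in enumerate(candidates):
        groups.setdefault(find(i), []).append(kp)
    return list(groups.values())


if __name__ == "__main__":
    # Two well-separated people, two joints each (made-up coordinates).
    candidates = [
        Keypoint("nose", 10, 10),
        Keypoint("left_shoulder", 15, 30),
        Keypoint("nose", 200, 10),
        Keypoint("left_shoulder", 205, 30),
    ]
    for person in group_keypoints(candidates):
        print([kp.joint for kp in person])
```

Connected components are the simplest possible grouping rule and can still merge two people through a chain of edges in crowded scenes; that failure mode is the kind of case that learned edge scores and a later pose refinement step are meant to handle.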

Author information

Corresponding author

Correspondence to Yingsong Hu.

Ethics declarations

Conflict of interests

The authors declare no conflicts of interest in this work.

About this article

Cite this article

Zeng, Q., Hu, Y., Li, D. et al. Multi-person pose estimation based on graph grouping optimization. Multimed Tools Appl 82, 7039–7053 (2023). https://doi.org/10.1007/s11042-022-13445-3

  • DOI: https://doi.org/10.1007/s11042-022-13445-3
