Real-time 3D motion capture by monocular vision and virtual rendering

  • Original Paper
  • Machine Vision and Applications

Abstract

Networked 3D virtual environments allow multiple users to interact over the Internet by means of avatars and to experience a sense of virtual telepresence. However, avatar control may be tedious. Motion capture systems based on 3D sensors have reached the consumer market, but webcams remain cheaper and more widespread. This work aims at animating a user’s avatar by real-time motion capture using a personal computer and a plain webcam. Following a classical model-based approach, we register a 3D articulated upper-body model onto video sequences and propose a number of heuristics to accelerate particle filtering while robustly tracking user motion. Describing the body pose by the 3D positions of the wrists rather than by joint angles allows efficient handling of depth ambiguities in probabilistic tracking. We demonstrate experimentally the robustness of our real-time monocular 3D body tracking, even under partial occlusions and motion in the depth direction.
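
For readers unfamiliar with particle filtering, the following is a minimal sketch of one generic resample / predict / weight cycle. It is not the authors' tracker: the state is reduced to a single hypothetical 3D wrist position, the observation model is a toy orthographic projection that only sees x and y, and every name and parameter (`particle_filter_step`, `motion_std`, `obs_std`) is an assumption made for illustration. Because the toy observation carries no depth information, the z coordinate stays diffuse, which loosely illustrates the depth ambiguity of monocular vision mentioned in the abstract.

```python
import math
import random


def particle_filter_step(particles, weights, observation, rng,
                         motion_std=0.05, obs_std=0.1):
    """One cycle of a generic bootstrap particle filter (illustrative only).

    particles   -- list of (x, y, z) hypothetical wrist positions
    weights     -- normalized weights from the previous cycle
    observation -- (u, v) noisy 2D measurement; this toy model assumes an
                   orthographic camera, so u ~ x and v ~ y (an assumption)
    rng         -- a random.Random instance, for reproducibility
    """
    n = len(particles)
    # Resample: draw particles in proportion to their previous weights.
    particles = rng.choices(particles, weights=weights, k=n)
    # Predict: diffuse each particle with Gaussian noise (random-walk model).
    particles = [tuple(c + rng.gauss(0.0, motion_std) for c in p)
                 for p in particles]
    # Weight: Gaussian likelihood of the 2D observation given each particle.
    new_weights = []
    for x, y, _z in particles:
        d2 = (x - observation[0]) ** 2 + (y - observation[1]) ** 2
        new_weights.append(math.exp(-0.5 * d2 / obs_std ** 2))
    total = sum(new_weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in new_weights]
    return particles, new_weights


def estimate(particles, weights):
    """Weighted mean of the particle set, one value per coordinate."""
    return tuple(sum(w * p[i] for p, w in zip(particles, weights))
                 for i in range(3))


if __name__ == "__main__":
    rng = random.Random(0)
    n = 500
    particles = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
                 for _ in range(n)]
    weights = [1.0 / n] * n
    for _ in range(20):
        particles, weights = particle_filter_step(
            particles, weights, (0.3, -0.2), rng)
    print(estimate(particles, weights))
```

In this sketch the x and y estimates settle near the observed values while z is left unconstrained; a real monocular tracker would instead compare a rendered articulated model against image features and use a perspective camera model.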

Author information

Corresponding author

Correspondence to David Antonio Gómez Jáuregui.

About this article

Cite this article

Jáuregui, D.A.G., Horain, P. Real-time 3D motion capture by monocular vision and virtual rendering. Machine Vision and Applications 28, 839–858 (2017). https://doi.org/10.1007/s00138-017-0861-3

  • DOI: https://doi.org/10.1007/s00138-017-0861-3
