Real-time 3D motion capture by monocular vision and virtual rendering

Jáuregui, David Antonio Gómez; Horain, Patrick

doi:10.1007/s00138-017-0861-3

Original Paper
Published: 03 August 2017

Machine Vision and Applications, Volume 28, pages 839–858 (2017)

Abstract

Networked 3D virtual environments allow multiple users to interact over the Internet by means of avatars and to get some feeling of a virtual telepresence. However, avatar control may be tedious. Motion capture systems based on 3D sensors have reached the consumer market, but webcams remain more widespread and cheaper. This work aims at animating a user’s avatar by real-time motion capture using a personal computer and a plain webcam. In a classical model-based approach, we register a 3D articulated upper-body model onto video sequences and propose a number of heuristics to accelerate particle filtering while robustly tracking user motion. Describing the body pose using the 3D positions of the wrists rather than joint angles allows efficient handling of depth ambiguities for probabilistic tracking. We demonstrate experimentally the robustness of our 3D body tracking by real-time monocular vision, even in the case of partial occlusions and motion in the depth direction.

Author information

David Antonio Gómez Jáuregui
Present address: ESTIA, 64210, Bidart, France

Authors and Affiliations

Institut Mines-Télécom, Télécom SudParis, 9 rue Charles Fourier, 91011, Evry Cedex, France
David Antonio Gómez Jáuregui & Patrick Horain

Corresponding author

Correspondence to David Antonio Gómez Jáuregui.

About this article

Cite this article

Jáuregui, D.A.G., Horain, P. Real-time 3D motion capture by monocular vision and virtual rendering. Machine Vision and Applications 28, 839–858 (2017). https://doi.org/10.1007/s00138-017-0861-3

Received: 01 December 2015
Revised: 24 June 2017
Accepted: 28 June 2017
Published: 03 August 2017
Issue Date: November 2017
DOI: https://doi.org/10.1007/s00138-017-0861-3
