Abstract
VR telepresence lets people interact in a virtual space, each represented by an avatar. Today most avatars are cartoon-like, but the technology will soon allow video-realistic ones. This paper moves in that direction, presenting Modular Codec Avatars (MCA), a method to generate hyper-realistic faces driven by the cameras in the VR headset. MCA extends traditional Codec Avatars (CA) by replacing the holistic model with a learned modular representation. Traditional person-specific CAs are learned from few training samples, and typically lack robustness as well as expressiveness when transferring facial expressions. MCA addresses these issues by learning a modulated adaptive blending of different facial components together with an exemplar-based latent alignment. We demonstrate that MCA achieves improved expressiveness and robustness with respect to CA on a variety of real-world datasets and practical scenarios. Finally, we showcase new applications in VR telepresence enabled by the proposed model.
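The core idea of the modular representation can be illustrated with a minimal NumPy sketch. This is not the paper's actual architecture: the module count, latent sizes, the linear "decoders", and the softmax blending are all placeholder assumptions, standing in for the learned per-module encoders/decoders and the modulated adaptive blending the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 facial modules (e.g. eyes, mouth), 16-D latent per module,
# each module decoding to a 32-D texture/geometry patch.
N_MODULES, LATENT_DIM, PATCH_DIM = 4, 16, 32

# Stand-in per-module "decoders": random linear maps from latent code to patch.
decoders = [rng.standard_normal((PATCH_DIM, LATENT_DIM)) for _ in range(N_MODULES)]

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def modular_decode(latents, blend_logits):
    """Blend per-module decoder outputs with adaptive (softmax-normalised) weights."""
    weights = softmax(blend_logits)                       # one scalar weight per module
    patches = [D @ z for D, z in zip(decoders, latents)]  # each module decodes its region
    return sum(w * p for w, p in zip(weights, patches))   # weighted combination

latents = [rng.standard_normal(LATENT_DIM) for _ in range(N_MODULES)]
blend_logits = rng.standard_normal(N_MODULES)
out = modular_decode(latents, blend_logits)
print(out.shape)  # (32,)
```

In the real system the blending weights and latent codes would be predicted by networks from the headset camera views; the point of the sketch is only that a modular model can re-weight independent facial components per frame, which a single holistic latent code cannot.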
Electronic supplementary material
Supplementary material 2 (mp4, 26829 KB)
© 2020 Springer Nature Switzerland AG
Cite this paper
Chu, H., Ma, S., De la Torre, F., Fidler, S., Sheikh, Y. (2020). Expressive Telepresence via Modular Codec Avatars. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol 12357. Springer, Cham. https://doi.org/10.1007/978-3-030-58610-2_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58609-6
Online ISBN: 978-3-030-58610-2
eBook Packages: Computer Science, Computer Science (R0)