Object-Compositional Neural Implicit Surfaces

Wu, Qianyi; Liu, **an; Chen, Yuedong; Li, Kejie; Zheng, Chuanxia; Cai, Jianfei; Zheng, Jianmin

doi:10.1007/978-3-031-19812-0_12

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13687))

Included in the following conference series:

European Conference on Computer Vision

2708 Accesses
13 Citations

Abstract

The neural implicit representation has shown its effectiveness in novel view synthesis and high-quality 3D reconstruction from multi-view images. However, most approaches focus on holistic scene representation yet ignore individual objects inside it, thus limiting potential downstream applications. In order to learn object-compositional representation, a few works incorporate the 2D semantic map as a cue in training to grasp the difference between objects. But they neglect the strong connections between object geometry and instance semantic information, which leads to inaccurate modeling of individual instance. This paper proposes a novel framework, ObjectSDF, to build an object-compositional neural implicit representation with high fidelity in 3D reconstruction and object representation. Observing the ambiguity of conventional volume rendering pipelines, we model the scene by combining the Signed Distance Functions (SDF) of individual object to exert explicit surface constraint. The key in distinguishing different instances is to revisit the strong association between an individual object’s SDF and semantic label. Particularly, we convert the semantic information to a function of object SDF and develop a unified and compact representation for scene and objects. Experimental results show the superiority of ObjectSDF framework in representing both the holistic object-compositional scene and the individual instances. Code can be found at https://qianyiwu.github.io/objectsdf/.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Springer+ Basic

EUR 32.99 /Month

Get 10 units per month
Download Article/Chapter or Ebook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Subscribe now

Chapter: EUR 29.95; Price includes VAT (Germany)

eBook: EUR 93.08; Price includes VAT (Germany)

Softcover Book: EUR 117.69; Price includes VAT (Germany)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

NIST: Learning Neural Implicit Surfaces and Textures for Multi-view Reconstruction

Neural Mesh-Based Graphics

References

Atzmon, M., Lipman, Y.: Sal: sign agnostic learning of shapes from raw data. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020
Google Scholar
Chen, Y., Wu, Q., Zheng, C., Cham, T.J., Cai, J.: Sem2nerf: converting single-view semantic masks to neural radiance fields. ar**v preprint ar**v:2203.10821 (2022)
Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T., Nießner, M.: ScanNet: Richly-annotated 3d reconstructions of indoor scenes. In: CVPR (2017)
Google Scholar
Deng, K., Liu, A., Zhu, J.Y., Ramanan, D.: Depth-supervised nerf: Fewer views and faster training for free. ar**v preprint ar**v:2107.02791 (2021)
Deng, Y., Yang, J., **ang, J., Tong, X.: Gram: Generative radiance manifolds for 3d-aware image generation. ar**v preprint ar**v:2112.08867 (2021)
Gropp, A., Yariv, L., Haim, N., Atzmon, M., Lipman, Y.: Implicit geometric regularization for learning shapes. ar**v preprint ar**v:2002.10099 (2020)
Guo, M., Fathi, A., Wu, J., Funkhouser, T.: Object-centric neural scene rendering. ar**v preprint ar**v:2012.08503 (2020)
Hassan, M., Choutas, V., Tzionas, D., Black, M.J.: Resolving 3d human pose ambiguities with 3d scene constraints. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2282–2292 (2019)
Google Scholar
Kajiya, J.T., Von Herzen, B.P.: Ray tracing volume densities. ACM SIGGRAPH Comput. Graph. 18(3), 165–174 (1984)
Article Google Scholar
Kohli, A., Sitzmann, V., Wetzstein, G.: Semantic implicit neural scene representations with semi-supervised training. In: International Conference on 3D Vision (3DV) (2020)
Google Scholar
Li, K., Rezatofighi, H., Reid, I.: Moltr: multiple object localization, tracking and reconstruction from monocular RGB videos. IEEE Robot. Autom. Lett. 6(2), 3341–3348 (2021)
Article Google Scholar
Liu, L., Gu, J., Lin, K.Z., Chua, T.S., Theobalt, C.: Neural sparse voxel fields. ar**v preprint ar**v:2007.11571 (2020)
Liu, X., Xu, Y., Wu, Q., Zhou, H., Wu, W., Zhou, B.: Semantic-aware implicit neural audio-driven video portrait generation. ar**v preprint ar**v:2201.07786 (2022)
Luan, F., Zhao, S., Bala, K., Dong, Z.: Unified shape and SVBRDF recovery using differentiable monte carlo rendering. In: Computer Graphics Forum, vol. 40, pp. 101–113. Wiley Online Library (2021)
Google Scholar
Max, N.: Optical models for direct volume rendering. IEEE Trans. Vis. Comput. Graph. 1(2), 99–108 (1995)
Article Google Scholar
McCormac, J., Handa, A., Davison, A., Leutenegger, S.: SemanticFusion: dense 3d semantic map** with convolutional neural networks. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 4628–4635. IEEE (2017)
Google Scholar
Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: learning 3d reconstruction in function space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4460–4470 (2019)
Google Scholar
Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 405–421. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_24
Chapter Google Scholar
Nguyen-Phuoc, T.H., Richardt, C., Mai, L., Yang, Y., Mitra, N.: BlockGAN: learning 3d object-aware scene representations from unlabelled images. Adv. Neural Inf. Process. Syst. 33, 6767–6778 (2020)
Google Scholar
Nie, Y., Han, X., Guo, S., Zheng, Y., Chang, J., Zhang, J.J.: Total3DUnderstanding: joint layout, object pose and mesh reconstruction for indoor scenes from a single image. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 55–64 (2020)
Google Scholar
Niemeyer, M., Geiger, A.: Giraffe: representing scenes as compositional generative neural feature fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11453–11464 (2021)
Google Scholar
Oechsle, M., Peng, S., Geiger, A.: UNISURF: unifying neural implicit surfaces and radiance fields for multi-view reconstruction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5589–5599 (2021)
Google Scholar
Ost, J., Mannan, F., Thuerey, N., Knodt, J., Heide, F.: Neural scene graphs for dynamic scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2856–2865 (2021)
Google Scholar
Park, J.J., Florence, P., Straub, J., Newcombe, R., Lovegrove, S.: DeepSDF: learning continuous signed distance functions for shape representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 165–174 (2019)
Google Scholar
Park, K., et al.: Nerfies: deformable neural radiance fields. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5865–5874 (2021)
Google Scholar
Prajwal, K., Mukhopadhyay, R., Namboodiri, V.P., Jawahar, C.: A lip sync expert is all you need for speech to lip generation in the wild. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 484–492 (2020)
Google Scholar
Pumarola, A., Corona, E., Pons-Moll, G., Moreno-Noguer, F.: D-nerf: neural radiance fields for dynamic scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10318–10327 (2021)
Google Scholar
Rebain, D., Jiang, W., Yazdani, S., Li, K., Yi, K.M., Tagliasacchi, A.: DeRF: decomposed radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14153–14161 (2021)
Google Scholar
Reiser, C., Peng, S., Liao, Y., Geiger, A.: KiloNeRF: speeding up neural radiance fields with thousands of tiny MLPs. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 14335–14345 (2021)
Google Scholar
Rosinol, A., Gupta, A., Abate, M., Shi, J., Carlone, L.: 3d dynamic scene graphs: actionable spatial perception with places, objects, and humans. ar**v preprint ar**v:2002.06289 (2020)
Sitzmann, V., Zollhöfer, M., Wetzstein, G.: Scene representation networks: continuous 3d-structure-aware neural scene representations. ar**v preprint ar**v:1906.01618 (2019)
Verbin, D., Hedman, P., Mildenhall, B., Zickler, T., Barron, J.T., Srinivasan, P.P.: Ref-NeRF: structured view-dependent appearance for neural radiance fields. In: CVPR (2022)
Google Scholar
Wang, P., Liu, L., Liu, Y., Theobalt, C., Komura, T., Wang, W.: NeuS: learning neural implicit surfaces by volume rendering for multi-view reconstruction. In: NeurIPS (2021)
Google Scholar
Yang, B., et al.: Learning object-compositional neural radiance field for editable scene rendering. In: International Conference on Computer Vision (ICCV), October 2021
Google Scholar
Yariv, L., Gu, J., Kasten, Y., Lipman, Y.: Volume rendering of neural implicit surfaces. ar**v preprint ar**v:2106.12052 (2021)
Yariv, L., et al.: Multiview neural surface reconstruction by disentangling geometry and appearance. Adv. Neural Inf. Process. Syst. 33, 2492–2502 (2020)
Google Scholar
Yu, H.X., Guibas, L., Wu, J.: Unsupervised discovery of object radiance fields. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=rwE8SshAlxw
Zhang, K., Luan, F., Li, Z., Snavely, N.: IRON: Inverse rendering by optimizing neural SDFs and materials from photometric images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5565–5574 (2022)
Google Scholar
Zhang, K., Riegler, G., Snavely, N., Koltun, V.: Nerf++: analyzing and improving neural radiance fields. ar**v preprint ar**v:2010.07492 (2020)
Zhang, X., Srinivasan, P.P., Deng, B., Debevec, P., Freeman, W.T., Barron, J.T.: NeRFactor: neural factorization of shape and reflectance under an unknown illumination. ACM Trans. Graph (TOG) 40(6), 1–18 (2021)
Article Google Scholar
Zhi, S., Laidlow, T., Leutenegger, S., Davison, A.: In-place scene labelling and understanding with implicit scene representation. In: Proceedings of the International Conference on Computer Vision (ICCV) (2021)
Google Scholar

Download references

Acknowledgements

This research is partially supported by Monash FIT Start-up Grant and SenseTime Gift Fund.

Author information

Authors and Affiliations

Monash University, Melbourne, Australia
Qianyi Wu, Yuedong Chen, Chuanxia Zheng & Jianfei Cai
The Chinese University of Hong Kong, Hong Kong, China
**an Liu
University of Oxford, Oxford, England
Kejie Li
Nanyang Technological University, Singapore, Singapore
Jianmin Zheng

Authors

Qianyi Wu
View author publications
You can also search for this author in PubMed Google Scholar
**an Liu
View author publications
You can also search for this author in PubMed Google Scholar
Yuedong Chen
View author publications
You can also search for this author in PubMed Google Scholar
Kejie Li
View author publications
You can also search for this author in PubMed Google Scholar
Chuanxia Zheng
View author publications
You can also search for this author in PubMed Google Scholar
Jianfei Cai
View author publications
You can also search for this author in PubMed Google Scholar
Jianmin Zheng
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Qianyi Wu .

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 707 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wu, Q. et al. (2022). Object-Compositional Neural Implicit Surfaces. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13687. Springer, Cham. https://doi.org/10.1007/978-3-031-19812-0_12

Download citation

DOI: https://doi.org/10.1007/978-3-031-19812-0_12
Published: 30 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19811-3
Online ISBN: 978-3-031-19812-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Object-Compositional Neural Implicit Surfaces

Abstract

Access this chapter

Similar content being viewed by others

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

NIST: Learning Neural Implicit Surfaces and Textures for Multi-view Reconstruction

Neural Mesh-Based Graphics

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 707 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

Object-Compositional Neural Implicit Surfaces

Abstract

Access this chapter

Similar content being viewed by others

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

NIST: Learning Neural Implicit Surfaces and Textures for Multi-view Reconstruction

Neural Mesh-Based Graphics

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 707 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation