
MSG-Voxel-GAN: multi-scale gradient voxel GAN for 3D object generation

  • Part of the collection 1233: Robust Enhancement, Understanding and Assessment of Low-quality Multimedia Data
  • Published in Multimedia Tools and Applications

Abstract

The Generative Adversarial Network (GAN) has attracted significant attention since its introduction and has been applied widely in the image domain, but far less work exists in three dimensions (3D). Most recent research on 3D generation has focused on processing point clouds directly; voxel-based methods for 3D object generation were introduced early on but have seen few follow-up studies, and current methods generate 3D objects of subpar quality. To improve the quality of generated 3D objects, the Multi-Scale Gradient Voxel GAN (MSG-Voxel-GAN) is proposed. Voxel-based methods have achieved promising results in 3D object detection owing to their fast computation and accurate feature description. Therefore, in this paper, we propose a 3D object classification method based on Voxel R-CNN and incorporate it into the discriminator of a GAN to generate 3D objects, and we adopt the network architecture of the Multi-Scale Gradient GAN (MSG-GAN) for stable training. Experimental results show that the voxel-based feature extraction method accurately describes the features of 3D objects and yields precise classification. The training process of the proposed method is stable, and the quality of the generated 3D objects clearly exceeds that of other methods in both subjective visual comparisons and objective evaluation metrics. The method can facilitate the development of GAN-based 3D generative techniques.
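To make the multi-scale gradient idea concrete, the sketch below shows a minimal PyTorch generator/discriminator pair in which every intermediate generator resolution emits a voxel grid and the discriminator re-injects the matching grid at each downsampling stage, so gradients flow back to all scales at once. The resolutions, channel widths, and module names are illustrative assumptions; the sketch deliberately omits the Voxel R-CNN-based feature extraction and does not reproduce the paper's exact architecture.

# Minimal sketch of the MSG-GAN idea applied to voxel grids (PyTorch).
# Resolutions, channel widths, and names are illustrative assumptions,
# not the paper's exact architecture.
import torch
import torch.nn as nn


class VoxelGenerator(nn.Module):
    """Generator that emits occupancy grids at 8^3, 16^3 and 32^3."""

    def __init__(self, z_dim=128):
        super().__init__()
        self.stem = nn.Sequential(              # z -> 4^3 feature volume
            nn.ConvTranspose3d(z_dim, 256, 4), nn.BatchNorm3d(256), nn.ReLU(True))
        self.up1 = self._block(256, 128)        # 4^3  -> 8^3
        self.up2 = self._block(128, 64)         # 8^3  -> 16^3
        self.up3 = self._block(64, 32)          # 16^3 -> 32^3
        # 1x1x1 heads that turn each intermediate volume into a voxel grid
        self.to_vox = nn.ModuleList(
            [nn.Conv3d(c, 1, 1) for c in (128, 64, 32)])

    @staticmethod
    def _block(c_in, c_out):
        return nn.Sequential(
            nn.ConvTranspose3d(c_in, c_out, 4, stride=2, padding=1),
            nn.BatchNorm3d(c_out), nn.ReLU(True))

    def forward(self, z):
        h = self.stem(z.view(z.size(0), -1, 1, 1, 1))
        feats = []
        for up in (self.up1, self.up2, self.up3):
            h = up(h)
            feats.append(h)
        # one occupancy grid per scale, all handed to the discriminator
        return [torch.sigmoid(head(f)) for head, f in zip(self.to_vox, feats)]


class MultiScaleVoxelDiscriminator(nn.Module):
    """Discriminator that receives the matching-resolution grid at each stage,
    so gradients reach every generator scale (the multi-scale gradient trick)."""

    def __init__(self):
        super().__init__()
        self.stage3 = self._block(1, 32)         # 32^3 -> 16^3
        self.stage2 = self._block(32 + 1, 64)    # 16^3 ->  8^3  (+ injected grid)
        self.stage1 = self._block(64 + 1, 128)   #  8^3 ->  4^3
        self.head = nn.Conv3d(128, 1, 4)         # real/fake score

    @staticmethod
    def _block(c_in, c_out):
        return nn.Sequential(
            nn.Conv3d(c_in, c_out, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, True))

    def forward(self, vox_8, vox_16, vox_32):
        h = self.stage3(vox_32)
        h = self.stage2(torch.cat([h, vox_16], dim=1))
        h = self.stage1(torch.cat([h, vox_8], dim=1))
        return self.head(h).view(-1)


if __name__ == "__main__":
    g = VoxelGenerator()
    d = MultiScaleVoxelDiscriminator()
    grids = g(torch.randn(2, 128))              # [2x1x8^3, 2x1x16^3, 2x1x32^3]
    score = d(*grids)                           # shape: (2,)
    print([v.shape for v in grids], score.shape)

In a training loop, real samples would simply be downsampled copies of the same 32^3 grid fed to the same three inputs, which is what lets every generator scale receive a useful gradient from the start of training.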


Availability of data and materials

The ModelNet40 dataset was provided by Princeton University's Vision & Robotics Labs and is available at https://modelnet.cs.princeton.edu/.
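ModelNet40 ships as OFF meshes, so producing voxel training data requires a rasterization step. The sketch below is a minimal, assumption-laden way to do that: it samples mesh vertices into a 32^3 occupancy grid. The example file path, the grid resolution, and the vertex-only sampling are illustrative choices; the paper does not state its exact voxelization pipeline.

# Minimal sketch: turn a ModelNet40 .off mesh into a 32^3 occupancy grid by
# sampling its vertices. The path and resolution below are assumptions for
# illustration only.
import numpy as np


def read_off_vertices(path):
    """Read vertex coordinates from an OFF file (the ModelNet40 mesh format)."""
    with open(path) as f:
        header = f.readline().strip()
        if header != "OFF":                      # some files fuse counts onto line 1
            counts = header[3:].split()
        else:
            counts = f.readline().split()
        n_vertices = int(counts[0])
        return np.array([list(map(float, f.readline().split()))
                         for _ in range(n_vertices)])


def voxelize(vertices, resolution=32):
    """Scale vertices into a unit cube and mark the cells they fall into."""
    v = vertices - vertices.min(axis=0)
    v = v / v.max()                              # longest side -> [0, 1]
    idx = np.clip((v * (resolution - 1)).astype(int), 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid


# Hypothetical usage (the file name is an example, not a fixed dataset path):
# grid = voxelize(read_off_vertices("ModelNet40/chair/train/chair_0001.off"))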


Funding

This research was funded by the 13th Five-Year Plan Funding of China (grants 41419029102 and 41419020107) and the 14th Five-Year Plan Funding of China (grants 50916040401 and 514010503-201).

Author information


Contributions

Conceptualization, J.L. and B.W.; methodology, B.W.; software, F.L.; validation, B.W. and F.L.; formal analysis, J.L.; investigation, B.W.; resources, F.L.; data curation, F.L.; writing—original draft preparation, B.W.; writing—review and editing, J.L.; visualization, F.L.; supervision, J.L. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Jinhui Lan.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Bingxu Wang and Jinhui Lan are co-first authors with equal contributions.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, B., Lan, J. & Li, F. MSG-Voxel-GAN: multi-scale gradient voxel GAN for 3D object generation. Multimed Tools Appl (2023). https://doi.org/10.1007/s11042-023-17116-9
