Abstract
The Generative Adversarial Network (GAN) has attracted significant attention since its introduction and has been widely applied in the image domain, but comparatively little research has addressed generation in three dimensions (3D). Moreover, most recent work on 3D generation processes point clouds directly; voxel-based methods for 3D object generation were introduced early on but have seen few follow-up studies, and current methods generate 3D objects of subpar quality. To improve the quality of generated 3D objects, the Multi-Scale Gradient Voxel GAN (MSG-Voxel-GAN) is proposed. Voxel-based methods have achieved promising results in 3D object detection owing to their fast computation and accurate feature description. In this paper, we therefore propose a 3D object classification method based on Voxel R-CNN and incorporate it into the discriminator of a GAN to generate 3D objects. For stable training, we adopt the network architecture of the Multi-Scale Gradient GAN (MSG-GAN). Experimental results show that the voxel-based feature extraction method accurately describes the features of 3D objects, leading to precise classification. The training process of the proposed method is stable, and the quality of the generated 3D objects significantly exceeds that of other methods in both subjective visual assessment and objective evaluation metrics. This method can facilitate the development of GAN-based 3D generative techniques.
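The core mechanism the abstract describes, a generator that emits voxel grids at multiple resolutions and a discriminator that consumes each grid at the matching scale so gradients reach every generator layer (the MSG-GAN scheme, here with 3D convolutions standing in for voxel feature extraction), can be sketched as follows. This is an illustrative PyTorch sketch, not the paper's architecture: the class names, layer sizes, and channel counts are our own assumptions.

```python
# Hedged sketch of the MSG-GAN idea applied to voxels: the generator taps a
# 1-channel occupancy grid at each intermediate resolution, and the
# discriminator concatenates each tapped grid with its downsampled features.
import torch
import torch.nn as nn


class MultiScaleVoxelGenerator(nn.Module):
    def __init__(self, z_dim=128, ch=64):
        super().__init__()
        self.ch = ch
        self.fc = nn.Linear(z_dim, ch * 4 * 4 * 4)        # latent -> 4^3 feature grid
        self.up1 = nn.Sequential(                          # 4^3 -> 8^3
            nn.ConvTranspose3d(ch, ch // 2, 4, stride=2, padding=1), nn.ReLU())
        self.up2 = nn.Sequential(                          # 8^3 -> 16^3
            nn.ConvTranspose3d(ch // 2, ch // 4, 4, stride=2, padding=1), nn.ReLU())
        # multi-scale output taps: one occupancy channel per resolution
        self.to_vox_8 = nn.Conv3d(ch // 2, 1, 1)
        self.to_vox_16 = nn.Conv3d(ch // 4, 1, 1)

    def forward(self, z):
        h = self.fc(z).view(-1, self.ch, 4, 4, 4)
        h8 = self.up1(h)
        h16 = self.up2(h8)
        return torch.sigmoid(self.to_vox_8(h8)), torch.sigmoid(self.to_vox_16(h16))


class MultiScaleVoxelDiscriminator(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.from_16 = nn.Sequential(                      # 16^3 -> 8^3
            nn.Conv3d(1, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        # +1 channel: the generator's 8^3 voxel tap joins the feature stream here,
        # which is what routes gradients to the generator's intermediate layers
        self.from_8 = nn.Sequential(                       # 8^3 -> 4^3
            nn.Conv3d(ch + 1, ch * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.head = nn.Linear(ch * 2 * 4 * 4 * 4, 1)

    def forward(self, vox8, vox16):
        h = self.from_16(vox16)
        h = self.from_8(torch.cat([h, vox8], dim=1))
        return self.head(h.flatten(1))                     # real/fake logit


G = MultiScaleVoxelGenerator()
D = MultiScaleVoxelDiscriminator()
v8, v16 = G(torch.randn(2, 128))
logit = D(v8, v16)
```

At training time, real samples would be downsampled to provide the coarser-scale inputs, mirroring how the generator's taps feed the discriminator.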
Availability of data and materials
The ModelNet40 dataset was provided by Princeton University's Vision & Robotics Labs and is available at https://modelnet.cs.princeton.edu/.
References
Goodfellow I et al (2014) Generative adversarial nets. Proc Adv Neural Inf Process Syst 2672–2680
Liao K, Lin C, Zhao Y, Gabbouj M (2020) DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time. IEEE Transactions on Circuits and Systems for Video Technology 30(3):725–733
Yang S, Lin C, Liao K, Zhang C, Zhao Y (2021) Progressively Complementary Network for Fisheye Image Rectification Using Appearance Flow. IEEE/CVF Conf Comput Vis Pattern Recognit (CVPR), Nashville, TN, USA, 6344–6353
Chen B, Liu X, Zheng Y, Zhao G, Shi Y (2022) A robust GAN-generated face detection method based on dual-color spaces and an improved Xception. IEEE Trans Circuits Syst Video Technol 32(6):3527–3538
Chen Y, ** C, Li G, Li TH, Gao W (2023) Mitigating Label Noise in GANs via Enhanced Spectral Normalization. IEEE Trans Circuits Syst Video Technol, accepted January 2023
Wu J, Zhang C, Xue T et al (2016) Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling. arXiv:1610.07584
Chan ER, Lin CZ, Chan MA et al (2022) Efficient Geometry-aware 3D Generative Adversarial Networks. IEEE Conf Comput Vis Pattern Recognit 16102–16112
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein GAN. arXiv:1701.07875
Karras T, Aila T, Laine S et al (2017) Progressive Growing of GANs for Improved Quality, Stability, and Variation. arXiv:1710.10196
Karnewar A, Wang O (2019) MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks. arXiv:1903.06048
Karras T, Laine S, Aittala M et al (2020) Analyzing and Improving the Image Quality of StyleGAN. IEEE Conf Comput Vis Pattern Recognit 8107–8116
Zhou Y, Tuzel O (2018) VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection. IEEE Conf Comput Vis Pattern Recognit 4490–4499
Charles RQ, Su H, Kaichun M et al (2017) PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. IEEE Conf Comput Vis Pattern Recognit pp. 77–85
Yan Y, Mao Y, Li B et al (2018) SECOND: Sparsely Embedded Convolutional Detection. Sensors 18:3337–3353
Shi S, Guo C, Jiang ML et al (2020) PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection. IEEE Conf Comput Vis Pattern Recognit 10526–10535
Deng J, Shi S, Li P et al (2021) Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection. AAAI Conf Artif Intell 1201–1209
Radford A, Metz L, Chintala S (2015) Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv:1511.06434
Xie J, Xu Y, Zheng Z et al (2020) Generative PointNet: Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification. arXiv:2004.01301
Shu DW, Park SW, Kwon J (2019) 3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions. arXiv:1905.06292
Valsesia D, Fracastoro G, Magli E (2021) Learning Localized Representations of Point Clouds With Graph-Convolutional Generative Adversarial Networks. IEEE Trans Multimed 23:402–414
Ramasinghe S, Khan S, Barnes N et al (2020) Spectral-GANs for High-Resolution 3D Point-cloud Generation. in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems 1–14
Li CL, Zaheer M, Zhang Y et al (2018) Point Cloud GAN. arXiv:1810.05795
Nash C, Williams C (2017) The shape variational autoencoder: A deep generative model of part-segmented 3D objects. Comput Graph Forum 36(5):1–12
Lang AH, Vora S, Caesar H et al (2019) PointPillars: Fast Encoders for Object Detection From Point Clouds. IEEE Conf Comput Vis Pattern Recognit 12689–12697
Wu Z, Song S, Khosla A et al (2015) 3D ShapeNets: A Deep Representation for Volumetric Shape Modeling. IEEE Conf Comput Vis Pattern Recognit 1912–1920
Funding
This research was funded by the 13th Five-Year Plan of China under grants 41419029102 and 41419020107, and by the 14th Five-Year Plan of China under grants 50916040401 and 514010503-201.
Author information
Authors and Affiliations
Contributions
Conceptualization, J.L. and B.W.; methodology, B.W.; software, F.L.; validation, B.W. and F.L.; formal analysis, J.L.; investigation, B.W.; resources, F.L.; data curation, F.L.; writing—original draft preparation, B.W.; writing—review and editing, J.L.; visualization, F.L.; supervision, J.L. All authors have read and agreed to the published version of the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Bingxu Wang and Jinhui Lan are co-first authors with equal contributions.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, B., Lan, J. & Li, F. MSG-Voxel-GAN: multi-scale gradient voxel GAN for 3D object generation. Multimed Tools Appl (2023). https://doi.org/10.1007/s11042-023-17116-9
DOI: https://doi.org/10.1007/s11042-023-17116-9