Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13693)

Included in the conference series: Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Semantic analyses of object point clouds are largely driven by the release of benchmark datasets, including synthetic ones whose instances are sampled from object CAD models. However, learning from synthetic data may not generalize to practical scenarios, where point clouds are typically incomplete, non-uniformly distributed, and noisy. Such a Simulation-to-Reality (Sim2Real) domain gap can be mitigated by domain adaptation algorithms; we argue, however, that generating synthetic point clouds through more physically realistic rendering is a powerful alternative, as it captures the systematic, non-uniform noise patterns of real sensors. To this end, we propose an integrated scheme consisting of (1) physically realistic synthesis of object point clouds, obtained by rendering stereo images of speckle patterns projected onto CAD models, and (2) a novel quasi-balanced self-training that produces a more balanced data distribution via sparsity-driven selection of pseudo-labeled samples for long-tailed classes. Experimental results verify the effectiveness of our method and of both of its modules for unsupervised domain adaptation on point cloud classification, achieving state-of-the-art performance. Source code and the SpeckleNet synthetic dataset are available at https://github.com/Gorilla-Lab-SCUT/QS3.
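To make the synthesis step concrete, here is a minimal sketch (not the authors' implementation) of how a rendered speckle stereo pair can be turned into a noisy object point cloud: block-match the two views for disparity, then back-project valid pixels through the virtual camera model. The intrinsics `fx`, `fy`, `cx`, `cy`, the `baseline`, and the choice of OpenCV's SGBM matcher are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
import cv2  # opencv-python

def stereo_to_pointcloud(left, right, fx, fy, cx, cy, baseline):
    """Hypothetical sketch: noisy point cloud from a rendered speckle stereo pair."""
    # Dense disparity on the speckle-textured pair; SGBM returns
    # fixed-point values scaled by 16.
    matcher = cv2.StereoSGBM_create(minDisparity=0,
                                    numDisparities=128,
                                    blockSize=7)
    disp = matcher.compute(left, right).astype(np.float32) / 16.0
    valid = disp > 0                      # drop unmatched pixels
    v, u = np.nonzero(valid)
    z = fx * baseline / disp[valid]       # depth from disparity
    x = (u - cx) * z / fx                 # pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)    # (M, 3) partial, noisy point cloud
```

Similarly, a hedged sketch of the quasi-balanced selection idea: instead of one global confidence threshold, each pseudo-class keeps a fraction of its most confident target predictions, and sparser (long-tailed) pseudo-classes keep a larger fraction. The quota schedule below (`base_ratio`, `alpha`) is an assumed stand-in for the paper's exact rule.

```python
import numpy as np

def quasi_balanced_selection(probs, base_ratio=0.2, alpha=0.5):
    """Select pseudo-labeled target samples with sparsity-driven per-class quotas.

    probs: (N, C) softmax outputs on unlabeled target samples.
    Returns indices of selected samples and their pseudo labels.
    """
    pseudo = probs.argmax(axis=1)                       # hard pseudo labels
    conf = probs.max(axis=1)                            # prediction confidence
    n_cls = probs.shape[1]
    counts = np.bincount(pseudo, minlength=n_cls).astype(float)
    # Rarer pseudo-classes get a higher keep ratio (sparsity-driven quota).
    sparsity = 1.0 - counts / max(counts.max(), 1.0)
    ratios = np.clip(base_ratio + alpha * sparsity, 0.0, 1.0)
    selected = []
    for k in range(n_cls):
        idx = np.where(pseudo == k)[0]
        if idx.size == 0:
            continue
        keep = max(1, int(round(ratios[k] * idx.size)))
        selected.extend(idx[np.argsort(-conf[idx])[:keep]].tolist())
    sel = np.array(sorted(selected))
    return sel, pseudo[sel]
```

In a self-training loop one would retrain the classifier on the selected (sample, pseudo-label) pairs together with the labeled synthetic data, then re-estimate pseudo labels and repeat.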

Y. Chen and Z. Wang contributed equally.



Acknowledgements

This work is supported in part by the National Natural Science Foundation of China (Grant No. 61902131), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (Grant No. 2017ZT07X183), and the Guangdong Provincial Key Laboratory of Human Digital Twin (Grant No. 2022B1212010004).

Author information

Correspondence to Ke Chen or Kui Jia.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 7351 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Chen, Y., Wang, Z., Zou, L., Chen, K., Jia, K. (2022). Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13693. Springer, Cham. https://doi.org/10.1007/978-3-031-19827-4_42


  • DOI: https://doi.org/10.1007/978-3-031-19827-4_42


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19826-7

  • Online ISBN: 978-3-031-19827-4

  • eBook Packages: Computer Science, Computer Science (R0)
