
AFpoint: adaptively fusing local and global features for point cloud

  • Published in: Multimedia Tools and Applications (2024)

Abstract

Due to the sparsity and irregularity of point clouds, accurately extracting internal structural details and quickly identifying the overall contour remain challenging tasks. Most current studies introduce sophisticated designs that capture either local or global features of a point cloud, and rarely combine the two. More importantly, such designs can easily increase the computational burden in the pursuit of performance. We propose a lightweight feature extractor, named AFpoint, that efficiently extracts and fuses the local and global features of a point cloud. Specifically, AFpoint is composed of two modules: the Local-Global Parallelized feature extraction module (LGP) and the Adaptive Feature Fusion module (AFF). The LGP module encodes local and global features in parallel, using point-wise convolution and a relative attention mechanism, respectively, so that it extracts fine-grained structure and captures global relationships at the same time. The AFF module adaptively selects and integrates the local and global features by estimating attention maps over the encoded features, helping the model focus autonomously on important regions. Note that AFpoint is a plug-and-play, universal module. We use AFpoint to construct point cloud classification and segmentation networks, which greatly improves the accuracy and robustness of the baseline model while reducing the parameter count by nearly half. Experiments on the widely adopted ModelNet40 and ScanObjectNN classification datasets demonstrate the state-of-the-art performance of our networks, and experiments on the ShapeNetPart part-segmentation dataset also show good results.
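The paper's actual layer definitions are behind the paywall, but the two ideas the abstract describes — a parallel local (point-wise convolution) and global (attention) branch, followed by adaptive gated fusion — can be sketched in a few lines. The following is a minimal NumPy illustration under stated assumptions, not the authors' implementation: all weight matrices, the feature width `d`, and the sigmoid-gate form of the fusion are hypothetical stand-ins chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lgp_aff(points, d=32):
    """Sketch of parallel local/global encoding with adaptive fusion.

    points: (N, C) array of per-point input features (e.g. xyz, so C=3).
    Returns an (N, d) fused feature map.
    """
    n, c = points.shape

    # Local branch: a point-wise (1x1) convolution is just a shared
    # linear map applied to every point independently.
    w_local = rng.standard_normal((c, d)) * 0.1
    f_local = np.maximum(points @ w_local, 0.0)           # (N, d), ReLU

    # Global branch: single-head self-attention over all points,
    # standing in for the paper's relative attention mechanism.
    wq, wk, wv = (rng.standard_normal((c, d)) * 0.1 for _ in range(3))
    q, k, v = points @ wq, points @ wk, points @ wv
    attn = softmax(q @ k.T / np.sqrt(d), axis=-1)         # (N, N)
    f_global = attn @ v                                   # (N, d)

    # AFF-style fusion: estimate a per-point, per-channel attention map
    # from both encodings and blend them adaptively.
    w_gate = rng.standard_normal((2 * d, d)) * 0.1
    gate = sigmoid(np.concatenate([f_local, f_global], axis=-1) @ w_gate)
    return gate * f_local + (1.0 - gate) * f_global       # (N, d)
```

The gate lets each point weight fine-grained local structure against global context channel by channel, which is the "autonomously focus on important regions" behavior the abstract attributes to AFF.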


(Figures 1–10 appear in the full text; captions are not included in this preview.)



Corresponding author

Correspondence to Guangxi Li.

Ethics declarations

Conflict of interest

The authors have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, G., Liu, C., Gao, X. et al. AFpoint: adaptively fusing local and global features for point cloud. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18658-2

