Abstract
Owing to the sparseness and irregularity of point clouds, accurately extracting internal structural details while quickly identifying the overall contour remains a challenging task. Most current studies introduce sophisticated designs to capture either local or global features of the point cloud, and rarely combine the two. More importantly, pursuing accuracy this way easily increases the computational burden. We propose AFpoint, a lightweight feature extractor that efficiently extracts and fuses local and global features of a point cloud. Specifically, AFpoint is composed of two modules: the Local-Global Parallelized Feature Extraction module (LGP) and the Adaptive Feature Fusion module (AFF). The LGP module encodes local and global features in parallel, using point-wise convolution and a relative attention mechanism respectively, so it extracts fine-grained structure and captures global relationships simultaneously. The AFF module adaptively selects and integrates the local and global features by estimating attention maps over the encoded features, helping the model focus autonomously on important regions. Note that AFpoint is a plug-and-play, universal module. We use AFpoint to construct point cloud classification and segmentation networks, which greatly improves the accuracy and robustness of the baseline model while reducing the parameters by nearly half. Experiments on the widely adopted ModelNet40 and ScanObjectNN classification datasets demonstrate the state-of-the-art performance of our network, and it also shows good results on the ShapeNetPart part segmentation dataset.
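To illustrate the idea behind the AFF module, the following is a minimal NumPy sketch of adaptively fusing two feature branches with per-point attention weights. The shapes, the single linear scoring layer `w`, and the function names are illustrative assumptions, not the authors' implementation, which uses learned layers inside a full network.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fuse(f_local, f_global, w):
    """Fuse per-point local and global features with attention weights.

    f_local, f_global: (N, C) feature matrices from the two branches.
    w: (2*C, 2) projection standing in for the learned scoring layer
       (hypothetical stand-in for AFF's attention-map estimation).
    Returns an (N, C) fused feature map.
    """
    scores = np.concatenate([f_local, f_global], axis=1) @ w  # (N, 2)
    alpha = softmax(scores, axis=1)  # per-point weight for each branch
    return alpha[:, :1] * f_local + alpha[:, 1:] * f_global

# Toy example: 128 points with 64-dimensional features per branch.
N, C = 128, 64
f_l = rng.normal(size=(N, C))
f_g = rng.normal(size=(N, C))
w = rng.normal(size=(2 * C, 2))
fused = adaptive_fuse(f_l, f_g, w)
print(fused.shape)  # (128, 64)
```

Because the two branch weights sum to one at every point, the fusion interpolates between the local and global features rather than simply adding them, which is what lets the model emphasize fine detail in some regions and overall shape in others.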
References
Shi S, Wang X, Li H (2019) Pointrcnn: 3d object proposal generation and detection from point cloud. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 770–779
Halim Z, Rehan M (2020) On identification of driving-induced stress using electroencephalogram signals: a framework based on wearable safety-critical scheme and machine learning. Inf Fusion 53:66–79
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst
Christoph G, Ralf K, Tim R, Andreas T, Gernot E, Martin K, Ferdinand H, Thomas V, Andreas H, Karl S, Arne N (2023) Measurement of individual tree parameters with carriage-based laser scanning in cable yarding operations. Croat J For Eng: J Theor Appl For Eng 44(2):401–417
Rao Y, Lu J, Zhou J (2019) Spherical fractal convolutional neural networks for point cloud recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 452–460
Yi L, Su H, Guo X, Guibas LJ (2017) Syncspeccnn: synchronized spectral cnn for 3d shape segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 2282–2290
Zhang Z, Li K, Yin X, Piao X, Wang Y, Yang X, Yin B (2020) Point cloud semantic scene segmentation based on coordinate convolution. Comput Animat Virtual Worlds 31(4–5):e1948
Shi S, Wang Z, Shi J, Wang X, Li H (2020) From points to parts: 3d object detection from point cloud with part-aware and part-aggregation network. IEEE Trans Pattern Anal Mach Intell 43(8):2647–2664
Chen Y, Ni J, Tang G, Cao W, Yang SX (2023) An improved dense-to-sparse cross-modal fusion network for 3D object detection in RGB-D images. Multimedia Tools Appl 1–26
Ni J, Shen K, Chen Y, Yang SX (2023) An improved ssd-like deep network-based object detection method for indoor scenes. IEEE Trans Instrum Meas 72:1–15
Peng L, Liu F, Yu Z, Yan S, Deng D, Yang Z, Liu H, Cai D (2022) Lidar point cloud guided monocular 3d object detection. In: European conference on computer vision. pp 123–139
Yang H, Shi J, Carlone L (2020) Teaser: fast and certifiable point cloud registration. IEEE Trans Robot 37(2):314–333
Wang Z, Lu F (2019) Voxsegnet: volumetric cnns for semantic part segmentation of 3d shapes. IEEE Trans Vis Comput Graph 26(9):2919–2930
Shi S, Guo C, Jiang L, Wang Z, Shi J, Wang X, Li H (2020) Pv-rcnn: point-voxel feature set abstraction for 3d object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 10529–10538
Qi CR, Su H, Mo K, Guibas LJ (2017) Pointnet: deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 652–660
Qi CR, Yi L, Su H, Guibas LJ (2017) Pointnet++: deep hierarchical feature learning on point sets in a metric space. Adv Neural Inf Process Syst
Wang Y, Sun Y, Liu Z, Sarma SE, Bronstein MM, Solomon JM (2019) Dynamic graph cnn for learning on point clouds. ACM Trans Graph (tog) 38(5):1–12
Li Y, Bu R, Sun M, Wu W, Di X, Chen B (2018) Pointcnn: convolution on x-transformed points. Adv Neural Inf Process Syst
Xu M, Ding R, Zhao H, Qi X (2021) Paconv: position adaptive convolution with dynamic kernel assembling on point clouds. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 3173–3182
Guo M-H, Cai J-X, Liu Z-N, Mu T-J, Martin RR, Hu S-M (2021) Pct: point cloud transformer. Comput Vis Media 7(2):187–199
Zhao H, Jiang L, Jia J, Torr PHS, Koltun V (2021) Point transformer. In: Proceedings of the IEEE/CVF international conference on computer vision. pp 16259–16268
Yan X, Zheng C, Li Z, Wang S, Cui S (2020) Pointasnl: robust point clouds processing using nonlocal neural networks with adaptive sampling. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 5589–5598
Liu X, Han Z, Liu Y-S, Zwicker M (2019) Point2sequence: learning the shape representation of 3d point clouds with an attention-based sequence to sequence network. Proceedings of the AAAI conference on artificial intelligence 33:8778–8785
Wu W, Qi Z, Fuxin L (2019) Pointconv: deep convolutional networks on 3d point clouds. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 9621–9630
Thomas H, Qi CR, Deschaud J-E, Marcotegui B, Goulette F, Guibas LJ (2019) Kpconv: Flexible and deformable convolution for point clouds. In: Proceedings of the IEEE/CVF international conference on computer vision. pp 6411–6420
Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv:1409.0473
Lin Z, Feng M, dos Santos CN, Yu M, Xiang B, Zhou B, Bengio Y (2017) A structured self-attentive sentence embedding. arXiv:1703.03130
Shaw P, Uszkoreit J, Vaswani A (2018) Self-attention with relative position representations. In: Proceedings of the conference of the north american chapter of the association for computational linguistics. pp 464–468
Tahir M, Halim Z, Waqas M, Sukhia KN, Tu S (2023) Emotion detection using convolutional neural network and long short-term memory: a deep multimodal framework. Multimedia Tools Appl 1–34
Devlin J, Chang M-W, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805
Zhao H, Jia J, Koltun V (2020) Exploring self-attention for image recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 10076–10085
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16x16 words: transformers for image recognition at scale. In: International conference on learning representations
Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF international conference on computer vision. pp 10012–10022
Ni J, Liu R, Tang G, Xie Y (2022) An improved attention-based bidirectional LSTM model for cyanobacterial bloom prediction. Int J Control Autom Syst 20(10):3445–3455
Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861
Landrieu L, Boussaha M (2019) Point cloud oversegmentation with graph-structured deep metric learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 7440–7449
Qiao D, Zulkernine F (2023) Adaptive feature fusion for cooperative perception using lidar point clouds. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision. pp 1186–1195
Rahman AU, Halim Z (2023) Identifying dominant emotional state using handwriting and drawing samples by fusing features. Appl Intell 53(3):2798–2814
Loshchilov I, Hutter F (2016) Sgdr: stochastic gradient descent with warm restarts. arXiv:1608.03983
Xu Y, Fan T, Xu M, Zeng L, Qiao Y (2018) Spidercnn: deep learning on point sets with parameterized convolutional filters. In: Proceedings of the European Conference on Computer Vision (ECCV). pp 87–102
Liu Y, Fan B, Xiang S, Pan C (2019) Relation-shape convolutional neural network for point cloud analysis. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 8895–8904
Berg A, Oskarsson M, O'Connor M (2022) Points to patches: enabling the use of self-attention for 3d shape recognition. arXiv:2204.03957
Wijaya KT, Paek D-H, Kong S-H (2022) Advanced feature learning on point clouds using multi-resolution features and learnable pooling. arXiv:2205.09962
Wu Z, Song S, Khosla A, Yu F, Zhang L, Tang X, Xiao J (2015) 3d shapenets: a deep representation for volumetric shapes. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 1912–1920
Qiu S, Anwar S, Barnes N (2021) Dense-resolution network for point cloud classification and segmentation. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision. pp 3813–3822
Qiu S, Anwar S, Barnes N (2021) Geometric back-projection network for point cloud classification. IEEE Trans Multimedia 24:1943–1955
Goyal A, Law H, Liu B, Newell A, Deng J (2021) Revisiting point cloud shape classification with a simple and effective baseline. In: International conference on machine learning. pp 3809–3820
Hamdi A, Giancola S, Ghanem B (2021) Mvtn: multi-view transformation network for 3d shape recognition. In: Proceedings of the IEEE/CVF international conference on computer vision. pp 1–11
Yu X, Tang L, Rao Y, Huang T, Zhou J, Lu J (2022) Point-bert: pre-training 3d point cloud transformers with masked point modeling. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 19313–19322
Cheng S, Chen X, He X, Liu Z, Bai X (2021) Pra-net: point relation-aware network for 3d point cloud analysis. IEEE Trans Image Process 30:4436–4448
Ran H, Liu J, Wang C (2022) Surface representation for point clouds. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 18942–18952
Klokov R, Lempitsky V (2017) Escape from cells: deep kd-networks for the recognition of 3d point cloud models. In: Proceedings of the IEEE international conference on computer vision. pp 863–872
Li J, Chen BM, Lee GH (2018) So-net: self-organizing network for point cloud analysis. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 9397–9406
Atzmon M, Maron H, Lipman Y (2018) Point convolutional neural networks by extension operators. arXiv:1803.10091
Ethics declarations
Conflict of interest
The authors have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, G., Liu, C., Gao, X. et al. AFpoint: adaptively fusing local and global features for point cloud. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18658-2