Abstract
LiDAR-based place recognition plays a crucial role in autonomous vehicles, enabling them to recognize previously visited locations in GPS-denied environments. Localization is then achieved by searching the database for the nearest neighbors of the current descriptor. Place recognition features fall into two common types: local descriptors, which compactly represent individual points or regions, and global descriptors, which summarize the scene as a whole. Despite the significant progress both types have made in recent years, any single representation inevitably loses information. To overcome this limitation, we developed PatchLPR, a Transformer network that employs multi-level feature fusion for robust place recognition. PatchLPR integrates global and local feature information and focuses on meaningful regions of the feature map to generate an environmental representation. We propose a patch feature extraction module based on the Vision Transformer to fully exploit the information in, and the correlations among, different features. We evaluated our approach on the KITTI dataset and a self-collected dataset covering over 4.2 km. The experimental results demonstrate that our method effectively leverages multi-level features to enhance place recognition performance.
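The retrieval step described above — patch features pooled into a global descriptor, then matched against a database by nearest-neighbor search — can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the patch size, the mean-pooling aggregation, and the cosine-similarity ranking are all simplifying assumptions standing in for the paper's learned Transformer fusion.

```python
import numpy as np

def patchify(feature_map, patch):
    """Split an H x W x C feature map into flattened patch vectors."""
    H, W, C = feature_map.shape
    assert H % patch == 0 and W % patch == 0
    return (feature_map
            .reshape(H // patch, patch, W // patch, patch, C)
            .transpose(0, 2, 1, 3, 4)          # group rows/cols of patches
            .reshape(-1, patch * patch * C))   # one row per patch

def global_descriptor(feature_map, patch=4):
    """Crude global descriptor: mean over patch vectors, L2-normalized."""
    p = patchify(feature_map, patch)
    g = p.mean(axis=0)
    return g / (np.linalg.norm(g) + 1e-12)

def retrieve(query, database, k=1):
    """Rank database descriptors by cosine similarity (all unit-norm)."""
    sims = database @ query
    order = np.argsort(-sims)
    return order[:k], sims[order[:k]]

# Usage: build a small database of descriptors and query with one of them.
rng = np.random.default_rng(0)
maps = rng.normal(size=(5, 8, 8, 2))          # 5 hypothetical feature maps
db = np.stack([global_descriptor(m) for m in maps])
idx, sims = retrieve(db[3], db, k=1)          # self-query returns index 3
```

In a real system the database search would typically use an approximate nearest-neighbor library such as FAISS (cited in the references) rather than a brute-force dot product.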
![Fig. 1](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03138-9/MediaObjects/11760_2024_3138_Fig1_HTML.png)
![Fig. 2](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03138-9/MediaObjects/11760_2024_3138_Fig2_HTML.png)
![Fig. 3](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03138-9/MediaObjects/11760_2024_3138_Fig3_HTML.png)
![Fig. 4](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03138-9/MediaObjects/11760_2024_3138_Fig4_HTML.png)
![Fig. 5](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03138-9/MediaObjects/11760_2024_3138_Fig5_HTML.png)
Data availability statement
The KITTI dataset used in this research is publicly available online; the HUE dataset is available from the authors upon request.
References
Arandjelovic, R., Gronat, P., Torii, A., Pajdla, T., Sivic, J.: Netvlad: CNN architecture for weakly supervised place recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5297–5307 (2016). arXiv:1511.07247
Hausler, S., Garg, S., Xu, M., Milford, M., Fischer, T.: Patch-netvlad: multi-scale fusion of locally-global descriptors for place recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14141–14152 (2021). arXiv:2103.01486
Lowry, S., Sünderhauf, N., Newman, P., Leonard, J.J., Cox, D., Corke, P., Milford, M.J.: Visual place recognition: a survey. IEEE Trans. Robot. 32(1), 1–19 (2015). https://doi.org/10.1109/TRO.2015.2496823
Schuster, R., Wasenmuller, O., Unger, C., Stricker, D.: Sdc-stacked dilated convolution: a unified descriptor network for dense matching tasks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2556–2565 (2019). arXiv:1904.03076
Cao, B., Araujo, A., Sim, J.: Unifying deep local and global features for image search. In: Computer Vision—ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XX 16, pp. 726–743 (2020). arXiv:2001.05027
Wang, R., Shen, Y., Zuo, W., Zhou, S., Zheng, N.: Transvpr: Transformer-based place recognition with multi-level attention aggregation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13648–13657 (2022). arXiv:2201.02001
Yin, H., Xu, X., Lu, S., Chen, X., Xiong, R., Shen, S., Stachniss, C., Wang, Y.: A survey on global lidar localization: challenges, advances and open problems. arXiv preprint arXiv:2302.07433 (2023)
Chen, X., Läbe, T., Milioto, A., Röhling, T., Vysotska, O., Haag, A., Behley, J., Stachniss, C.: Overlapnet: loop closing for lidar-based slam. arXiv preprint arXiv:2105.11344 (2021)
Ma, J., Zhang, J., Xu, J., Ai, R., Gu, W., Chen, X.: Overlaptransformer: an efficient and yaw-angle-invariant transformer network for lidar-based place recognition. IEEE Robot. Autom. Lett. 7(3), 6958–6965 (2022). https://doi.org/10.1109/LRA.2022.3178797
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is worth \(16\times 16\) words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
Uy, M.A., Lee, G.H.: Pointnetvlad: Deep point cloud based retrieval for large-scale place recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4470–4479 (2018). arXiv:1804.03492
Kim, G., Kim, A.: Scan context: egocentric spatial descriptor for place recognition within 3d point cloud map. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4802–4809 (2018). https://doi.org/10.1109/IROS.2018.8593953
Kong, X., Yang, X., Zhai, G., Zhao, X., Zeng, X., Wang, M., Liu, Y., Li, W., Wen, F.: Semantic graph based place recognition for 3d point clouds. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 8216–8223 (2020). https://doi.org/10.1109/IROS45743.2020.9341060
Vidanapathirana, K., Moghadam, P., Harwood, B., Zhao, M., Sridharan, S., Fookes, C.: Locus: lidar-based place recognition using spatiotemporal higher-order pooling. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 5075–5081 (2021). https://doi.org/10.1109/ICRA48506.2021.9560915
Vysotska, O., Stachniss, C.: Relocalization under substantial appearance changes using hashing. In: Proceedings of the IROS Workshop on Planning, Perception and Navigation for Intelligent Vehicles, Vancouver, BC, Canada, vol. 24 (2017)
Li, J., Hu, Q., Ai, M.: Rift: multi-modal image matching based on radiation-variation insensitive feature transform. IEEE Trans. Image Process. 29, 3296–3310 (2019). https://doi.org/10.1109/TIP.2019.2959244
Luo, L., Cao, S.-Y., Sheng, Z., Shen, H.-L.: Lidar-based global localization using histogram of orientations of principal normals. IEEE Trans. Intell. Veh. 7(3), 771–782 (2022). https://doi.org/10.1109/TIV.2022.3169153
Rizzini, D.L.: Place recognition of 3d landmarks based on geometric relations. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 648–654 (2017). https://doi.org/10.1109/IROS.2017.8202220
Guo, J., Borges, P.V., Park, C., Gawel, A.: Local descriptor for robust place recognition using lidar intensity. IEEE Robot. Autom. Lett. 4(2), 1470–1477 (2019). https://doi.org/10.1109/LRA.2019.2893887
Xiang, H., Zhu, X., Shi, W., Fan, W., Chen, P., Bao, S.: Delightlcd: a deep and lightweight network for loop closure detection in lidar slam. IEEE Sens. J. 22(21), 20761–20772 (2022). https://doi.org/10.1109/JSEN.2022.3206506
Zhou, Y., Wang, Y., Poiesi, F., Qin, Q., Wan, Y.: Loop closure detection using local 3d deep descriptors. IEEE Robot. Autom. Lett. 7(3), 6335–6342 (2022). arXiv:2111.00440
Poiesi, F., Boscaini, D.: Distinctive 3d local deep descriptors. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 5720–5727 (2021). https://doi.org/10.1109/ICPR48806.2021.9411978
Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017). arXiv:1612.00593
Liu, Z., Zhou, S., Suo, C., Yin, P., Chen, W., Wang, H., Li, H., Liu, Y.-H.: Lpd-net: 3d point cloud learning for large-scale place recognition and environment analysis. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2831–2840 (2019). arXiv:1812.07050
Zhang, W., Xiao, C.: Pcan: 3d attention map learning using contextual information for point cloud based retrieval. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12436–12445 (2019). arXiv:1904.09793
Komorowski, J.: Minkloc3d: Point cloud based large-scale place recognition. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1790–1799 (2021). arXiv:2011.04530
Zhou, Z., Zhao, C., Adolfsson, D., Su, S., Gao, Y., Duckett, T., Sun, L.: Ndt-transformer: Large-scale 3d point cloud localisation using the normal distribution transform representation. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 5654–5660 (2021). https://doi.org/10.1109/ICRA48506.2021.9560932
Ma, J., Xiong, G., Xu, J., Chen, X.: Cvtnet: a cross-view transformer network for lidar-based place recognition in autonomous driving environments. IEEE Trans. Ind. Inf. (2023). https://doi.org/10.1109/TII.2023.3313635
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Adv. Neural Inf. Process. Syst. (2017). arXiv:1706.03762
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? the KITTI vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361 (2012). https://doi.org/10.1109/CVPR.2012.6248074
Pandey, G., McBride, J.R., Eustice, R.M.: Ford campus vision and lidar data set. Int. J. Robot. Res. 30(13), 1543–1552 (2011)
Johnson, J., Douze, M., Jégou, H.: Billion-scale similarity search with GPUs. IEEE Trans. Big Data 7(3), 535–547 (2019). https://doi.org/10.1109/TBDATA.2019.2921572
Funding
Research on Key Technologies of Intelligent Equipment for Mine Powered by Pure Clean Energy, Natural Science Foundation of Hebei Province, F2021402011
Author information
Authors and Affiliations
Contributions
All authors contributed their insights to the research concept and, after review and discussion, approved the final manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, Y., Guo, J., Wang, H. et al. Patchlpr: a multi-level feature fusion transformer network for LiDAR-based place recognition. SIViP 18 (Suppl 1), 157–165 (2024). https://doi.org/10.1007/s11760-024-03138-9