Abstract
This work addresses visual cross-view metric localization for outdoor robotics. Given a ground-level color image and a satellite patch that contains the local surroundings, the task is to identify the location of the ground camera within the satellite patch. Related work has addressed this task for range sensors (LiDAR, radar), but for vision only as a secondary regression step after an initial cross-view image retrieval step. Since the local satellite patch could also be retrieved through any rough localization prior (e.g., from GPS/GNSS or temporal filtering), we drop the image retrieval objective and focus on metric localization only. We devise a novel network architecture with denser satellite descriptors, similarity matching at the bottleneck (rather than at the output, as in image retrieval), and a dense spatial distribution as output to capture multi-modal localization ambiguities. We compare against a state-of-the-art regression baseline that uses global image descriptors. Quantitative and qualitative experimental results on the recently proposed VIGOR dataset and the Oxford RobotCar dataset validate our design. The produced probabilities are correlated with localization accuracy and can even be used to roughly estimate the ground camera’s heading when its orientation is unknown. Overall, our method reduces the median metric localization error by 51%, 37%, and 28% compared to the state of the art when generalizing within the same area, across areas, and across time, respectively.
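To make the idea of a dense spatial output concrete, the following is a minimal sketch, not the authors' implementation. It assumes a network has already produced a 2-D score map over the satellite patch. The function names, the `meters_per_cell` ground resolution, and the rule of picking the heading with the highest peak probability are all illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def spatial_softmax(scores):
    """Normalize a 2-D score map into a probability distribution over cells."""
    shifted = scores - scores.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def localize(scores, meters_per_cell):
    """Return the most likely cell, its probability, and its metric offset.

    The offset is measured from the patch center along (row, col).
    `meters_per_cell` is a hypothetical ground resolution of the output grid.
    """
    probs = spatial_softmax(scores)
    row, col = np.unravel_index(np.argmax(probs), probs.shape)
    center = (np.array(probs.shape) - 1) / 2.0
    offset_m = (np.array([row, col]) - center) * meters_per_cell
    return (int(row), int(col)), float(probs[row, col]), offset_m


def pick_heading(score_maps_by_heading):
    """Assumed heuristic: choose the candidate heading with the sharpest peak."""
    peaks = {h: spatial_softmax(s).max() for h, s in score_maps_by_heading.items()}
    return max(peaks, key=peaks.get)


# Toy 5x5 score map with one strong response at row 1, column 3.
scores = np.zeros((5, 5))
scores[1, 3] = 4.0
cell, peak, offset_m = localize(scores, meters_per_cell=0.5)

# Compare a confident map (heading 0) with a flat one (heading 90).
best_heading = pick_heading({0: scores, 90: np.zeros((5, 5))})
```

Keeping the full distribution, rather than regressing a single coordinate, lets a downstream filter see several plausible locations at once, and the peak probability can serve as a rough confidence signal.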
Notes
1. Models and code, plus extended data are available at
Acknowledgements
This work is part of the research programme Efficient Deep Learning (EDL) with project number P16-25, which is (partly) financed by the Dutch Research Council (NWO).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Xia, Z., Booij, O., Manfredi, M., Kooij, J.F.P. (2022). Visual Cross-View Metric Localization with Dense Uncertainty Estimates. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13699. Springer, Cham. https://doi.org/10.1007/978-3-031-19842-7_6
DOI: https://doi.org/10.1007/978-3-031-19842-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19841-0
Online ISBN: 978-3-031-19842-7
eBook Packages: Computer Science (R0)