Learning Semantic Correspondence with Sparse Annotations

Huang, Shuaiyi; Yang, Luyu; He, Bo; Zhang, Songyang; He, Xuming; Shrivastava, Abhinav

doi:10.1007/978-3-031-19781-9_16

Shuaiyi Huang¹²,
Luyu Yang¹²,
Bo He¹²,
Songyang Zhang¹³,
Xuming He^14,15 &
…
Abhinav Shrivastava¹²

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13674))

Included in the following conference series:

European Conference on Computer Vision

2422 Accesses
3 Citations

Abstract

Finding dense semantic correspondence is a fundamental problem in computer vision, which remains challenging in complex scenes due to background clutter, extreme intra-class variation, and a severe lack of ground truth. In this paper, we aim to address the challenge of label sparsity in semantic correspondence by enriching supervision signals from sparse keypoint annotations. To this end, we first propose a teacher-student learning paradigm for generating dense pseudo-labels and then develop two novel strategies for denoising pseudo-labels. In particular, we use spatial priors around the sparse annotations to suppress the noisy pseudo-labels. In addition, we introduce a loss-driven dynamic label selection strategy for label denoising. We instantiate our paradigm with two variants of learning strategies: a single offline teacher setting, and mutual online teachers setting. Our approach achieves notable improvements on three challenging benchmarks for semantic correspondence and establishes the new state-of-the-art. Project page: https://shuaiyihuang.github.io/publications/SCorrSAN.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

EUR 32.99 /Month

Get 10 units per month
Download Article/Chapter or Ebook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Subscribe now

Buy Now

Chapter: GBP 19.95; Price includes VAT (United Kingdom)

eBook: GBP 79.50; Price includes VAT (United Kingdom)

Softcover Book: GBP 99.99; Price includes VAT (United Kingdom)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Guided Collaborative Training for Pixel-Wise Semi-Supervised Learning

An efficient weakly semi-supervised method for object automated annotation

Article 17 June 2023

Joint Inference in Weakly-Annotated Image Datasets via Dense Correspondence

References

Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: Proceedings of the Annual Conference on Learning Theory(COLT) (1998)
Google Scholar
Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., Shah, R.: Signature verification using a “siamese” time delay neural network. Advances in neural information processing systems 6 (1993)
Google Scholar
Chauhan, A.K., Krishan, P.: Moving object tracking using gaussian mixture model and optical flow. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 3(4) (2013)
Google Scholar
Chen, T., Goodfellow, I., Shlens, J.: Net2net: accelerating learning via knowledge transfer. In: Proceedings of the International Conference on Learning Representations (ICLR) (2015)
Google Scholar
Chen, Y.C., Lin, Y.Y., Yang, M.H., Huang, J.B.: Show, match and segment: joint weakly supervised learning of semantic matching and object co-segmentation. IEEE Trans. Pattern Anal. Mach. Intell. PP(99), 1 (2020)
Google Scholar
Cho, S., Hong, S., Jeon, S., Lee, Y., Sohn, K., Kim, S.: Cats: cost aggregation transformers for visual correspondence. In: Advances in Neural Information Processing Systems (NeurIPS) (2021)
Google Scholar
Dale, K., Johnson, M.K., Sunkavalli, K., Matusik, W., Pfister, H.: Image restoration using online photo collections. In: Proceedings of the International Conference on Computer Vision (ICCV) (2009)
Google Scholar
Goldstein, A., Fattal, R.: Video stabilization using Epipolar geometry. ACM Trans. Graph. (TOG) 31(5), 1–10 (2012)
Article Google Scholar
Ham, B., Cho, M., Schmid, C., Ponce, J.: Proposal flow. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Google Scholar
Ham, B., Cho, M., Schmid, C., Ponce, J.: Proposal flow: Semantic correspondences from object proposals. IEEE Transactions on Pattern Analysis and Machine Intelligence (2018)
Google Scholar
Han, B., et al.: Co-teaching: robust training of deep neural networks with extremely noisy labels. In: Advances in Neural Information Processing Systems (NeurIPS) (2018)
Google Scholar
Han, K., et al.: SCNet: learning semantic correspondence. In: Proceedings of the International Conference on Computer Vision (ICCV) (2017)
Google Scholar
He, B., Yang, X., Kang, L., Cheng, Z., Zhou, X., Shrivastava, A.: ASM-Loc: action-aware segment modeling for weakly-supervised temporal action localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
Google Scholar
He, B., Yang, X., Wu, Z., Chen, H., Lim, S.N., Shrivastava, A.: GTA: global temporal attention for video action understanding. In: Proceedings of the British Machine Vision Conference (BMVC) (2020)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Google Scholar
Heo, B., Kim, J., Yun, S., Park, H., Kwak, N., Choi, J.Y.: A comprehensive overhaul of feature distillation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Google Scholar
Hinton, G., et al.: Distilling the knowledge in a neural network. ar**v preprint ar**v:1503.02531 2(7) (2015)
Hong, S., Cho, S., Nam, J., Lin, S., Kim, S.: Cost aggregation with 4D convolutional Swin transformer for few-shot segmentation. ar**v preprint ar**v:2207.10866 (2022)
Seo, P.H., Lee, J., Jung, D., Han, B., Cho, M.: Attentive semantic alignment with offset-aware correlation kernels. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 367–383. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01225-0_22
Chapter Google Scholar
Horn, B.K., Schunck, B.G.: Determining optical flow. Artif. Intell. 17(1-3), 185–203 (1981)
Google Scholar
Huang, S., Wang, Q., He, X.: Confidence-aware adversarial learning for self-supervised semantic matching. In: Peng, Y., et al. (eds.) PRCV 2020. LNCS, vol. 12305, pp. 91–103. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-60633-6_8
Chapter Google Scholar
Huang, S., Wang, Q., Zhang, S., Yan, S., He, X.: Dynamic context correspondence network for semantic alignment. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2019)
Google Scholar
Huang, Z., Wang, X., Huang, L., Huang, C., Wei, Y., Liu, W.: CCNet: criss-cross attention for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2019)
Google Scholar
Jeon, S., Kim, S., Min, D., Sohn, K.: PARN: pyramidal affine regression networks for dense semantic correspondence. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11210, pp. 355–371. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01231-1_22
Chapter Google Scholar
Jeon, S., Min, D., Kim, S., Choe, J., Sohn, K.: Guided semantic flow. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12373, pp. 631–648. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58604-1_38
Chapter Google Scholar
Jeon, S., Min, D., Kim, S., Sohn, K.: Joint learning of semantic alignment and object landmark detection. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2019)
Google Scholar
Kim, S., Lin, S., JEON, S.R., Min, D., Sohn, K.: Recurrent transformer networks for semantic correspondence. In: Advances in Neural Information Processing Systems (NeurIPS) (2018)
Google Scholar
Kim, S., Min, D., Ham, B., Jeon, S., Lin, S., Sohn, K.: FCSS: fully convolutional self-similarity for dense semantic correspondence. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Google Scholar
Kim, S., Min, D., Jeong, S., Kim, S., Jeon, S., Sohn, K.: Semantic attribute matching networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Google Scholar
Lan, S., et al.: DiscoBox: weakly supervised instance segmentation and semantic correspondence from box supervision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Google Scholar
Lee, J.Y., DeGol, J., Fragoso, V., Sinha, S.N.: Patchmatch-based neighborhood consensus for semantic correspondence. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Google Scholar
Lee, J., Kim, D., Ponce, J., Ham, B.: SFNet: learning object-aware semantic correspondence. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Google Scholar
Lee, J., Kim, E., Lee, Y., Kim, D., Chang, J., Choo, J.: Reference-based sketch image colorization using augmented-self reference and dense semantic correspondence. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Google Scholar
Li, H., Wu, Z., Shrivastava, A., Davis, L.S.: Rethinking pseudo labels for semi-supervised object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI) (2022)
Google Scholar
Li, S., Han, K., Costain, T.W., Howard-Jenkins, H., Prisacariu, V.: Correspondence networks with adaptive neighbourhood consensus. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Google Scholar
Li, X., Fan, D.P., Yang, F., Luo, A., Cheng, H., Liu, Z.: Probabilistic model distillation for semantic correspondence. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Google Scholar
Liu, C., Yuen, J., Torralba, A.: Sift flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33 (2011)
Google Scholar
Liu, Y., Zhu, L., Yamada, M., Yang, Y.: Semantic correspondence as an optimal transport problem. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Google Scholar
Min, J., Cho, M.: Convolutional hough matching networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Google Scholar
Min, J., Lee, J., Ponce, J., Cho, M.: Hyperpixel flow: semantic correspondence with multi-layer neural features. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2019)
Google Scholar
Min, J., Lee, J., Ponce, J., Cho, M.: Spair-71k: a large-scale benchmark for semantic correspondence. ar**v preprint ar**v:1908.10543 (2019)
Min, J., Lee, J., Ponce, J., Cho, M.: Learning to compose hypercolumns for visual correspondence. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 346–363. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58555-6_21
Chapter Google Scholar
Paszke, A., et al.: Automatic differentiation in pyTorch (2017)
Google Scholar
Rocco, I., Arandjelović, R., Sivic, J.: Convolutional neural network architecture for geometric matching. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Google Scholar
Rocco, I., Arandjelović, R., Sivic, J.: End-to-end weakly-supervised semantic alignment. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Rocco, I., Cimpoi, M., Arandjelović, R., Torii, A., Pajdla, T., Sivic, J.: Neighbourhood consensus networks. In: Advances in Neural Information Processing Systems (NeurIPS) (2018)
Google Scholar
Scharstein, D., Szeliski, R.: A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vision 47(1–3), 7–42 (2002)
Article Google Scholar
Sohn, K., et al.: FixMatch: simplifying semi-supervised learning with consistency and confidence. In: Advances in Neural Information Processing Systems (NeurIPS) (2020)
Google Scholar
Taniai, T., Sinha, S.N., Sato, Y.: Joint recovery of dense correspondence and cosegmentation in two images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Google Scholar
Tarvainen, A., Valpola, H.: Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. In: Advances in Neural Information Processing Systems (NeurIPS) (2017)
Google Scholar
Tola, E., Lepetit, V., Fua, P.: Daisy: an efficient dense descriptor applied to wide-baseline stereo. IEEE Trans. Pattern Anal. Mach. Intell. 32(5), 815-830 (2010)
Google Scholar
Truong, P., Danelljan, M., Timofte, R.: GLU-Net: global-local universal network for dense flow and correspondences. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Google Scholar
Truong, P., Danelljan, M., Yu, F., Van Gool, L.: Probabilistic warp consistency for weakly-supervised semantic correspondences. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
Google Scholar
**e, Q., Luong, M.T., Hovy, E., Le, Q.V.: Self-training with noisy student improves imageNet classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Google Scholar
Yang, L., et al.: Deep co-training with task decomposition for semi-supervised domain adaptation. In: Proceedings of the International Conference on Computer Vision (ICCV) (2021)
Google Scholar
Yang, Y., Ramanan, D.: Articulated human detection with flexible mixtures of parts. IEEE Trans. Pattern Anal. Mach. Intell. 35(12), 2878-2890 (2013)
Google Scholar
Yim, J., Joo, D., Bae, J., Kim, J.: A gift from knowledge distillation: fast optimization, network minimization and transfer learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Google Scholar
Yue, K., Sun, M., Yuan, Y., Zhou, F., Ding, E., Xu, F.: Compact generalized non-local network. In: Advances in Neural Information Processing Systems (NeurIPS) (2018)
Google Scholar
Zhang, S., He, X., Yan, S.: LatentGNN: learning efficient non-local relations for visual recognition. In: Proceedings of the International Conference on Machine Learning (ICML) (2019)
Google Scholar
Zhang, Y., **ang, T., Hospedales, T.M., Lu, H.: Deep mutual learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Zhao, D., Song, Z., Ji, Z., Zhao, G., Ge, W., Yu, Y.: Multi-scale matching networks for semantic correspondence. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2021)
Google Scholar
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Google Scholar

Download references

Author information

Authors and Affiliations

University of Maryland, College Park, USA
Shuaiyi Huang, Luyu Yang, Bo He & Abhinav Shrivastava
Shanghai AI Laboratory, Shanghai, China
Songyang Zhang
ShanghaiTech University, Shanghai, China
Xuming He
Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, China
Xuming He

Authors

Shuaiyi Huang
View author publications
You can also search for this author in PubMed Google Scholar
Luyu Yang
View author publications
You can also search for this author in PubMed Google Scholar
Bo He
View author publications
You can also search for this author in PubMed Google Scholar
Songyang Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Xuming He
View author publications
You can also search for this author in PubMed Google Scholar
Abhinav Shrivastava
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Shuaiyi Huang .

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 12768 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Huang, S., Yang, L., He, B., Zhang, S., He, X., Shrivastava, A. (2022). Learning Semantic Correspondence with Sparse Annotations. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13674. Springer, Cham. https://doi.org/10.1007/978-3-031-19781-9_16

Download citation

DOI: https://doi.org/10.1007/978-3-031-19781-9_16
Published: 23 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19780-2
Online ISBN: 978-3-031-19781-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Learning Semantic Correspondence with Sparse Annotations

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Guided Collaborative Training for Pixel-Wise Semi-Supervised Learning

An efficient weakly semi-supervised method for object automated annotation

Joint Inference in Weakly-Annotated Image Datasets via Dense Correspondence

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 12768 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

Learning Semantic Correspondence with Sparse Annotations

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Guided Collaborative Training for Pixel-Wise Semi-Supervised Learning

An efficient weakly semi-supervised method for object automated annotation

Joint Inference in Weakly-Annotated Image Datasets via Dense Correspondence

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 12768 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation