Abstract
Weakly supervised semantic segmentation (WSSS) is an important task in computer vision. However, existing methods that derive Class Activation Maps (CAMs) from a classification task typically activate only a small part of the object region. To address this limitation, we propose a novel Attention Activation Remodulation (AAR) scheme that combines traditional CAMs with a remodulation branch to obtain weighted CAMs for recalibrated supervision. The AAR scheme rearranges the distribution of important features from both the channel and spatial perspectives, thereby regulating segmentation-oriented activation responses. In addition, we propose a Feature Pixel Extraction Module (FPEM) that uses contextual information to improve pixel-level prediction. The proposed scheme can also be combined with other methods to improve overall performance. Extensive experiments on the PASCAL VOC 2012 dataset demonstrate the effectiveness of the AAR mechanism and the FPEM.
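For readers unfamiliar with the CAMs mentioned above, the following is a minimal sketch of the standard CAM computation only: a weighted sum of the final convolutional feature maps using one class's classifier weights, followed by ReLU and max-normalisation. It is not the AAR scheme or the FPEM, whose details are not given in this section. The function name, the nested-list data layout, and the toy values are assumptions made purely for illustration.

```python
def class_activation_map(features, class_weights):
    """Standard CAM (illustrative sketch, not the AAR method).

    features: list of C feature maps, each an H x W list of lists.
    class_weights: list of C classifier weights for a single class.
    Returns an H x W map scaled to [0, 1].
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]

    # Weighted sum of the feature maps over the channel dimension.
    for fmap, weight in zip(features, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]

    # Keep only positive evidence for the class (ReLU).
    cam = [[max(value, 0.0) for value in row] for row in cam]

    # Normalise by the peak response so maps are comparable across classes.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Hypothetical toy input: two 2x2 feature maps and two class weights.
features = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 2.0], [0.0, 0.0]],
]
class_weights = [1.0, -1.0]
cam = class_activation_map(features, class_weights)
```

In this toy case only the top-left location stays active after the ReLU, which mirrors the limitation described in the abstract: a plain CAM tends to respond to a small, highly discriminative part of the object.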
Data Availability
The data and materials utilized in this study are based on the VOC2012 dataset, which is a widely used benchmark dataset in computer vision research. The VOC2012 dataset contains a diverse collection of images annotated with object bounding boxes and class labels. It is specifically designed for object detection, segmentation, and classification tasks. Access to the VOC2012 dataset can be obtained by following the steps outlined on the official website of the Visual Object Classes (VOC) challenge. The dataset can be downloaded from https://pjreddie.com/projects/pascal-voc-dataset-mirror/. Researchers are required to agree to the terms and conditions provided by the VOC challenge organizers before gaining access to the dataset.
Funding
This research was supported by the Research Foundation of the Institute of Environment-friendly Materials and Occupational Health (Wuhu), Anhui University of Science and Technology (No. ALW2021YF04); the Anhui University of Science and Technology Graduate Innovation Fund (No. 2022CX2126); the University Synergy Innovation Program of Anhui Province (No. GXXT-2021-006); and the Open Foundation of Anhui Engineering Research Center of Intelligent Perception and Elderly Care, Chuzhou University (Grant No. 20220PB01).
Author information
Contributions
Yu-e Lin provided guidance during the experiments and on the grammar of the article. Houguo Li completed the experimental part of the work and the general writing of the article. Xingzhu Liang revised the first draft of the paper and put forward many valuable suggestions. Mengfan Li drew Figs. 1–6 and edited the references. Huilin Liu researched relevant information during the writing of the article.
Ethics declarations
Conflict of interest
The authors declare no direct or indirect conflict of interest in the work submitted for publication. The dissertation was written by graduate students to meet academic requirements. No organization or individual will gain any benefit from the publication of this work.
Ethics approval
Ethics approval is not applicable to this paper. The study did not involve human or animal subjects.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Lin, Ye., Li, H., Liang, X. et al. AAR: Attention Remodulation for Weakly Supervised Semantic Segmentation. J Supercomput 80, 9096–9114 (2024). https://doi.org/10.1007/s11227-023-05786-z
DOI: https://doi.org/10.1007/s11227-023-05786-z