Multi-Label Image Classification Model Based on Multiscale Fusion and Adaptive Label Correlation

Ye, Jihua; Jiang, Lu; Xiao, Shunjie; Zong, Ye; Jiang, Aiwen

doi:10.1007/s12204-023-2688-6

Published: 02 January 2024

Journal of Shanghai Jiaotong University (Science)

Jihua Ye (叶继华)¹,
Lu Jiang (江蕗)¹,
Shunjie Xiao (肖顺杰)¹,
Ye Zong (宗义)¹ &
Aiwen Jiang (江爱文)¹

Abstract

Current research on multi-label image classification focuses mainly on exploiting correlations between labels to improve classification accuracy. In existing methods, however, label correlation is computed from the statistical information of the data; such correlation is global and dataset-dependent, and therefore does not suit every sample. In addition, feature information about small objects is easily lost during image feature extraction, which lowers classification accuracy for small objects. To address both problems, this paper proposes a multi-label image classification model based on multiscale fusion and adaptive label correlation. First, feature maps at multiple scales are fused to enhance the feature information of small objects. Next, under the guidance of label semantics, the fused feature map is decomposed into one feature vector per category. The self-attention mechanism of a graph attention network then adaptively mines the correlations between categories in the image, yielding feature vectors that contain category-related information for the final classification. An attention regularization loss is also proposed. On the two public datasets VOC 2007 and MS COCO 2014, the model reaches a mean average precision (mAP) of 95.6% and 83.6%, respectively, and it outperforms the latest existing methods on most metrics.
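
The pipeline outlined in the abstract (multiscale fusion, label-guided decomposition into per-category vectors, graph attention across categories, per-category classification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the fusion scheme, the decomposition, or the attention regularization loss, so the sketch assumes nearest-neighbour upsampling plus addition for fusion, dot-product attention between label embeddings and spatial features for decomposition, and a single-head attention layer of the form given in the graph attention networks paper over a fully connected category graph. All function names, shapes, and random weights are hypothetical placeholders, and the regularization loss is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_scales(fine, coarse):
    # fine: (D, 2H, 2W), coarse: (D, H, W).
    # Assumed fusion: nearest-neighbour upsample the coarse map, then add.
    up = coarse.repeat(2, axis=1).repeat(2, axis=2)
    return fine + up

def decompose_by_label(fmap, label_emb):
    # fmap: (D, H, W), label_emb: (C, D). Returns (C, D).
    # Each label embedding attends over spatial positions and pools
    # the features it matches into one vector per category.
    d = fmap.shape[0]
    flat = fmap.reshape(d, -1)                # (D, H*W)
    weights = softmax(label_emb @ flat, axis=1)  # (C, H*W)
    return weights @ flat.T                   # (C, D)

def graph_attention(h, w, a, slope=0.2):
    # h: (C, D), w: (D, F), a: (2F,). Returns ((C, F), (C, C)).
    # e_ij = LeakyReLU(a^T [W h_i || W h_j]), softmax over j.
    z = h @ w
    f = z.shape[1]
    e = (z @ a[:f])[:, None] + (z @ a[f:])[None, :]
    e = np.where(e > 0, e, slope * e)
    alpha = softmax(e, axis=1)                # per-image label correlations
    return alpha @ z, alpha

def classify(h, w_cls):
    # One logit per category from its own refined vector, then sigmoid.
    logits = (h * w_cls).sum(axis=1)
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(0)
C, D, F = 20, 16, 8  # 20 categories as in VOC 2007; D and F are arbitrary
fine = rng.standard_normal((D, 8, 8))
coarse = rng.standard_normal((D, 4, 4))
label_emb = rng.standard_normal((C, D))
w = rng.standard_normal((D, F)) * 0.1
a = rng.standard_normal(2 * F) * 0.1
w_cls = rng.standard_normal((C, F)) * 0.1

fused = fuse_scales(fine, coarse)
cat_vecs = decompose_by_label(fused, label_emb)
refined, alpha = graph_attention(cat_vecs, w, a)
probs = classify(refined, w_cls)
print(fused.shape, cat_vecs.shape, alpha.shape, probs.shape)
```

Because `alpha` is computed from the category vectors of the current image, it differs from image to image, in contrast to a co-occurrence matrix estimated once from dataset statistics.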

Author information

Authors and Affiliations

School of Computer Information Engineering, Jiangxi Normal University, Nanchang, 330022, China
Jihua Ye (叶继华), Lu Jiang (江蕗), Shunjie Xiao (肖顺杰), Ye Zong (宗义) & Aiwen Jiang (江爱文)

Corresponding author

Correspondence to Jihua Ye (叶继华).

Ethics declarations

Conflict of Interest: The authors declare no conflict of interest.

Additional information

Foundation item: the National Natural Science Foundation of China (Nos. 62167005 and 61966018), and the Key Research Projects of Jiangxi Provincial Department of Education (No. GJJ200302)

About this article

Cite this article

Ye, J., Jiang, L., Xiao, S. et al. Multi-Label Image Classification Model Based on Multiscale Fusion and Adaptive Label Correlation. J. Shanghai Jiaotong Univ. (Sci.) (2024). https://doi.org/10.1007/s12204-023-2688-6

Received: 10 July 2023
Accepted: 31 July 2023
Published: 02 January 2024
DOI: https://doi.org/10.1007/s12204-023-2688-6

CLC number: TP391.41

Document code: A
