Novel multi-label feature selection via label enhancement and relative maximal discernibility pairs

Dai, Jianhua; Wang, Zhiyang; Huang, Weiyi

doi:10.1007/s13042-023-02090-3

Original Article
Published: 08 March 2024

Volume 15, pages 3237–3253, (2024)
International Journal of Machine Learning and Cybernetics

Abstract

Multi-label feature selection is an effective way to mitigate the curse of dimensionality in multi-label data. However, few studies on multi-label feature selection take label enhancement methods into account. Meanwhile, most existing label enhancement methods neglect the relative importance of labels, which can degrade the classification performance of the model. To address this issue, we propose a novel multi-label feature selection algorithm based on label enhancement and relative maximal discernibility pairs. First, we propose a label importance weight based on relative discernibility pairs and define the concept of soft relevance between objects and labels via fuzzy rough sets. Second, we propose a novel label enhancement algorithm that combines the soft relevance and the label importance weight. Third, we define a relative maximal discernibility pair model for evaluating features in label distribution information systems. Based on the relative maximal discernibility pair model and label enhancement, we then present a multi-label feature selection algorithm that continuously reduces the universe of object pairs during the selection process. Finally, to validate the effectiveness and stability of our algorithm, we conduct extensive comparison experiments against 7 representative multi-label feature selection algorithms on 13 datasets. Experimental results show that our algorithm outperforms the 7 compared algorithms on 5 evaluation metrics.
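
The idea of shrinking the universe of object pairs during selection can be illustrated with a generic greedy sketch. This is not the authors' algorithm: the function name `greedy_pair_selection`, the thresholds `feat_tol` and `label_tol`, the crisp distance test for "discerning" a pair, and the toy data are all assumptions made for illustration, and the fuzzy rough set model, label importance weight and label enhancement step described in the abstract are omitted entirely.

```python
import numpy as np
from itertools import combinations


def greedy_pair_selection(X, D, feat_tol=0.2, label_tol=0.2):
    """Greedy, set-cover style feature selection over object pairs.

    X: (n, m) array of feature values scaled to [0, 1].
    D: (n, q) array of label distributions (one row per object).
    feat_tol, label_tol: hypothetical thresholds, not taken from the paper.
    """
    n, m = X.shape
    # Universe of pairs that need discerning: objects whose label
    # distributions differ noticeably in at least one label.
    remaining = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if np.abs(D[i] - D[j]).max() > label_tol
    }
    selected = []
    while remaining:
        best_f, best_cov = None, set()
        for f in range(m):
            if f in selected:
                continue
            # Pairs in the current universe that feature f tells apart.
            cov = {(i, j) for (i, j) in remaining
                   if abs(X[i, f] - X[j, f]) > feat_tol}
            if len(cov) > len(best_cov):
                best_f, best_cov = f, cov
        if best_f is None:
            break  # no unused feature discerns any remaining pair
        selected.append(best_f)
        remaining -= best_cov  # the universe of pairs shrinks each round
    return selected, remaining


# Toy data (assumed): the label distribution depends only on feature 0.
X_demo = np.array([[0.0, 0.0, 0.5],
                   [1.0, 0.0, 0.5],
                   [0.0, 1.0, 0.5],
                   [1.0, 1.0, 0.5]])
D_demo = np.array([[0.9, 0.1],
                   [0.1, 0.9],
                   [0.9, 0.1],
                   [0.1, 0.9]])
selected, remaining = greedy_pair_selection(X_demo, D_demo)
print(selected, remaining)
```

Because every selected feature removes the pairs it already discerns, later rounds scan a smaller pair set, which is the efficiency argument the abstract alludes to. Under these assumptions the loop stops either when no pairs remain or when no unused feature can discern any of the leftover pairs.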

Data availability

The datasets used during the current study are available in the Mulan Library (http://mulan.sourceforge.net) and the MLL Repository (http://www.uco.es/kdis/mllresources).

Acknowledgements

This work is supported by the National Natural Science Foundation of China (62376093, 61976089), the Major Program of the National Social Science Foundation of China (20&ZD047), the Natural Science Foundation of Hunan Province (2021JJ30451), and the Hunan Provincial Science & Technology Project Foundation (2018RS3065, 2018TP1018).

Author information

Authors and Affiliations

Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, 410081, China
Jianhua Dai, Zhiyang Wang & Weiyi Huang
College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
Jianhua Dai, Zhiyang Wang & Weiyi Huang

Authors

Jianhua Dai
Zhiyang Wang
Weiyi Huang

Corresponding author

Correspondence to Jianhua Dai.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dai, J., Wang, Z. & Huang, W. Novel multi-label feature selection via label enhancement and relative maximal discernibility pairs. Int. J. Mach. Learn. & Cyber. 15, 3237–3253 (2024). https://doi.org/10.1007/s13042-023-02090-3

Received: 12 February 2023
Accepted: 22 December 2023
Published: 08 March 2024
Issue Date: August 2024
DOI: https://doi.org/10.1007/s13042-023-02090-3
