Abstract
Multi-label feature selection is an effective remedy for the curse of dimensionality in multi-label data. However, few studies on multi-label feature selection consider label enhancement. Moreover, most existing label enhancement methods neglect the relative importance of labels, which can degrade classification performance. To address these issues, we propose a novel multi-label feature selection algorithm based on label enhancement and relative maximal discernibility pairs. First, we define a label importance weight based on relative discernibility pairs and introduce the concept of soft relevance between objects and labels via fuzzy rough sets. Second, we propose a novel label enhancement algorithm that combines the soft relevance with the label importance weight. Third, we define a relative maximal discernibility pair model for evaluating features in label distribution information systems. Building on this model and on label enhancement, we then present a multi-label feature selection algorithm that progressively shrinks the universe of object pairs during the selection process. Finally, to validate the effectiveness and stability of our algorithm, we conduct extensive comparison experiments against 7 representative multi-label feature selection algorithms on 13 datasets. The experimental results show that our algorithm performs better than the 7 compared algorithms on 5 evaluation metrics.
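The idea of shrinking the universe of object pairs during selection can be illustrated with a deliberately simplified sketch. It is not the algorithm proposed in the article: it assumes crisp categorical features and binary label vectors, and it omits label enhancement, fuzzy rough sets, and the relative maximal discernibility pair model. The function name `greedy_pair_selection` and the toy data are hypothetical.

```python
from itertools import combinations


def greedy_pair_selection(X, Y):
    """Greedy feature selection over discernibility pairs (crisp, simplified).

    X: list of objects, each a list of categorical feature values.
    Y: list of binary label vectors, one per object.
    Returns the selected feature indices and any pairs left undiscerned.
    """
    n, m = len(X), len(X[0])
    # Universe of pairs: objects whose label vectors differ must be told apart.
    remaining = {(i, j) for i, j in combinations(range(n), 2) if Y[i] != Y[j]}
    candidates = set(range(m))
    selected = []
    while remaining and candidates:
        # For each candidate feature, the remaining pairs it can discern.
        covered = {
            f: {(i, j) for (i, j) in remaining if X[i][f] != X[j][f]}
            for f in candidates
        }
        # Pick the feature discerning the most pairs; ties go to the lowest index.
        best = max(sorted(candidates), key=lambda f: len(covered[f]))
        if not covered[best]:
            break  # no candidate discerns any remaining pair
        selected.append(best)
        candidates.remove(best)
        # Shrink the universe: discerned pairs never need to be revisited.
        remaining -= covered[best]
    return selected, remaining


# Toy data (hypothetical): 4 objects, 3 categorical features, 2 labels.
X = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
Y = [[1, 0], [1, 0], [0, 1], [1, 1]]
selected, remaining = greedy_pair_selection(X, Y)
print(selected, remaining)
```

Because each selected feature removes the pairs it discerns, later iterations scan a smaller set, which is the efficiency argument behind pair-based selection in general.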
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13042-023-02090-3/MediaObjects/13042_2023_2090_Figa_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13042-023-02090-3/MediaObjects/13042_2023_2090_Figb_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13042-023-02090-3/MediaObjects/13042_2023_2090_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13042-023-02090-3/MediaObjects/13042_2023_2090_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13042-023-02090-3/MediaObjects/13042_2023_2090_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13042-023-02090-3/MediaObjects/13042_2023_2090_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13042-023-02090-3/MediaObjects/13042_2023_2090_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13042-023-02090-3/MediaObjects/13042_2023_2090_Fig6_HTML.png)
Data availability
The datasets used during the current study are available in the Mulan Library (http://mulan.sourceforge.net) and the MLL Repository (http://www.uco.es/kdis/mllresources).
Acknowledgements
This work is supported by the National Natural Science Foundation of China (62376093, 61976089), the Major Program of the National Social Science Foundation of China (20&ZD047), the Natural Science Foundation of Hunan Province (2021JJ30451), and the Hunan Provincial Science & Technology Project Foundation (2018RS3065, 2018TP1018).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Dai, J., Wang, Z. & Huang, W. Novel multi-label feature selection via label enhancement and relative maximal discernibility pairs. Int. J. Mach. Learn. & Cyber. 15, 3237–3253 (2024). https://doi.org/10.1007/s13042-023-02090-3
DOI: https://doi.org/10.1007/s13042-023-02090-3