Abstract
Multi-label Short Text Classification (MSTC) is a challenging subtask of Multi-Label Text Classification (MLTC) that tags a short text with the most relevant subset of labels from a given label set. Recent studies have addressed the MSTC task with MLTC methods and with fine-tuning approaches based on Pre-trained Language Models (PLMs), but their performance remains low for three reasons: 1) they fail to address the data sparsity of short texts; 2) they do not adapt to the long-tail distribution of labels in multi-label scenarios; and 3) PLMs have an implicit weakness in encoding length, which limits the ability of the prompt learning paradigm. In this paper, we therefore propose KSSVPT, a Knowledge and Separating Soft Verbalizer based Prompt Tuning method for MSTC, to address these challenges. First, to mitigate the sparsity of short texts, we propose a novel approach that enhances their semantic information by integrating external knowledge into the soft prompt template. Second, we construct a new soft prompt verbalizer for MSTC, called the separating soft prompt verbalizer, to adapt to the long-tail distribution issue, which is aggravated by multiple labels. Third, we propose a label cluster grouping mechanism for building the prompt template, which directly alleviates the limited encoding length and captures label correlation. Extensive experiments on six benchmark datasets demonstrate that our model outperforms all competing MLTC and MSTC models on the MSTC task.
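The label cluster grouping idea can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy co-occurrence heuristic, the function names `group_labels` and `build_prompts`, the `max_group_size` parameter, and the `label: [MASK]` template layout are all assumptions made here for illustration. The sketch only shows why splitting the label set into correlated groups keeps each prompt short while still placing related labels in the same template.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(label_sets):
    """Count how often each unordered label pair is assigned to the same text."""
    counts = Counter()
    for labels in label_sets:
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


def group_labels(all_labels, label_sets, max_group_size):
    """Hypothetical greedy grouping (an assumption, not the paper's method).

    Each group is seeded with the alphabetically first unassigned label and
    then extended with the unassigned label that co-occurs most often with
    the labels already in the group, until max_group_size is reached.
    """
    counts = cooccurrence(label_sets)

    def pair(a, b):
        # Pairs are stored with the smaller label first.
        return counts[(a, b) if a < b else (b, a)]

    remaining = sorted(all_labels)
    groups = []
    while remaining:
        group = [remaining.pop(0)]
        while remaining and len(group) < max_group_size:
            best = max(remaining, key=lambda c: sum(pair(c, g) for g in group))
            remaining.remove(best)
            group.append(best)
        groups.append(group)
    return groups


def build_prompts(text, groups, mask_token="[MASK]"):
    """Build one prompt per label group, with one mask slot per label.

    The 'label: [MASK]' layout is an illustrative template only.
    """
    prompts = []
    for group in groups:
        slots = " ".join(f"{label}: {mask_token}" for label in group)
        prompts.append(f"{text} {slots}")
    return prompts


if __name__ == "__main__":
    train_label_sets = [
        ["sports", "football"],
        ["sports", "football"],
        ["politics", "election"],
        ["politics", "sports"],
    ]
    labels = {"sports", "football", "politics", "election"}
    for prompt in build_prompts(
        "Messi scores twice", group_labels(labels, train_label_sets, 2)
    ):
        print(prompt)
```

With a group size of 2, each prompt carries only two mask slots instead of one slot for every label, so the number of tokens spent on the template stays bounded as the label set grows.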
Availability of data and materials
The data that support the findings of this study are available from the corresponding author, Peipei Li, upon reasonable request after acceptance.
Code availability
The code for this paper is available from the corresponding author, Peipei Li, upon reasonable request after acceptance.
Acknowledgements
This work is supported in part by the Natural Science Foundation of China under grants (62376085, 62076085, 6212016008), and Research Funds of Center for Big Data and Population Health of IHM under grant JKS2023003.
Funding
This work is supported in part by the Natural Science Foundation of China under grants (62376085, 61976077, 62076085), and Research Funds of Center for Big Data and Population Health of IHM under grant JKS2023003.
Author information
Authors and Affiliations
Contributions
Zhanwang Chen: Conceptualization, Methodology, Software, Editing. Peipei Li: Writing and Reviewing. Xuegang Hu: Supervision.
Corresponding author
Ethics declarations
Competing interests
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
Ethics approval and informed consent for data used
This study was performed in line with the principles of the Declaration of the Committee on Publication Ethics. The data used in this study were obtained with the necessary permissions for their use.
Consent to participate
All participants provided written informed consent.
Consent for publication
Informed consent for publication of this paper was obtained from all authors.
About this article
Cite this article
Chen, Z., Li, P. & Hu, X. Knowledge and separating soft verbalizer based prompt-tuning for multi-label short text classification. Appl Intell (2024). https://doi.org/10.1007/s10489-024-05599-4
DOI: https://doi.org/10.1007/s10489-024-05599-4