
Knowledge and separating soft verbalizer based prompt-tuning for multi-label short text classification

Published in: Applied Intelligence

Abstract

Multi-label Short Text Classification (MSTC) is a challenging subtask of Multi-Label Text Classification (MLTC) that tags a short text with the most relevant subset of labels from a given label set. Recent studies have attempted to address the MSTC task with MLTC methods and Pre-trained Language Model (PLM)-based fine-tuning approaches, but they suffer from low performance for three reasons: 1) failure to address the data sparsity of short texts, 2) lack of adaptation to the long-tail distribution of labels in multi-label scenarios, and 3) the limited encoding length of PLMs, which constrains the prompt learning paradigm. Therefore, in this paper, we propose KSSVPT, a Knowledge and Separating Soft Verbalizer based Prompt Tuning method for MSTC, to address the above challenges. Firstly, to mitigate the sparsity issue in short texts, we propose a novel approach that enhances the semantic information of short texts by integrating external knowledge into the soft prompt template. Secondly, we construct a new soft prompt verbalizer for MSTC, called the separating soft prompt verbalizer, to adapt to the long-tail distribution issue aggravated by multiple labels. Thirdly, we propose a label cluster grouping mechanism for building the prompt template, which directly alleviates the limited encoding length and captures label correlations. Extensive experiments conducted on six benchmark datasets demonstrate the superiority of our model over all competing MLTC and MSTC models in tackling the MSTC task.
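The "separating" soft verbalizer described above scores each label independently rather than mapping the mask token to a single shared vocabulary distribution. The following minimal sketch illustrates that idea only; it is not the authors' implementation, and all names, dimensions, and the random inputs are hypothetical. It assumes each label owns a learnable embedding that is matched against the PLM's [MASK] hidden state, with an independent sigmoid per label so that a multi-label subset can be predicted by thresholding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: PLM hidden dimension and number of labels.
HIDDEN, NUM_LABELS = 16, 5

# One learnable vector per label ("separating" the labels from each
# other), initialized randomly here for illustration.
label_embeddings = rng.normal(size=(NUM_LABELS, HIDDEN))

def soft_verbalizer_scores(mask_hidden: np.ndarray) -> np.ndarray:
    """Score every label against the [MASK] position's hidden state."""
    logits = label_embeddings @ mask_hidden       # shape: (NUM_LABELS,)
    return 1.0 / (1.0 + np.exp(-logits))          # independent sigmoids

# Stand-in for the [MASK] representation produced by encoding a
# knowledge-enhanced prompt template with the PLM.
mask_hidden = rng.normal(size=HIDDEN)
probs = soft_verbalizer_scores(mask_hidden)

# Multi-label decision: keep every label whose score exceeds 0.5.
predicted = [i for i, p in enumerate(probs) if p > 0.5]
print(probs.shape, predicted)
```

Because each label is scored by its own vector with its own sigmoid, rare (long-tail) labels are not forced to compete in a softmax against frequent ones, which is the intuition the abstract points to.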




Availability of data and materials

The data that support the findings of this study are available from the corresponding author, Peipei Li, upon reasonable request after acceptance.

Code availability

The code of this paper is available from the corresponding author, Peipei Li, upon reasonable request after acceptance.



Acknowledgements

This work is supported in part by the Natural Science Foundation of China under grants (62376085, 62076085, 6212016008), and Research Funds of Center for Big Data and Population Health of IHM under grant JKS2023003.

Funding

This work is supported in part by the Natural Science Foundation of China under grants (62376085, 61976077, 62076085), and Research Funds of Center for Big Data and Population Health of IHM under grant JKS2023003.

Author information

Authors and Affiliations

Authors

Contributions

Zhanwang Chen: Conceptualization, Methodology, Software, Editing. Peipei Li: Writing and Reviewing. Xuegang Hu: Supervision.

Corresponding author

Correspondence to Peipei Li.

Ethics declarations

Competing interests

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Ethics approval and informed consent for data used

This study was performed in line with the principles of the Committee on Publication Ethics. The data utilized in this study was obtained with proper permission for its use.

Consent to participate

All participants provided written informed consent.

Consent for publication

Informed consent for publication of this paper was obtained from all authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Chen, Z., Li, P. & Hu, X. Knowledge and separating soft verbalizer based prompt-tuning for multi-label short text classification. Appl Intell (2024). https://doi.org/10.1007/s10489-024-05599-4
