Fitting and sharing multi-task learning

Abstract

Multi-Task Learning is an effective approach for learning cross-task knowledge. However, existing methods do not treat each task fairly: their shared components tend to keep fitting new tasks, which degrades the performance on previously learned tasks. In this paper, we propose the Fitting-Sharing Multi-Task Learning method to address this problem. In the Fitting step, a group of indicator parameters is trained to extract task-specific features and store them in an in-task template matrix. After all models converge, the indicators and templates are frozen to protect the learned knowledge. In the Sharing step, a group of connector parameters is trained to acquire information from the other tasks' templates and to reason over cross-task knowledge. Because the learning and sharing processes are separate, each model can acquire knowledge learned by other tasks without affecting them, and the problem of imbalanced cross-task knowledge is naturally avoided. Experimental results on public datasets show that the proposed method consistently improves performance compared with existing methods.
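
To make the two-step procedure concrete, the sketch below shows one way the Fitting and Sharing steps could be organised in PyTorch. It is a minimal illustration assuming per-task models with an attention-style read over template matrices; all class, method, and parameter names (TaskModel, fit_forward, freeze_task_knowledge, share_forward, template_slots, and so on) are hypothetical and are not taken from the authors' implementation.

```python
# Minimal sketch of the Fitting/Sharing idea described in the abstract.
# All names below are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn

class TaskModel(nn.Module):
    def __init__(self, hidden_dim: int, template_slots: int, num_classes: int):
        super().__init__()
        # Fitting step: indicator parameters extract task-specific features.
        self.indicator = nn.Linear(hidden_dim, hidden_dim)
        # In-task template matrix that stores the learned task knowledge.
        self.template = nn.Parameter(torch.randn(template_slots, hidden_dim))
        # Sharing step: connector parameters read from other tasks' templates.
        self.connector = nn.Linear(hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def fit_forward(self, x: torch.Tensor) -> torch.Tensor:
        """Fitting step: use only the in-task indicator and template."""
        h = torch.tanh(self.indicator(x))
        # Attend over this task's own template slots.
        attn = torch.softmax(h @ self.template.t(), dim=-1)
        h = h + attn @ self.template
        return self.classifier(h)

    def freeze_task_knowledge(self) -> None:
        """After convergence, freeze indicator and template to protect them."""
        self.indicator.requires_grad_(False)
        self.template.requires_grad_(False)

    def share_forward(self, x: torch.Tensor,
                      other_templates: list[torch.Tensor]) -> torch.Tensor:
        """Sharing step: only the connector (and classifier) are trained."""
        h = torch.tanh(self.indicator(x))
        q = self.connector(h)
        for tpl in other_templates:       # frozen templates of other tasks
            attn = torch.softmax(q @ tpl.t(), dim=-1)
            h = h + attn @ tpl.detach()   # read other templates without modifying them
        return self.classifier(h)
```

In this reading of the abstract, each task first optimises fit_forward on its own data; freeze_task_knowledge is then called, so the Sharing step updates only the connector and classifier, and gradients from new tasks can never alter the frozen indicators and templates.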

Notes

  1. http://archive.ics.uci.edu/ml/datasets/Youtube+cookery+channels+viewers+comments+in+Hinglish

  2. https://www.kaggle.com/kabirnagpal/flipkart-customer-review-and-rating

  3. https://www.kaggle.com/surajkum1198/twitterdata

  4. https://www.kaggle.com/lishaoshao/tweet-sentiment-extraction-wpf?select=test.csv

  5. https://www.kaggle.com/ishantjuyal/emotions-in-text

  6. https://www.kaggle.com/shoumikgoswami/annotated-gmb-corpus

Author information

Corresponding author

Correspondence to Chengkai Piao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Piao, C., Wei, J. Fitting and sharing multi-task learning. Appl Intell 54, 6918–6929 (2024). https://doi.org/10.1007/s10489-024-05549-0
