Soft Prompt Transfer for Zero-Shot and Few-Shot Learning in EHR Understanding

  • Conference paper

Advanced Data Mining and Applications (ADMA 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14178)

Abstract

Electronic Health Records (EHRs) are a rich source of information that can be leveraged for various medical applications, such as disease inference, treatment recommendation, and outcome analysis. However, the complexity and heterogeneity of EHR data, along with the limited availability of well-labeled samples, present significant challenges to developing efficient and adaptable models for EHR tasks such as rare or novel disease prediction. In this paper, we propose Soft prompt transfer for Electronic Health Records (SptEHR), a novel pipeline designed to address these challenges. SptEHR consists of three main stages: (1) self-supervised pre-training on raw EHR data to obtain an EHR-centric, transformer-based foundation model, (2) supervised multi-task continual learning on existing well-labeled tasks to further refine the foundation model and learn transferable task-specific soft prompts, and (3) soft prompt transfer to improve zero-shot and few-shot performance on new tasks. The foundation model learned in stage one captures domain-specific knowledge; the multi-task continual training in stage two improves model adaptability and performance on EHR tasks; and stage three transfers soft prompts based on the similarity between the new task and existing tasks, so that new tasks can be addressed effectively without additional or extensive training. The effectiveness of SptEHR has been validated on the benchmark MIMIC-III dataset.
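
To make stage (3) concrete, the sketch below illustrates the general idea of similarity-based soft prompt transfer with a frozen backbone: soft prompts learned on existing well-labeled tasks are compared with a rough prompt for the new task, and the most similar one is reused as the initialization. The task names, prompt dimensions, and the use of cosine similarity over averaged prompt vectors are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

# Hypothetical library of soft prompts learned on existing, well-labeled EHR
# tasks in stage (2). Each prompt is a [prompt_len, hidden_dim] tensor that is
# prepended to the frozen foundation model's input embeddings; the task names,
# prompt length, and hidden size here are illustrative only.
prompt_len, hidden_dim = 20, 768
source_prompts = {
    "mortality_prediction": torch.randn(prompt_len, hidden_dim),
    "readmission_prediction": torch.randn(prompt_len, hidden_dim),
    "length_of_stay": torch.randn(prompt_len, hidden_dim),
}


def prompt_signature(prompt: torch.Tensor) -> torch.Tensor:
    """Collapse a soft prompt into a single vector used to compare tasks
    (averaging over the prompt length is one simple choice)."""
    return prompt.mean(dim=0)


def transfer_prompt(target_prompt: torch.Tensor, library: dict):
    """Pick the source task whose prompt is most similar to a rough prompt for
    the new task, and return a copy of that prompt as the initialization."""
    target_sig = prompt_signature(target_prompt)
    best_task, best_sim = None, float("-inf")
    for task, prompt in library.items():
        sim = F.cosine_similarity(target_sig, prompt_signature(prompt), dim=0).item()
        if sim > best_sim:
            best_task, best_sim = task, sim
    return best_task, library[best_task].clone()


# A rough prompt for the new task, e.g. estimated from a handful of labeled
# examples in the few-shot setting (random here, purely for illustration).
new_task_prompt = torch.randn(prompt_len, hidden_dim)
source_task, init_prompt = transfer_prompt(new_task_prompt, source_prompts)
print(f"Initializing the new task's soft prompt from: {source_task}")
# init_prompt is then prepended to the frozen model's inputs; in the zero-shot
# case it is used as-is, and in the few-shot case it is lightly tuned.
```

The design point is that only the small prompt tensor is transferred and (optionally) tuned, while the foundation model stays frozen, which is what keeps the zero-shot and few-shot settings cheap.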

Author information

Corresponding authors

Correspondence to Yang Wang or Xueping Peng.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, Y. et al. (2023). Soft Prompt Transfer for Zero-Shot and Few-Shot Learning in EHR Understanding. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol 14178. Springer, Cham. https://doi.org/10.1007/978-3-031-46671-7_2

  • DOI: https://doi.org/10.1007/978-3-031-46671-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46670-0

  • Online ISBN: 978-3-031-46671-7

  • eBook Packages: Computer Science, Computer Science (R0)
