PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13694)


Abstract

With the increasing abundance of pretrained models in recent years, the problem of selecting the best pretrained checkpoint for a particular downstream classification task has been gaining increased attention. Although several methods have recently been proposed to tackle the selection problem (e.g. LEEP, H-score), these methods resort to applying heuristics that are not well motivated by learning theory. In this paper we present PACTran, a theoretically grounded family of metrics for pretrained model selection and transferability measurement. We first show how to derive PACTran metrics from the optimal PAC-Bayesian bound under the transfer learning setting. We then empirically evaluate three metric instantiations of PACTran on a number of vision tasks (VTAB) as well as a language-and-vision (OKVQA) task. An analysis of the results shows PACTran is a more consistent and effective transferability measure compared to existing selection methods.
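The exact PACTran estimators are defined in the paper itself; as a rough illustration of the general PAC-Bayes recipe the abstract alludes to (empirical risk of a classifier head fit on frozen pretrained features, plus a complexity penalty that plays the role of the KL term), here is a minimal sketch. Lower scores suggest better transferability. The function name, the L2-as-KL surrogate, and all hyperparameters are illustrative assumptions, not the paper's actual metric.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pac_bayes_transfer_score(feats, labels, num_classes, lam=1.0, lr=0.1, steps=200):
    """Illustrative PAC-Bayes-style transferability score (NOT the paper's
    PACTran estimator): empirical cross-entropy of a linear head trained on
    the frozen features, plus an L2 penalty standing in for the KL term."""
    n, d = feats.shape
    W = np.zeros((d, num_classes))
    Y = np.eye(num_classes)[labels]          # one-hot targets
    for _ in range(steps):                   # plain gradient descent on the head
        P = softmax(feats @ W)
        grad = feats.T @ (P - Y) / n + lam * W / n
        W -= lr * grad
    P = softmax(feats @ W)
    nll = -np.mean(np.log(P[np.arange(n), labels] + 1e-12))
    reg = 0.5 * lam * np.sum(W ** 2) / n     # complexity penalty, scaled by 1/n
    return nll + reg
```

In a checkpoint-selection loop, one would extract features from each candidate pretrained model on the downstream training set, compute this score per checkpoint, and rank checkpoints by it; informative features should yield a visibly lower score than uninformative ones.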


Notes

  1. https://www.tensorflow.org/hub.
  2. https://huggingface.co/.


Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Nan Ding.

Editor information

Editors and Affiliations

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 2346 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ding, N., Chen, X., Levinboim, T., Changpinyo, S., Soricut, R. (2022). PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13694. Springer, Cham. https://doi.org/10.1007/978-3-031-19830-4_15

  • DOI: https://doi.org/10.1007/978-3-031-19830-4_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19829-8

  • Online ISBN: 978-3-031-19830-4

  • eBook Packages: Computer Science, Computer Science (R0)
