Learning Curve Extrapolation Methods Across Extrapolation Settings

  • Conference paper
  • First Online:
Advances in Intelligent Data Analysis XXII (IDA 2024)

Abstract

Learning curves are important for decision-making in supervised machine learning. They show how the performance of a machine learning model develops as a function of a given resource. In this work, we consider learning curves that describe the performance of a machine learning model as a function of the number of data points used for training. It is often useful to extrapolate learning curves; this can be done by fitting a parametric model to the observed values, or by using an extrapolation model trained on learning curves from similar datasets. We perform an extensive analysis comparing these two approaches under different amounts of observed curve data and different prediction objectives. Depending on the setting, different extrapolation methods perform best. When only a few initial segments of the learning curve have been observed, we find that it is better to rely on learning curves from similar datasets. Once more observations have been made, a parametric model, or simply the last observation, should be used. Moreover, a parametric model is mostly useful when the exact value of the final performance itself is of interest.
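To make the parametric route concrete, the sketch below fits the four-parameter MMF model (mmf4, see the notes below) to a handful of observed anchor points and extrapolates to a larger training-set size. This is a minimal illustration under our own assumptions, not the authors' code: the anchors, scores, and initial guess `p0` are invented for the example.

```python
# Minimal sketch of parametric learning-curve extrapolation,
# assuming the four-parameter MMF (Morgan-Mercer-Flodin) form.
import numpy as np
from scipy.optimize import curve_fit

def mmf4(x, a, b, c, d):
    # a: performance at x = 0, c: asymptotic performance.
    return (a * b + c * x**d) / (b + x**d)

# Hypothetical observations: accuracy at small training-set anchors.
anchors = np.array([16.0, 32.0, 64.0, 128.0, 256.0])
scores = np.array([0.61, 0.66, 0.71, 0.74, 0.76])

params, _ = curve_fit(mmf4, anchors, scores,
                      p0=[0.5, 1.0, 0.9, 1.0], maxfev=10000)

# Extrapolated performance at a much larger anchor; the last
# observation (scores[-1]) is the trivial baseline to beat.
print(mmf4(4096.0, *params))
```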


Notes

  1. The authors use the name mmf4 for the version of MMF that is used in our work.

  2. The authors use the name A_MDS for the version of MDS used in our work; the sketch following these notes illustrates the general idea.

  3. See: https://github.com/ADA-research/LearningCurveExtrapolationSettings.
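For the second approach, the sketch below illustrates extrapolation from learning curves of similar datasets, in the spirit of A_MDS: retrieve the stored curves whose observed prefix is closest to ours, align them at the last observed anchor, and average their continuations. The retrieval metric, the shift-based adaptation, and all names here are simplifying assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of curve-based extrapolation from similar datasets.
import numpy as np

def extrapolate_from_neighbours(observed, curve_db, k=3):
    """observed: scores at the first len(observed) anchors.
    curve_db: full curves from other datasets, shape (n_curves, n_anchors)."""
    t = len(observed)
    # Distance between our observed prefix and each stored prefix.
    dists = np.linalg.norm(curve_db[:, :t] - observed, axis=1)
    nearest = np.argsort(dists)[:k]
    # Shift each neighbour so it passes through our last observation,
    # then average the shifted continuations.
    shifts = observed[-1] - curve_db[nearest, t - 1]
    return (curve_db[nearest, t:] + shifts[:, None]).mean(axis=0)

# Hypothetical curve database: 50 monotone curves over 8 anchors.
rng = np.random.default_rng(0)
db = np.sort(rng.uniform(0.5, 0.9, size=(50, 8)), axis=1)
print(extrapolate_from_neighbours(np.array([0.55, 0.62, 0.66]), db))
```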


Author information

Corresponding author

Correspondence to Jan N. van Rijn.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kielhöfer, L., Mohr, F., van Rijn, J.N. (2024). Learning Curve Extrapolation Methods Across Extrapolation Settings. In: Miliou, I., Piatkowski, N., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XXII. IDA 2024. Lecture Notes in Computer Science, vol 14642. Springer, Cham. https://doi.org/10.1007/978-3-031-58553-1_12


  • DOI: https://doi.org/10.1007/978-3-031-58553-1_12

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-58555-5

  • Online ISBN: 978-3-031-58553-1

  • eBook Packages: Computer Science, Computer Science (R0)
