Abstract
Learning curves are important for decision-making in supervised machine learning. They show how the performance of a machine learning model develops as a given resource increases. In this work, we consider learning curves that describe the performance of a model as a function of the number of data points used for training. It is often useful to extrapolate learning curves, either by fitting a parametric model to the observed values or by using an extrapolation model trained on learning curves from similar datasets. We perform an extensive analysis comparing these two approaches across different observation settings and prediction objectives. Depending on the setting, different extrapolation methods perform best. When only a small initial segment of the learning curve has been observed, we find that it is better to rely on learning curves from similar datasets. Once more observations have been made, a parametric model, or simply the last observation, should be used. Moreover, a parametric model is mostly useful when the exact value of the final performance itself is of interest.
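To illustrate the parametric approach mentioned above, the following sketch fits a four-parameter Morgan-Mercer-Flodin (MMF) curve to a few observed anchor points and extrapolates it to a larger training-set size. The functional form shown is one common MMF parameterisation; the anchor sizes and accuracy values are hypothetical, and this is not the paper's exact experimental setup.

```python
import numpy as np
from scipy.optimize import curve_fit

def mmf4(x, a, b, c, d):
    # One common 4-parameter Morgan-Mercer-Flodin form:
    # starts near a, saturates towards c as x grows.
    return (a * b + c * x**d) / (b + x**d)

# Hypothetical observed anchors: training-set sizes and accuracies.
sizes = np.array([16, 32, 64, 128, 256, 512], dtype=float)
acc = np.array([0.62, 0.68, 0.74, 0.78, 0.81, 0.83])

# Fit the parametric model to the observed segment of the curve.
params, _ = curve_fit(mmf4, sizes, acc, p0=[0.5, 1.0, 0.9, 1.0], maxfev=10000)

# Extrapolate to a much larger training-set size.
pred = mmf4(4096.0, *params)
```

Which parametric family to fit (MMF, power law, exponential, ...) is itself a design choice; as the abstract notes, such a fit pays off mainly once enough of the curve has been observed and the target is the final performance value.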
Notes
- 1. The authors use the name mmf4 for the version of MMF that is used in our work.
- 2. The authors use the name A_MDS for the version of MDS used in our work.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kielhöfer, L., Mohr, F., van Rijn, J.N. (2024). Learning Curve Extrapolation Methods Across Extrapolation Settings. In: Miliou, I., Piatkowski, N., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XXII. IDA 2024. Lecture Notes in Computer Science, vol 14642. Springer, Cham. https://doi.org/10.1007/978-3-031-58553-1_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58555-5
Online ISBN: 978-3-031-58553-1