Abstract
Learning curves are important for decision-making in supervised machine learning. They show how the performance of a machine learning model develops as a given resource increases. In this work, we consider learning curves that describe the performance of a model as a function of the number of data points used for training. It is often useful to extrapolate learning curves, either by fitting a parametric model to the observed values or by using an extrapolation model trained on learning curves from similar datasets. We perform an extensive analysis comparing these two approaches across different observation settings and prediction objectives. Depending on the setting, different extrapolation methods perform best. When only a small initial segment of the learning curve has been observed, we find that it is better to rely on learning curves from similar datasets. Once more observations have been made, a parametric model, or simply the last observation, should be used. Moreover, a parametric model is mostly useful when the exact value of the final performance itself is of interest.
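To illustrate the parametric approach mentioned above, the following sketch fits a four-parameter Morgan-Mercer-Flodin (MMF) curve to a few observed anchor points and extrapolates it to a larger training-set size. The functional form shown is one common MMF parameterisation; the anchor sizes and accuracy values are hypothetical, and this is not the paper's exact experimental setup.

```python
import numpy as np
from scipy.optimize import curve_fit

def mmf4(x, a, b, c, d):
    # One common 4-parameter Morgan-Mercer-Flodin form:
    # starts near a, saturates towards c as x grows.
    return (a * b + c * x**d) / (b + x**d)

# Hypothetical observed anchors: training-set sizes and accuracies.
sizes = np.array([16, 32, 64, 128, 256, 512], dtype=float)
acc = np.array([0.62, 0.68, 0.74, 0.78, 0.81, 0.83])

# Fit the parametric model to the observed segment of the curve.
params, _ = curve_fit(mmf4, sizes, acc, p0=[0.5, 1.0, 0.9, 1.0], maxfev=10000)

# Extrapolate to a much larger training-set size.
pred = mmf4(4096.0, *params)
```

Which parametric family to fit (MMF, power law, exponential, ...) is itself a design choice; as the abstract notes, such a fit pays off mainly once enough of the curve has been observed and the target is the final performance value.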
Notes
- 1. The authors use the name mmf4 for the version of MMF that is used in our work.
- 2. The authors use the name A_MDS for the version of MDS used in our work.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kielhöfer, L., Mohr, F., van Rijn, J.N. (2024). Learning Curve Extrapolation Methods Across Extrapolation Settings. In: Miliou, I., Piatkowski, N., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XXII. IDA 2024. Lecture Notes in Computer Science, vol 14642. Springer, Cham. https://doi.org/10.1007/978-3-031-58553-1_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58555-5
Online ISBN: 978-3-031-58553-1