A Study on the Noise Label Influence in Boosting Algorithms: AdaBoost, GBM and XGBoost

  • Conference paper
  • First Online:
Hybrid Artificial Intelligent Systems (HAIS 2017)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 10334))

Included in the following conference series:

  • 3758 Accesses

Abstract

In classification, class noise alludes to incorrect labelling of instances and it causes the classifiers to perform worse. In this contribution, we test the resistance against noise of the most influential boosting algorithms. We explain the fundamentals of these state-of-the-art algorithms, providing an unified notation to facilitate their comparison. We analyse how they carry out the classification, what loss functions use and what techniques employ under the boosting scheme.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
GBP 19.95
Price includes VAT (United Kingdom)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
GBP 35.99
Price includes VAT (United Kingdom)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
GBP 44.99
Price includes VAT (United Kingdom)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free ship** worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    http://keel.es/datasets.php.

References

  1. Alfaro, E., Gámez, M., García, N.: Adabag: an R package for classification with boosting and bagging. J. Stat. Softw. 54(2), 1–35 (2013). https://www.jstatsoft.org/article/view/v054i02

    Article  Google Scholar 

  2. Álvarez, P.M., Luengo, J., Herrera, F.: A first study on the use of boosting for class noise reparation. In: Martínez-Álvarez, F., Troncoso, A., Quintián, H., Corchado, E. (eds.) HAIS 2016. LNCS, vol. 9648, pp. 549–559. Springer, Cham (2016). doi:10.1007/978-3-319-32034-2_46

    Chapter  Google Scholar 

  3. Cao, J., Kwong, S., Wang, R.: A noise-detection based AdaBoost algorithm for mislabeled data. Pattern Recogn. 45(12), 4451–4465 (2012)

    Article  MATH  Google Scholar 

  4. Chen, T., Gestrin, C.: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM (2016)

    Google Scholar 

  5. Dietterich, T.G.: An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Mach. Learn. 40(2), 139–157 (2000)

    Article  Google Scholar 

  6. Frénay, B., Verleysen, M.: Classification in the presence of noise: a survey. IEEE Trans. Neural Netw. Learn. Syst. 25(5), 845–869 (2014)

    Article  Google Scholar 

  7. Freund, Y., Schapire, R.E.: Foundations and algorithms. MIT press, Cambridge (2012)

    MATH  Google Scholar 

  8. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 337–374 (2002)

    MathSciNet  Google Scholar 

  9. Friedman, J.H.: Stochastic gradient boosting. Comput. Stat. Data Anal. 38, 367–378 (2002)

    Article  MATH  MathSciNet  Google Scholar 

  10. García, S., Luengo, J., Herrera, F.: Data Preprocessing in Data Mining. Springer, New York (2015)

    Book  Google Scholar 

  11. Karmaker, A., Kwek, S.: A boosting approach to remove class label noise. Int. J. Hybrid Intell. Syst. 3(3), 169–177 (2006)

    Article  MATH  Google Scholar 

  12. McDonald, R.A., Hand, D.J., Eckley, I.A.: An empirical comparison of three boosting algorithms on real data sets with artificial class noise. In: Windeatt, T., Roli, F. (eds.) MCS 2003. LNCS, vol. 2709, pp. 35–44. Springer, Heidelberg (2003). doi:10.1007/3-540-44938-8_4

    Chapter  Google Scholar 

  13. Miao, Q., Cao, Y., **a, G., Gong, M., Liu, J., Song, J.: RBoost: label noise-robust boosting algorithm based on a nonconvex loss function and the numerically stable base learners. IEEE Trans. Neural Netw. Learn. Syst. 27(11), 2216–2228 (2015)

    Article  MathSciNet  Google Scholar 

  14. Rätsch, G., Onoda, T., Mller, K.R.: Soft margins for AdaBoost. Mach. Learn. 42(3), 287–320 (2001)

    Article  MATH  Google Scholar 

  15. Ridgeway, G.: Generalized Boosted Models: A guide to the gbm package. Update 1(1), 1–15 (2007)

    Google Scholar 

  16. Sáez, J.A., Luengo, J., Herrera, F.: Evaluating the classifier behaviour with noisy data considering performance and robustness: the equalized loss of accuracy measure. Neurocomputing 176, 26–35 (2016)

    Article  Google Scholar 

  17. Sun, B., Chen, S., Wang, J., Chen, H.: A robust multi-class AdaBoost algorithm for mislabeled noisy data. Knowl. Based Syst. 102, 87–102 (2016)

    Article  Google Scholar 

Download references

Acknowledgments

This work was supported by the National Research Project TIN2014-57251-P and Andalusian Research Plan P11-TIC-7765.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Anabel Gómez-Ríos .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Gómez-Ríos, A., Luengo, J., Herrera, F. (2017). A Study on the Noise Label Influence in Boosting Algorithms: AdaBoost, GBM and XGBoost. In: Martínez de Pisón, F., Urraca, R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2017. Lecture Notes in Computer Science(), vol 10334. Springer, Cham. https://doi.org/10.1007/978-3-319-59650-1_23

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-59650-1_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59649-5

  • Online ISBN: 978-3-319-59650-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Navigation