A Study on the Noise Label Influence in Boosting Algorithms: AdaBoost, GBM and XGBoost

Gómez-Ríos, Anabel; Luengo, Julián; Herrera, Francisco

doi:10.1007/978-3-319-59650-1_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10334)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

In classification, class noise refers to the incorrect labelling of instances, and it causes classifiers to perform worse. In this contribution, we test the noise resistance of the most influential boosting algorithms. We explain the fundamentals of these state-of-the-art algorithms, providing a unified notation to facilitate their comparison. We analyse how they carry out classification, which loss functions they use and which techniques they employ under the boosting scheme.
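To make the notion of class noise concrete, the following is a minimal sketch of how a fraction of labels could be corrupted before training a classifier. It is only an illustration under stated assumptions and not the noise scheme used in the paper: the function name `inject_class_noise`, the `noise_rate` and `seed` parameters, and the uniform choice among the other classes are all hypothetical.

```python
import random


def inject_class_noise(labels, noise_rate, seed=0):
    """Return a copy of `labels` in which a fraction `noise_rate` is mislabelled.

    Assumption (not taken from the paper): each selected instance receives a
    label drawn uniformly from the *other* classes, so it is always incorrect.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    classes = sorted(set(labels))
    if len(classes) < 2:
        raise ValueError("need at least two classes to flip labels")

    rng = random.Random(seed)
    n_noisy = round(noise_rate * len(labels))
    # Choose distinct instances to corrupt.
    noisy_idx = rng.sample(range(len(labels)), n_noisy)

    noisy = list(labels)  # leave the input untouched
    for i in noisy_idx:
        alternatives = [c for c in classes if c != labels[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


# Toy example: 100 instances, three classes, 20% class noise.
clean = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
noisy = inject_class_noise(clean, noise_rate=0.2, seed=42)
n_changed = sum(1 for c, n in zip(clean, noisy) if c != n)
```

A noise-resistance experiment would then train the same classifier on `clean` and on `noisy` labels and compare test performance; that step is omitted here.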

Notes

1. http://keel.es/datasets.php

Acknowledgments

This work was supported by the National Research Project TIN2014-57251-P and Andalusian Research Plan P11-TIC-7765.

Author information

Authors and Affiliations

Department of Computer Science and Artificial Intelligence, University of Granada, CITIC-UGR, 18071, Granada, Spain
Anabel Gómez-Ríos, Julián Luengo & Francisco Herrera

Corresponding author

Correspondence to Anabel Gómez-Ríos.

Editor information

Editors and Affiliations

University of La Rioja, Logroño, La Rioja, Spain
Francisco Javier Martínez de Pisón
University of La Rioja, Logroño, La Rioja, Spain
Rubén Urraca
University of A Coruña, Ferrol, La Coruña, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
Emilio Corchado

About this paper

Cite this paper

Gómez-Ríos, A., Luengo, J., Herrera, F. (2017). A Study on the Noise Label Influence in Boosting Algorithms: AdaBoost, GBM and XGBoost. In: Martínez de Pisón, F., Urraca, R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2017. Lecture Notes in Computer Science, vol 10334. Springer, Cham. https://doi.org/10.1007/978-3-319-59650-1_23

DOI: https://doi.org/10.1007/978-3-319-59650-1_23
Published: 02 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59649-5
Online ISBN: 978-3-319-59650-1
eBook Packages: Computer Science, Computer Science (R0)
