Abstract
In this paper we propose a new algorithm called SPIDER3 for selective preprocessing of multi-class imbalanced data sets. While it borrows selected ideas (i.e., combination of relabeling and local resampling) from its predecessor – SPIDER2, it introduces several important extensions. Unlike SPIDER2, it is able to handle directly multi-class problems. Moreover, it considers the relevance of specific decision classes to control the order of their processing. Finally, it uses information about relations between specific classes (modeled with misclassification costs) to better control the extent of changes introduced locally to preprocessed data. We performed a computational experiment on artificial 3-class data sets to evaluate and compare SPIDER3 to SPIDER2 with temporarily aggregated classes and the results confirmed advantages of the new algorithm.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
- 2.
See the on-line appendix available at http://www.cs.put.poznan.pl/swilk/cores2017/spider3-appendix.pdf.
References
Elkan, C.: The foundations of cost-sensitive learning. In: Proceedings of the 17th International Joint Conference on Artificial Intelligence, IJCAI 2001, vol. 2, pp. 973–978. Morgan Kaufmann Publishers Inc., San Francisco (2001). http://dl.acm.org/citation.cfm?id=1642194.1642224
Fernández, A., López, V., Galar, M., Del Jesus, M.J., Herrera, F.: Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches. Know.-Based Syst. 42, 97–110 (2013). http://dx.doi.org/10.1016/j.knosys.2013.01.018
He, H., Ma, Y.: Imbalanced Learning: Foundations, Algorithms and Applications. Wiley, New York (2013)
Krawczyk, B.: Learning from imbalanced data: open challenges and future directions. Prog. Artifi. Intell. 5(4), 221–232 (2016)
Maciejewski, T., Stefanowski, J.: Local neighbourhood extension of SMOTE for mining imbalanced data. In: Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining, pp. 104–111 (2011)
Napierala, K., Stefanowski, J.: Types of minority class examples and their influence on learning classifiers from imbalanced data. J. Intell. Inform. Syst. 46, 563–597 (2016)
Napierała, K., Stefanowski, J., Wilk, S.: Learning from imbalanced data in presence of noisy and borderline examples. In: Szczuka, M., Kryszkiewicz, M., Ramanna, S., Jensen, R., Hu, Q. (eds.) RSCTC 2010. LNCS (LNAI), vol. 6086, pp. 158–167. Springer, Heidelberg (2010). doi:10.1007/978-3-642-13529-3_18
Sáez, J.A., Krawczyk, B., Woźniak, M.: Analyzing the oversampling of different classes and types of examples in multi-class imbalanced datasets. Pattern Recognit. 57, 164–178 (2015)
Wilk, S., Stefanowski, J., Wojciechowski, S., Farion, K.J., Michalowski, W.: Application of preprocessing methods to imbalanced clinical data: an experimental study. In: Piętka, E., Badura, P., Kawa, J., Wieclawek, W. (eds.) Information Technologies in Medicine. AISC, vol. 471, pp. 503–515. Springer, Cham (2016). doi:10.1007/978-3-319-39796-2_41
Acknowledgments
The authors would like to acknowledge support by the Polish National Science Center under Grant No. DEC-2013/11/B/ST6/00963.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Wojciechowski, S., Wilk, S., Stefanowski, J. (2018). An Algorithm for Selective Preprocessing of Multi-class Imbalanced Data. In: Kurzynski, M., Wozniak, M., Burduk, R. (eds) Proceedings of the 10th International Conference on Computer Recognition Systems CORES 2017. CORES 2017. Advances in Intelligent Systems and Computing, vol 578. Springer, Cham. https://doi.org/10.1007/978-3-319-59162-9_25
Download citation
DOI: https://doi.org/10.1007/978-3-319-59162-9_25
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59161-2
Online ISBN: 978-3-319-59162-9
eBook Packages: EngineeringEngineering (R0)