Relevance-Based Evaluation Metrics for Multi-class Imbalanced Domains

Branco, Paula; Torgo, Luís; Ribeiro, Rita P.

doi:10.1007/978-3-319-57454-7_54

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10234)

Included in the following conference series: Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

The class imbalance problem is a key issue that has received much attention, mostly focused on two-class problems. Fewer solutions exist for the multi-class imbalance problem. From an evaluation point of view, the class imbalance problem is challenging because the classes are assigned non-uniform importance. In this paper, we propose a relevance-based evaluation framework that incorporates user preferences by allowing a differentiated importance value to be assigned to each class. The presented solution overcomes difficulties detected in existing measures and increases discrimination capability. The proposed framework requires a relevance score to be assigned to each problem class. To deal with cases where the user is not able to specify the relevance of each class, we describe three mechanisms to incorporate the existing domain knowledge into the relevance framework. These mechanisms differ in the amount of information available and in the assumptions made regarding the domain. They also allow our framework to be used in common multi-class imbalanced settings with different levels of available information.
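
To make the idea of per-class relevance concrete, the following is a minimal sketch and not the metric definitions from the paper, which are not reproduced on this page. It assumes one plausible reading: per-class recall averaged with user-supplied relevance weights. The function names `relevance_weighted_recall` and `inverse_frequency_relevance` are hypothetical, and the inverse-frequency fallback is an illustrative assumption, not necessarily one of the three mechanisms the abstract mentions.

```python
from collections import Counter


def relevance_weighted_recall(y_true, y_pred, relevance):
    """Average per-class recall, weighted by a relevance score per class.

    Illustrative assumption only: this is not the paper's formula.
    `relevance` maps each class label to a non-negative importance value.
    """
    support = Counter(y_true)  # number of true examples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = sum(relevance[c] for c in support)
    if total == 0:
        raise ValueError("at least one observed class needs non-zero relevance")
    return sum(relevance[c] * hits[c] / support[c] for c in support) / total


def inverse_frequency_relevance(y_true):
    """Hypothetical fallback when the user gives no relevance values:
    relevance proportional to 1 / class frequency, normalized to sum to 1."""
    support = Counter(y_true)
    raw = {c: 1.0 / n for c, n in support.items()}
    norm = sum(raw.values())
    return {c: v / norm for c, v in raw.items()}


if __name__ == "__main__":
    y_true = ["a"] * 8 + ["b"] * 2
    y_pred = ["a"] * 8 + ["a", "b"]  # one of the two rare "b" cases is missed
    print(relevance_weighted_recall(y_true, y_pred, {"a": 0.2, "b": 0.8}))
    print(relevance_weighted_recall(y_true, y_pred, {"a": 1.0, "b": 1.0}))
    print(inverse_frequency_relevance(y_true))
```

In this toy case plain accuracy is 0.9, while the relevance-weighted score is about 0.6 when class "b" carries relevance 0.8, and 0.75 under uniform relevance. Making the missed minority case count for more is the kind of differentiated importance the abstract describes.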

Notes

1. The experimental framework, code, and results of this evaluation are available at https://github.com/paobranco/Relevance-basedMulticlassImbalanceMetrics.

Acknowledgements

This work is financed by the ERDF – European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme within project POCI-01-0145-FEDER-006961, and by National Funds through the FCT – Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) as part of project UID/EEA/50014/2013. The work of P. Branco is supported by a Ph.D. scholarship of FCT (PD/BD/105788/2014).

Author information

Authors and Affiliations

LIAAD - INESC TEC, Porto, Portugal
Paula Branco, Luís Torgo & Rita P. Ribeiro
DCC - Faculdade de Ciências - Universidade do Porto, Porto, Portugal
Paula Branco, Luís Torgo & Rita P. Ribeiro

Corresponding author

Correspondence to Paula Branco.

Editor information

Editors and Affiliations

Kangwon National University, Chuncheon, Korea (Republic of)
Jinho Kim
Seoul National University, Seoul, Korea (Republic of)
Kyuseok Shim
University of Technology Sydney, Sydney, New South Wales, Australia
Longbing Cao
KAIST, Daejeon, Korea (Republic of)
Jae-Gil Lee
University of New South Wales, Sydney, New South Wales, Australia
Xuemin Lin
Kangwon National University, Chuncheon, Korea (Republic of)
Yang-Sae Moon

Cite this paper

Branco, P., Torgo, L., Ribeiro, R.P. (2017). Relevance-Based Evaluation Metrics for Multi-class Imbalanced Domains. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10234. Springer, Cham. https://doi.org/10.1007/978-3-319-57454-7_54

DOI: https://doi.org/10.1007/978-3-319-57454-7_54
Published: 23 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57453-0
Online ISBN: 978-3-319-57454-7
eBook Packages: Computer Science, Computer Science (R0)
