Credit risk evaluation: a comprehensive study

Multimedia Tools and Applications

Abstract

To date, relatively little research in credit risk analysis has compared the well-known statistical, optimization (heuristic) and machine learning based approaches in a single article. This work reviews credit risk assessment using sixteen well-known approaches. According to the findings, machine learning approaches are more accurate than traditional statistical methods in dealing with financial difficulties, especially when nonlinear patterns are involved. Hybrid or ensemble algorithms, in turn, have been found to outperform their standalone counterparts in the vast majority of situations. Finally, the paper compares nine machine learning classifiers on two benchmark datasets. In the course of this study, 46 datasets were encountered: 35 were used only once, and among the other 11, the Australian, German and Japanese datasets are the three most frequently used by researchers. In the experiments, ensemble classifiers outperformed the standalone classifiers on both datasets, which agrees with prior research. Although some of these approaches achieve a high level of accuracy, further study is required to identify the right parameters and procedures for better outcomes in a transparent manner. This study is also a valuable reference for analyzing credit risk in both academic and practical domains, since it collects relevant information on the major machine learning approaches employed so far.

Data availability

All data generated or analysed during this study are included in this article.

Abbreviations

AdaBoost: Adaptive Boosting
ANFIS: Adaptive Neuro-Fuzzy Inference System
ANN: Artificial Neural Network
AUC: Area Under Curve
BPNN: Back-Propagation Neural Network
CART: Classification And Regression Tree
CCR: Candidate Classifier Repository
CGD: Conjugate Gradient Descent
CNN: Convolutional Neural Network
ConsA: Consensus Approach
CRJ: Cycle Reservoir with Regular Jump
CSVM: Clustered Support Vector Machine
DA: Discriminant Networks
DAG: Directed Acyclic Graph
DNN: Deep Neural Network
DP: Discriminate Power
DT: Decision Tree
EAD: Exposure At Default
EmNN: Emotional Neural Network
EMPNGA: Enhanced Multi-Population Niche Genetic Algorithm
FKNN: Fuzzy K-Nearest Neighbour
FNN: Feedforward Neural Network
GA: Genetic Algorithm
GBDT: Gradient Boosting Decision Tree
GD: Gradient Descent
GNG: Gabriel Neighbourhood Graph
GRNN: General Regression Neural Network
GWO: Grey Wolf Optimization
HMM: Hidden Markov Model
IFOA: Improved Fruit Fly Optimization Algorithm
IMF: International Monetary Fund
KDD: Knowledge Discovery in Data
KNN: K-Nearest Neighbour
LDA: Linear Discriminant Analysis
LGD: Loss Given Default
LM: Levenberg–Marquardt
LR: Logistic Regression
MARS: Multivariate Adaptive Regression Splines
MLP: Multilayer Perceptron
MLPNN: Multilayer Perceptron Neural Network
MODE-GL: Multi-Objective Evolutionary Algorithm
MPGA: Multiple Population Genetic Algorithm
MSE: Mean Squared Error
NB: Naïve Bayes
NN: Neural Network
OS: One-step Secant
P2P: Peer To Peer
PD: Probability of Default
PNN: Probabilistic Neural Network
PSO: Particle Swarm Optimization
PTVPSO: Parallel TVPSO
RBF: Radial Basis Function
RF: Random Forest
RFoGAPS: Random Forest optimized by genetic algorithm with profit score
RNN: Recurrent Neural Network
ROC: Receiver Operating Characteristic
RoS: Random Over Sampling
SME: Small- and Medium-sized Enterprises
SMOTE: Synthetic Minority Over-Sampling Technique
SVM: Support Vector Machine
TLP: Traditional Linear Programming
TVPSO: Time Variant Particle Swarm Optimization
UNCTAD: UN Conference on Trade and Development

Acknowledgements

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Arijit Bhattacharya.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bhattacharya, A., Biswas, S.K. & Mandal, A. Credit risk evaluation: a comprehensive study. Multimed Tools Appl 82, 18217–18267 (2023). https://doi.org/10.1007/s11042-022-13952-3

  • DOI: https://doi.org/10.1007/s11042-022-13952-3
