Bank Failure Prediction: A Comparison of Machine Learning Approaches

Financial Risk Management and Modeling

Abstract

This paper presents a comprehensive study of bank failure prediction, examined from several perspectives. It uses a dataset of approximately 60,000 observations covering an extended period (2005–2014) and considers different prediction horizons prior to failure. We also explore whether adding variables related to the diversification of banks’ activities, along with local effects, improves the predictive power of the models. Seven widely used machine learning techniques are compared under different performance metrics using a bootstrap analysis. The results show that mid- to long-term prediction improves significantly with the addition of diversification variables. Local effects exist and further improve the results, while support vector machines, gradient boosting, and random forests outperform traditional models, with the performance differences increasing over longer prediction horizons.
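The abstract does not spell out the bootstrap procedure, so the following is only a minimal sketch of one common way to compare two classifiers with a bootstrap, not the authors' implementation. It assumes the area under the ROC curve (AUC) as the performance metric and a simple percentile interval; the function names, the number of resamples, and the synthetic scores are all hypothetical and are not taken from the chapter.

```python
import random


def auc(labels, scores):
    """AUC as the share of (failed, non-failed) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=200, seed=0):
    """Resample observations with replacement and return the mean AUC difference
    (model A minus model B) with a 95% percentile interval."""
    if not (len(labels) == len(scores_a) == len(scores_b)):
        raise ValueError("inputs must have equal length")
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # skip resamples that contain only one class
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(y, a) - auc(y, b))
    diffs.sort()
    lower = diffs[int(0.025 * (n_boot - 1))]
    upper = diffs[int(0.975 * (n_boot - 1))]
    return sum(diffs) / n_boot, (lower, upper)


# Synthetic illustration only: 10% "failed" banks, an informative model A
# and an uninformative model B. None of these numbers come from the chapter.
demo_rng = random.Random(1)
labels = [1 if i % 10 == 0 else 0 for i in range(200)]
scores_a = [0.6 * y + 0.5 * demo_rng.random() for y in labels]
scores_b = [demo_rng.random() for _ in labels]
mean_diff, (ci_low, ci_high) = bootstrap_auc_difference(labels, scores_a, scores_b)
```

An interval for the AUC difference that lies entirely above zero would suggest that model A ranks failed banks more reliably than model B on this sample. In a comparison of several models the same resampled indices would typically be reused for every model so that the differences are paired.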



Author information

Corresponding author

Correspondence to Michalis Doumpos.

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Manthoulis, G., Doumpos, M., Zopounidis, C., Galariotis, E., Baourakis, G. (2021). Bank Failure Prediction: A Comparison of Machine Learning Approaches. In: Zopounidis, C., Benkraiem, R., Kalaitzoglou, I. (eds) Financial Risk Management and Modeling. Risk, Systems and Decisions. Springer, Cham. https://doi.org/10.1007/978-3-030-66691-0_10
