Incremental Machine Learning-Based Approach for Credit Scoring in the Age of Big Data

Museba, Tinofirei

doi:10.1007/978-3-031-46177-4_29

Part of the book series: Springer Proceedings in Business and Economics (SPBE)

Included in the following conference series:

ICABR Conference

Abstract

Financial institutions quantify the financial credibility of a loan applicant with a credit score. Sources of credit, such as banks and other financial institutions, play a crucial role in sustaining economies and keeping cash flowing in the market. To address the lack of data in credit scoring, financial institutions extract customer information from data sources such as social networks, which store data in large quantities. Traditional data mining techniques fail to distinguish accurately between creditworthy and non-creditworthy applicants when applied to big data. This has necessitated machine learning algorithms capable of sifting through large volumes of credit data sourced from social networks. Machine learning approaches have recently been widely used to automate, streamline and digitise business processes such as credit scoring. However, designing and deploying effective and robust credit scoring models takes a lot of time, and if customer behaviour changes or customer variables drift over time, the credit scoring model becomes obsolete. Because variables tend to drift over time in big data, credit scoring tasks should be treated as an ephemeral scenario. Incremental and adaptive credit scoring models can help to reduce the time lost re-creating credit models in response to drifting variables, big data challenges and changes in customer behaviour. This calls for robust and effective credit scoring models that learn incrementally, adapt and detect changes. This paper proposes the Incremental Adaptive and Heterogeneous Ensemble (IAHE) credit scoring model, which is capable of learning incrementally, adapting to drifting variables, detecting changes in customer behaviour and learning big data in a streaming fashion.
Empirical experiments indicate that, compared with nine credit scoring models on four public datasets, IAHE has the strongest ability to recognise default samples and the best generalisation ability, while maintaining strong interpretability of the results. The superior generalisation performance of IAHE is statistically significant, and the model demonstrated excellent robustness and adaptation to drifting variables.
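
The general idea of learning incrementally and reacting to drift can be illustrated with a small sketch. This is not the IAHE model, whose internals are not described in the abstract. It is a single online logistic regression with a simple windowed error-rate check, and every name in it (OnlineLogisticRegression, run_stream, synthetic_stream), the window size, the tolerance and the synthetic data are assumptions made purely for illustration.

```python
import math
import random
from collections import deque


class OnlineLogisticRegression:
    """Logistic regression updated one sample at a time with SGD (illustrative)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # Gradient step on the log-loss of a single (x, y) pair.
        error = self.predict_proba(x) - y
        self.weights = [w - self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]
        self.bias -= self.learning_rate * error


def run_stream(stream, n_features, window=200, tolerance=0.15):
    """Test-then-train over a stream; reset the model when errors jump.

    Drift is flagged when the error rate over the last `window` samples
    exceeds the lowest windowed error rate seen so far by `tolerance`.
    Both thresholds are arbitrary choices for this sketch.
    """
    model = OnlineLogisticRegression(n_features)
    recent = deque(maxlen=window)
    best_rate = 1.0
    errors, drift_points = [], []
    for t, (x, y) in enumerate(stream):
        predicted = 1 if model.predict_proba(x) >= 0.5 else 0
        mistake = int(predicted != y)   # evaluate before learning
        errors.append(mistake)
        recent.append(mistake)
        model.learn_one(x, y)           # then update incrementally
        if len(recent) == window:
            rate = sum(recent) / window
            best_rate = min(best_rate, rate)
            if rate > best_rate + tolerance:
                drift_points.append(t)
                model = OnlineLogisticRegression(n_features)  # start afresh
                recent.clear()
                best_rate = 1.0
    return errors, drift_points


def synthetic_stream(n_samples=4000, change_at=2000, seed=0):
    """Two-feature toy stream whose labelling rule flips at `change_at`."""
    rng = random.Random(seed)
    for t in range(n_samples):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        label = int(x[0] > x[1])
        if t >= change_at:
            label = 1 - label           # simulated change in behaviour
        yield x, label


if __name__ == "__main__":
    errors, drift_points = run_stream(synthetic_stream(), n_features=2)
    print("drift flagged at samples:", drift_points)
    print("error rate over last 500 samples:", sum(errors[-500:]) / 500)
```

Calling run_stream(synthetic_stream(), n_features=2) returns the per-sample mistakes and the sample indices at which the model was reset. On this toy stream the reset is expected shortly after the labelling rule flips at sample 2000. A heterogeneous ensemble such as the one the paper proposes would combine several different learners rather than the single model used here.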

Author information

Authors and Affiliations

University of Johannesburg, Johannesburg, South Africa
Tinofirei Museba
Department of Applied Information Systems, College of Business and Economics, Johannesburg, South Africa
Tinofirei Museba

Authors

Tinofirei Museba

Corresponding author

Correspondence to Tinofirei Museba.

Editor information

Editors and Affiliations

Johannesburg Business School, University of Johannesburg, Johannesburg, South Africa
Tankiso Moloi
Alcorn State University, Mississippi, MS, USA
Babu George

About this paper

Cite this paper

Museba, T. (2024). Incremental Machine Learning-Based Approach for Credit Scoring in the Age of Big Data. In: Moloi, T., George, B. (eds) Towards Digitally Transforming Accounting and Business Processes. ICAB 2023. Springer Proceedings in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-031-46177-4_29

DOI: https://doi.org/10.1007/978-3-031-46177-4_29
Published: 12 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46176-7
Online ISBN: 978-3-031-46177-4
eBook Packages: Business and Management, Business and Management (R0)
