Combining Feature Selection and Classification Using LASSO-Based MCO Classifier for Credit Risk Evaluation

Li, Xiufang; Zhang, Zhiwang; Li, Lingyun; Pan, Hui

doi:10.1007/s10614-023-10535-8

Published: 08 January 2024

Computational Economics

Xiufang Li¹, Zhiwang Zhang², Lingyun Li³ & Hui Pan³

Abstract

Credit risk evaluation, which requires predicting default probabilities and deriving risk classifications, is a difficult task, and many classification methods have already been applied to it. The multi-criteria optimization classifier (MCOC) is significantly limited in feature reduction and is weakly interpretable. To address these shortcomings, this paper proposes an improved LASSO-based MCOC (LASSO-MCOC) that performs classification and feature selection simultaneously, and constructs the corresponding algorithm. On four real-world credit risk datasets, the LASSO-MCOC with linear and RBF kernels is tested and compared with the SMCOC proposed by Zhang et al. (2019) and with six basic classification methods: logistic regression, multilayer perceptron, support vector machines, Naïve Bayes, k-nearest neighbors, and random forest. The experimental results and statistical comparisons show that the proposed LASSO-MCOC is more effective for credit risk assessment, outperforming the other classifiers in accuracy, efficiency, and interpretability, and that it can be extended to other real-world applications.
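To illustrate the general idea of performing classification and feature selection at the same time, the following is a minimal sketch of an L1-penalized (LASSO-style) logistic regression trained by proximal gradient descent. It is not the LASSO-MCOC of this paper, whose formulation is not given in the abstract; the model, the toy data, and the names `soft_threshold`, `fit_l1_logistic`, `lam`, and `lr` are all assumptions chosen for illustration.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_l1_logistic(X, y, lam, lr=0.5, n_iter=500):
    """Hypothetical helper: logistic loss plus lam * ||w||_1 (bias unpenalized)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        # Residuals p_i - y_i of the current model.
        resid = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(d)]
        grad_b = sum(resid) / n
        # Gradient step on the smooth loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy data (made up): feature 0 determines the label, feature 1 is only weakly related.
X = [[-1, -1], [-1, 1], [-1, -1], [-1, 1], [1, -1], [1, 1], [1, 1], [1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_l1_logistic(X, y, lam=0.25)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("weights:", w, "selected features:", selected)
```

With these settings the penalty drives the weight on the weakly related feature exactly to zero while the informative feature keeps a positive weight, so the fitted classifier also reports which features it uses. That is the sense in which an L1 penalty combines classification with feature selection.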

References

Arora, N., & Kaur, P. D. (2019). A bolasso based consistent feature selection enabled random forest classification algorithm: An application to credit risk assessment. Applied Soft Computing, 86, 105936.
Bhattacharya, A., Biswas, S. K., & Mandal, A. (2022). Credit risk evaluation: A comprehensive study. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-022-13952-3
Bijak, K., & Thomas, L. C. (2012). Does segmentation always improve model performance in credit scoring? Expert Systems with Applications, 39(3), 2433–2442.
Brown, I., & Mues, C. (2012). An experimental comparison of classification algorithms for imbalanced credit scoring data sets. Expert Systems with Applications, 39(3), 3446–3453.
Bussmann, N., Giudici, P., Marinelli, D., & Papenbrock, J. (2021). Explainable machine learning in credit risk management. Computational Economics, 57, 203–216.
Chen, N., Ribeiro, B., & Chen, A. (2016). Financial credit risk assessment: A recent review. Artificial Intelligence Review, 45, 1–23.
Danenas, P., Garsva, G., & Gudas, S. (2011). Credit risk evaluation model development using support vector based classifiers. Procedia Computer Science, 4, 1699–1707.
Dastile, X., & Celik, T. (2021). Making deep learning-based predictions for credit scoring explainable. IEEE Access, 9, 50426–50440.
Fan, Y., Huang, H., & Yang, Z. (2022). Research on personal credit evaluation based on feature engineering and tree enhanced Bayesian Network. Journal of Guilin University of Aerospace Technology, 27(4), 573–579.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874.
Galindo, J., & Tamayo, P. (2000). Credit risk assessment using statistical and machine learning: Basic methodology and risk modeling applications. Computational Economics, 15(1/2), 107–143.
Hand, D. J., & Henley, W. E. (1997). Statistical classification methods in consumer credit scoring: A review. Journal of the Royal Statistical Society: Series A (Statistics in Society), 160(3), 523–541.
Hand, D. J., & Vinciotti, V. (2013). Choosing k for two-class nearest neighbour classifiers with unbalanced classes. Pattern Recognition Letters, 24, 1555–1562.
Hofmann, H. (1994). Statlog (German Credit Data). UCI Machine Learning Repository. https://doi.org/10.24432/C5NC77
Huang, X. B., Liu, X. L., & Ren, Y. Q. (2018). Enterprise credit risk evaluation based on neural network algorithm. Cognitive Systems Research, 52, 317–324.
Huang, Y., Song, Y., & Wang, B. (2023). Improved forest optimization feature selection algorithm for credit evaluation. Computer Science, 50(S1), 531–536.
Islam, M. J., Wu, Q. M. J., Ahmadi, M., & Sid-Ahmed, M. A. (2010). Investigating the performance of Naïve-Bayes classifiers and K-nearest neighbor classifiers. Journal of Convergence Information Technology, 5(2), 133–137.
Kou, G. (2006). Multi-class multi-criteria mathematical programming and its applications in large scale data mining problems. Ph.D. Dissertation, University of Nebraska Omaha.
Lessmann, S., Baesens, B., Seow, H. V., & Thomas, L. C. (2015). Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research. European Journal of Operational Research, 247(1), 124–136.
Leong, C. K. (2016). Credit risk scoring with Bayesian network models. Computational Economics, 47(3), 423–446.
Louzada, F., Ara, A., & Fernandes, G. B. (2016). Classification methods applied to credit scoring: A systematic review and overall comparison. Surveys in Operations Research and Management Science, 21(2), 117–134.
Pavlenko, T., & Chernyak, O. (2010). Credit risk modeling using Bayesian networks. International Journal of Intelligent Systems, 25(4), 326–344.
Peng, Y., Kou, G., Shi, Y., & Chen, Z. (2008). A multi-criteria convex quadratic programming model for credit data analysis. Decision Support Systems, 44, 1016–1030.
Pérez-Martín, A., Pérez-Torregrosa, A., Rabasa, A., & Vaca, M. (2020). Feature selection to optimize credit banking risk evaluation decisions for the example of home equity loans. Mathematics, 8(11), 1971.
Quinlan, J. R. (1987). Statlog (Australian Credit Approval). UCI Machine Learning Repository. https://doi.org/10.24432/C59012
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267–288.
Roy, A. G., & Urolagin, S. (2019). Credit risk assessment using decision tree and support vector machine based data analytic. In M. Mateev & P. Poutziouris (Eds.), Creative business and social innovations for a sustainable future (pp. 79–84). Cham: Springer Nature Switzerland AG.
Shi, Y. (2010). Multiple criteria optimization-based data mining methods and applications: A systematic survey. Knowledge and Information Systems, 24(3), 369–391.
Shi, Y., Peng, Y., Xu, W., & Tang, X. (2002). Data mining via multiple criteria linear programming: Applications in credit card portfolio management. International Journal of Information Technology & Decision Making, 1, 131–151.
Sohn, S. Y., Kim, D. H., & Yoon, J. H. (2016). Technology credit scoring model with fuzzy logistic regression. Applied Soft Computing, 43, 150–158.
Twala, B. (2010). Multiple classifier application to credit risk assessment. Expert Systems with Applications, 37(4), 3326–3336.
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2010). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society, 67, 91–108.
Trivedi, S. K. (2020). A study on credit scoring modeling with different feature selection and machine learning approaches. Technology in Society, 63, 101413.
Varetto, F. (1998). Genetic algorithms applications in the analysis of insolvency risk. Journal of Banking and Finance, 22, 1421–1439.
Wei, L.W. (2008). Research on data mining classification model based on the multiple criteria programming and its application. Ph.D. Dissertation, Institute of Policy and Management, Chinese Academy of Sciences.
West, D. (2000). Neural network credit scoring models. Computers and Operations Research, 27(11/12), 1131–1152.
Witten, I. H., & Frank, E. (2011). Data mining: Practical machine learning tools and techniques. ACM SIGMOD Record, 31(1), 76–77.
Wu, Y., Li, X., Liu, Q., & Tong, G. (2022). The analysis of credit risks in agricultural supply chain finance assessment model based on genetic algorithm and backpropagation neural network. Computational Economics, 60, 1269–1292.
Rao, C., Liu, M., Goh, M., & Wen, J. (2020). 2-stage modified random forest model for credit risk assessment of P2P network lending to “Three Rurals” borrowers. Applied Soft Computing, 95, 106570.
Zhang, D., Zhou, X., Leung, S. C. H., & Zheng, J. (2010). Vertical bagging decision trees model for credit scoring. Expert Systems with Applications, 37(12), 7838–7843.
Zhang, H., Shi, Y., Yang, X., & Zhou, R. (2021). A firefly algorithm modified support vector machine for the credit risk assessment of supply chain finance. Research in International Business and Finance, 58, 101482.
Zhang, L., Hu, H., & Zhang, D. (2015a). A credit risk assessment model based on SVM for small and medium enterprises in supply chain finance. Financial Innovation, 1(1), 1–21.
Zhang, Z., Gao, G., & Shi, Y. (2014). Credit risk evaluation using multicriteria optimization classifier with kernel, fuzzification and penalty factors. European Journal of Operational Research, 237(1), 335–348.
Zhang, Z., Gao, G., & Tian, Y. (2015b). Multi-kernel multi-criteria optimization classifier with fuzzification and penalty factors for predicting biological activity. Knowledge-Based Systems, 89, 301–313.
Zhang, Z., He, J., Gao, G., & Tian, Y. (2019). Sparse multi-criteria optimization classifier for credit risk evaluation. Soft Computing, 23, 3053–3066.
Zhang, Z., He, J., Cao, J., Li, S., Li, X., Zhang, K., & Wang, P. (2022). An explainable multi-sparsity multi-kernel nonconvex optimization least-squares classifier method via ADMM. Neural Computing and Applications, 34, 16103–16128.
Zhang, Z., He, J., Zheng, H., Cao, J., Wang, G., & Shi, Y. (2023). Alternating minimization-based sparse least-squares classifier for accuracy and interpretability improvement of credit risk assessment. International Journal of Information Technology & Decision Making, 20(1), 537–567.
Zhang, Z., Shi, Y., & Gao, G. (2009). A rough set-based multiple criteria linear programming approach for the medical diagnosis and prognosis. Expert Systems with Applications, 36(5), 8932–8937.
Zhao, J., & Li, B. (2022). Credit risk assessment of small and medium-sized enterprises in supply chain finance based on SVM and BP neural network. Neural Computing and Applications, 34(15), 12467–12478.
Zhao, X., Shi, Y., & Niu, L. (2015). Kernel based simple regularized multiple criteria linear program for binary classification and regression. Intelligent Data Analysis, 19(3), 505–527.

Funding

This study has been partially supported by the National Natural Science Foundation of China under Grant 61877061, and in part by the Major Program of Natural Science Foundation of the Higher Education Institutions of Jiangsu Province under Grant 22KJA520003 and Yantai School Land Integration Development Project under Grant 2021PT02.

Author information

Authors and Affiliations

Department of Science and Technology, Ludong University, Yantai, 264025, Shandong, China
Xiufang Li
College of Information Engineering, Nanjing University of Finance and Economics, Nanjing, 210023, China
Zhiwang Zhang
School of Information and Electrical Engineering, Ludong University, Yantai, 264025, Shandong, China
Lingyun Li & Hui Pan

Contributions

All authors contributed to the study scheme and design. Algorithm design and optimization were performed by ZZ. Data collection, experiment and analysis were performed by XL, LL and HP. The first draft of the manuscript was written by XL and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhiwang Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships which could have appeared to influence the work reported in this paper.

Financial interests

All authors have previously participated in the development of bank management projects. Zhiwang Zhang began researching credit risk evaluation earlier.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, X., Zhang, Z., Li, L. et al. Combining Feature Selection and Classification Using LASSO-Based MCO Classifier for Credit Risk Evaluation. Comput Econ (2024). https://doi.org/10.1007/s10614-023-10535-8

Accepted: 04 December 2023
Published: 08 January 2024
DOI: https://doi.org/10.1007/s10614-023-10535-8
