Comparative Analysis of Machine Learning Algorithms for Consumer Credit Risk Assessment

Authors

  • Tianyi Xu

DOI:

https://doi.org/10.62051/r1m3pg16

Keywords:

Financial Technology, Machine Learning, Consumer Credit Risk, Comparative Analysis.

Abstract

Machine learning algorithms are increasingly replacing traditional methods for evaluating consumer credit risk in financial technology. This study uses a dataset of 10,000 credit accounts to compare four widely used machine learning algorithms: Logistic Regression, Decision Tree, Random Forest, and Gradient Boosting Machine (GBM). GBM performs best, with an AUC of 0.87, followed closely by Random Forest with an AUC of 0.85. Logistic Regression and Decision Tree record lower AUCs of 0.78 and 0.72, respectively. GBM and Random Forest also lead in classification accuracy, at 92% and 90%, respectively, compared with 86% for Logistic Regression and 80% for Decision Tree. GBM additionally attains 95% specificity and 90% sensitivity, meaning it identifies most high-risk accounts while producing few false positives among low-risk accounts. The study also examines how each algorithm handles imbalanced datasets, its interpretability, and its computational demands, and it offers quantitative findings to guide future work on credit risk models, particularly on improving transparency and scalability.
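The metrics reported above (AUC, accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch. The labels and scores below are hypothetical toy values, not the study's data, and the function names and the 0.5 decision threshold are assumptions made only for this example. Label 1 is taken to mean a high-risk account.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as the high-risk (positive) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> dict[str, float]:
    """Accuracy, sensitivity, and specificity at an assumed decision threshold.

    Assumes both classes are present, so neither denominator is zero.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of high-risk accounts that are flagged
        "specificity": tn / (tn + fp),  # share of low-risk accounts that are cleared
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 high-risk and 7 low-risk accounts (imbalanced on purpose).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

metrics = classification_metrics(y_true, scores)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

AUC depends only on how the scores rank the accounts, whereas accuracy, sensitivity, and specificity depend on the chosen threshold. On an imbalanced dataset, accuracy alone can look high even when many high-risk accounts are missed, which is one reason to report sensitivity and specificity alongside it.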

Published

20-06-2024

How to Cite

Xu, T. (2024) “Comparative Analysis of Machine Learning Algorithms for Consumer Credit Risk Assessment”, Transactions on Computer Science and Intelligent Systems Research, 4, pp. 60–67. doi:10.62051/r1m3pg16.