Research on Diabetes Risk Prediction Using Multiple Machine Learning Models
DOI:
https://doi.org/10.62051/tz1wn473
Keywords:
Disease Prediction, Machine Learning, Predictive Model, SHAP.
Abstract
Diabetes is a chronic disease characterized by hyperglycemia and caused by insufficient insulin secretion. Traditional diagnostic methods cannot support early prevention, whereas data-driven machine learning methods can analyze medical data to predict risk early. This study applies four machine learning algorithms (Logistic Regression, Support Vector Machine, Decision Tree, and Random Forest) to model the relevant data. A comparison of performance indicators such as accuracy and recall shows that the Random Forest model performs best overall. SHAP waterfall and beeswarm plots are then used to explain the predictions of the Random Forest model, and the explanations are compared with the variable importance ranking produced by the Random Forest itself. This comparison indicates that Serum Creatinine, Blood Urea, Triglycerides, and Hemoglobin are the most significant factors contributing to diabetes, which suggests that these factors should be the focus when predicting the risk of diabetes onset.
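The comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses a synthetic dataset from `make_classification` as a stand-in, since the study's data, preprocessing, and hyperparameters are not given here; the feature count, split ratio, and model settings below are illustrative assumptions, not the authors' configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 8 numeric features, binary label.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The four model families named in the abstract, with illustrative settings.
# Scaling is applied to the two models that are sensitive to feature scale.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Fit each model and record accuracy and recall on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
    }
    print(f"{name}: accuracy={results[name]['accuracy']:.3f}, "
          f"recall={results[name]['recall']:.3f}")

# Impurity-based variable importance from the Random Forest,
# ranked from most to least important feature index.
importances = models["Random Forest"].feature_importances_
ranking = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
print("Feature indices by importance:", ranking)
```

On real data, the importance ranking could be set beside SHAP explanations, for example with `shap.TreeExplainer` and the `shap.plots.waterfall` and `shap.plots.beeswarm` functions from the `shap` package; that step is left out here to keep the sketch to a single dependency.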
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.