Research and Model Development for Gastric Cancer Risk Prediction

Authors

  • Tianyu Li

DOI:

https://doi.org/10.62051/gjgj3p83

Keywords:

Gastric Cancer Risk Prediction; Machine Learning; Random Forest; Logistic Regression; Extreme Gradient Boosting (XGBoost).

Abstract

Gastric cancer is among the most common and deadly cancers worldwide and causes substantial morbidity and mortality. Early prediction of gastric cancer risk is vital for improving patient outcomes and survival. This study integrated clinical and lifestyle data from public databases with simulated data. Preprocessing comprised missing-value imputation, feature engineering (body mass index (BMI) derivation, dietary score calculation, and age grouping), and class balancing with the Synthetic Minority Oversampling Technique (SMOTE). Three machine learning models, Random Forest, Logistic Regression, and Extreme Gradient Boosting (XGBoost), were developed and evaluated on accuracy, precision, recall, F1-score, and Area Under the ROC Curve (AUC). XGBoost achieved the highest accuracy (0.8592), precision (0.8541), recall (0.8592), and F1-score (0.8555), while Logistic Regression achieved the highest AUC (0.8066), indicating that it ranked cases by predicted probability better than the other models even though XGBoost produced the best class predictions. These results suggest that machine learning, and XGBoost in particular, can predict gastric cancer risk with good accuracy, which may support timely medical intervention and tailored treatment strategies.
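The feature engineering and evaluation steps named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the author's implementation: the function names, the age bin edges, and the choice of support-weighted averaging for precision, recall, and F1 are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def bmi(weight_kg, height_cm):
    """Body mass index: weight in kg divided by height in metres, squared."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def age_group(age, edges=(40, 50, 60, 70)):
    """Map an age to a bin label. The bin edges are illustrative assumptions."""
    lower = None
    for edge in edges:
        if age < edge:
            return f"<{edge}" if lower is None else f"{lower}-{edge - 1}"
        lower = edge
    return f">={edges[-1]}"


def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 (assumed averaging)."""
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        prec_c = tp / predicted if predicted else 0.0
        rec_c = tp / count
        denom = prec_c + rec_c
        f1_c = 2 * prec_c * rec_c / denom if denom else 0.0
        weight = count / n  # each class contributes in proportion to its support
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(round(bmi(70, 175), 2), age_group(45))
    print(weighted_scores([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0]))
```

Under support-weighted averaging, recall equals accuracy by construction, because each class's recall is weighted by its share of the samples. The identical accuracy and recall reported for XGBoost (0.8592) are consistent with that choice, although the abstract does not state which averaging was used.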

Published

10-07-2025

How to Cite

Li, T. (2025) “Research and Model Development for Gastric Cancer Risk Prediction”, Transactions on Computer Science and Intelligent Systems Research, 9, pp. 559–563. doi:10.62051/gjgj3p83.