A Machine Learning Pipeline for Cervical Cancer Prediction
DOI:
https://doi.org/10.62051/c048dt92
Keywords:
Cervical cancer; Machine learning; SVM; Decision tree; Random Forest.
Abstract
Cervical cancer is the fourth leading cause of cancer-related deaths among women and often remains asymptomatic in its early stages, which makes early prediction important. This study explores cervical cancer risk variables and develops an ensemble prediction model to forecast cervical cancer more precisely. The data comprise 11 risk factors and a dependent variable composed of Hinselmann, Schiller, Cytology, and Biopsy. Support vector machine (SVM), decision tree, and random forest methods are compared, with models trained on 75% of the data and evaluated using confusion matrices. Model performance is assessed with precision and recall; recall is especially important in critical settings such as identifying cervical cancer patients, where missing a true case is costly. The three methods were compared, and the resulting ranking of risk factors was checked against the ground truth. The random forest method achieved high recall (0.8821) and accuracy (0.8744). Recognizing risk factors can improve cure rates and patient survival, and a cervical cancer prediction system can improve medical care and reduce its cost. Data mining techniques in the medical sector support diagnostic and prognostic applications for the early treatment of life-threatening diseases.
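The evaluation described in the abstract (a 75% training split, then accuracy, precision, and recall read off a confusion matrix) can be illustrated with a minimal, dependency-free sketch. The function names, the random seed, the assumption that the remaining 25% serves as the test set, and the toy labels are all hypothetical and are not taken from the paper; the sketch shows only how the metrics are derived, not the authors' implementation or their reported numbers.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_75_25(n_rows: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle row indices and return (train, test) with 75% in train.

    Assumption: the remaining 25% is held out for evaluation.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(0.75 * n_rows)
    return indices[:cut], indices[cut:]


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count the four confusion-matrix cells for binary labels (1 = positive)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 1 and pred == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def metrics(counts: Dict[str, int]) -> Dict[str, float]:
    """Derive accuracy, precision, and recall from confusion-matrix counts."""
    tp, fp, fn, tn = counts["tp"], counts["fp"], counts["fn"], counts["tn"]
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Precision: of the cases flagged positive, how many were truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of the truly positive cases, how many were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Toy labels for illustration only (not data from the study).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
    print(metrics(confusion_counts(y_true, y_pred)))
```

In this toy example one positive case is missed (a false negative), which lowers recall; in a screening setting that is the error the abstract singles out as most critical.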
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







