A Machine Learning Pipeline for Cervical Cancer Prediction

Authors

  • Yubo Huang

DOI:

https://doi.org/10.62051/c048dt92

Keywords:

Cervical cancer; Machine learning; SVM; Decision tree; Random Forest.

Abstract

Cervical cancer ranks as the fourth leading cause of cancer-related deaths among women and often remains asymptomatic during its initial stages, so early prediction is important. The objective of this study is to explore cervical cancer risk variables and to develop an ensemble prediction model that forecasts cervical cancer more precisely. The cervical cancer data comprise 11 risk factors and a dependent variable composed of four diagnostic results: Hinselmann, Schiller, Cytology, and Biopsy. SVM, decision tree, and random forest methods are compared for prediction, with models trained on 75% of the data and evaluated using confusion matrices. Precision and recall are used to assess model performance; recall is especially important in critical settings such as identifying cervical cancer patients, where a missed case is costly. The three methods are compared with one another, and the resulting ranking of risk factors is compared with the ground truth. The random forest method offers high recall (0.8821) and accuracy (0.8744). Recognizing risk factors can improve cure rates and patient survival, and a cervical cancer prediction system can improve medical care and reduce its cost. Data mining techniques in the medical sector support diagnostic and prognostic applications for the early treatment of life-threatening diseases.
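As an illustration of the evaluation step described in the abstract, the following minimal sketch shows how precision, recall, and accuracy can be derived from a binary confusion matrix. It is not the author's code: the function names `confusion_counts` and `precision_recall_accuracy` are hypothetical, and the toy labels are invented for illustration and are unrelated to the reported results.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # positive case correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1  # negative case wrongly flagged
        elif truth == 1 and pred == 0:
            fn += 1  # positive case missed (what recall penalizes)
        else:
            tn += 1  # negative case correctly cleared
    return tp, fp, fn, tn


def precision_recall_accuracy(
    tp: int, fp: int, fn: int, tn: int
) -> Tuple[float, float, float]:
    """Compute precision, recall, and accuracy; an empty denominator yields 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Toy labels, invented for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, accuracy = precision_recall_accuracy(tp, fp, fn, tn)
print(f"precision={precision:.4f} recall={recall:.4f} accuracy={accuracy:.4f}")
```

Recall is the fraction of true positive cases that the model detects, which is why the abstract treats it as the more important metric when missing a patient is costly.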

Published

24-03-2024

How to Cite

Huang, Y. (2024). A Machine Learning Pipeline for Cervical Cancer Prediction. Transactions on Materials, Biotechnology and Life Sciences, 3, 59-65. https://doi.org/10.62051/c048dt92