Survival Prediction and Comparison of the Titanic based on Machine Learning Classifiers

Authors

  • Tony Wayne Wang

DOI:

https://doi.org/10.62051/8fcnnp84

Keywords:

Machine learning; Random Forest Classifier; Logistic Regression; Decision Tree Classifier.

Abstract

This study compares three machine learning classifiers, Logistic Regression (LR), a Decision Tree Classifier (DT), and a Random Forest Classifier (RF), for predicting the survival outcomes of passengers on the Titanic. The dataset includes variables such as socio-economic status, age, gender, and family relationships; the data are prepared and analyzed to train and evaluate these models. The objective is to determine how passenger features affect survival outcomes and to generate survival predictions with each algorithm. The findings show that the RF model, particularly with 45 or 75 trees, significantly outperforms LR and DT in precision and recall, making it the more robust classifier for this dataset. The research underscores that model choice matters for binary classification tasks and that parameter tuning plays a role in improving model performance. This comparative analysis contributes to the ongoing exploration of the Titanic disaster through data science and highlights key considerations in applying machine learning algorithms to predictive modeling.
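The abstract ranks the classifiers by precision and recall. As a minimal sketch of how these two metrics are computed for a binary survival label, the following uses made-up labels for illustration only; the function name `precision_recall` and the sample lists are assumptions, not taken from the paper, and the values do not reproduce any reported result.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: predicted negative but actually positive.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels (1 = survived, 0 = did not survive), not real data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the share of predicted survivors who actually survived, and recall is the share of actual survivors the model identified; comparing both guards against a model that scores well on one by sacrificing the other.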

Published

12-08-2024

How to Cite

Wang, T.W. (2024) “Survival Prediction and Comparison of the Titanic based on Machine Learning Classifiers”, Transactions on Computer Science and Intelligent Systems Research, 5, pp. 443–450. doi:10.62051/8fcnnp84.