Survival prediction and analysis of Titanic based on logistic regression and KNN
DOI:
https://doi.org/10.62051/6xfdek19
Keywords:
Survival of Titanic passengers; Logistic regression model; Machine learning.
Abstract
This paper uses machine learning methods to identify factors related to the survival of Titanic passengers and to analyze the model parameters. The study first hypothesizes which factors influenced survival and examines these hypotheses through data visualization. The data are then cleaned: features with excessive missing values are discarded, the small number of remaining missing values are imputed, and the machine learning models are constructed. Finally, the accuracy of the two models is compared to select the more efficient and accurate one. The results indicate that the logistic regression model is more effective than the K-nearest neighbor (KNN) model and that passenger age, economic status, and cabin class are closely related to passenger survival. By applying machine learning methods to survival prediction, this paper provides a reference for related experimental research. Future work will explore deep learning methods.
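The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic (the generator, its coefficients, and all function names are assumptions made for illustration, not values from the Titanic dataset), and both classifiers are written in plain Python so that the example runs without external libraries. The sketch only shows the general shape of the workflow: standardize features, fit a logistic regression and a KNN classifier, and compare held-out accuracy.

```python
import math
import random


def make_synthetic_passengers(n, seed=0):
    """Hypothetical (age, fare, class) rows with a survival label.
    Synthetic data for illustration only, NOT the Titanic dataset."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        pclass = rng.choice([1, 2, 3])
        age = rng.uniform(1.0, 70.0)
        fare = rng.uniform(5.0, 30.0) * (4 - pclass)
        # Assumed toy rule: younger, higher-class passengers survive more often.
        score = 1.5 - 0.03 * age - 0.8 * (pclass - 2)
        p_survive = 1.0 / (1.0 + math.exp(-score))
        labels.append(1 if rng.random() < p_survive else 0)
        rows.append([age, fare, float(pclass)])
    return rows, labels


def fit_scaler(X):
    """Per-feature mean and standard deviation, computed on training data."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(zip(*X), means)]
    return means, stds


def scale(X, scaler):
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Logistic regression trained by batch gradient descent on log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_logistic(model, x):
    w, b = model
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def predict_knn(X_train, y_train, x, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(predictions, labels):
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


# Hold out the last 100 synthetic rows for evaluation.
X, y = make_synthetic_passengers(400)
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]
scaler = fit_scaler(X_train)
X_train_s, X_test_s = scale(X_train, scaler), scale(X_test, scaler)

model = fit_logistic(X_train_s, y_train)
pred_lr = [predict_logistic(model, x) for x in X_test_s]
pred_knn = [predict_knn(X_train_s, y_train, x) for x in X_test_s]
acc_lr = accuracy(pred_lr, y_test)
acc_knn = accuracy(pred_knn, y_test)
print(f"logistic regression accuracy: {acc_lr:.3f}")
print(f"KNN (k=5) accuracy: {acc_knn:.3f}")
```

Features are standardized before fitting because KNN relies on distances, so a feature with a large numeric range (such as fare) would otherwise dominate the vote. The accuracies printed here reflect only the assumed toy rule and say nothing about the paper's reported results.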
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.