Predicting and Analysing Road Accident Severity with Machine Learning Models and Resampling
DOI: https://doi.org/10.62051/p0xb2k56
Keywords: Road accident; machine learning; resampling.
Abstract
Road accidents seriously threaten people's lives and property, so predicting and analysing them is of great significance. This research first compares the performance of four machine learning models in analysing the severity of traffic accidents: Random Forest, Naïve Bayes, Logistic Regression and Multi-Layer Perceptron. Naïve Bayes performs poorly, with low accuracy, while the other three models perform at a similar level. Multi-Layer Perceptron performs better on minority classes than Random Forest and Logistic Regression. Among all features, Random Forest focuses on geographical and time features, whereas Logistic Regression and Multi-Layer Perceptron focus on driving features, including lighting and road surface. A resampling method is then applied to the imbalanced data. After being trained on the resampled data, Random Forest and Logistic Regression perform better on minority classes, with higher precision and recall. In summary, this research compares the performance of machine learning models in predicting and analysing road accident severity, and addresses the challenge of imbalanced data with a resampling method.
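The abstract does not say which resampling method was used, so the following is only a minimal sketch of one simple option, random oversampling, together with the per-class precision and recall it is meant to improve. The function names, the severity labels ("minor", "serious", "fatal") and the toy data are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (assumed method)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)}.
    A metric is reported as 0.0 when its denominator is zero."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        actual = sum(1 for t in y_true if t == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical imbalanced toy data: 8 minor, 3 serious, 1 fatal record.
rows = [[i] for i in range(12)]
labels = ["minor"] * 8 + ["serious"] * 3 + ["fatal"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)

# A classifier that always predicts the majority class scores well on
# "minor" but has zero precision and recall on the minority classes.
always_minor = ["minor"] * len(labels)
baseline_scores = per_class_precision_recall(labels, always_minor)
```

In practice the oversampling step would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data; the per-class scores would then be computed on the untouched test split.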
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.