Predicting and Analysing Road Accident Severity with Machine Learning Models and Resampling

Authors

  • Xin Wang

DOI:

https://doi.org/10.62051/p0xb2k56

Keywords:

Road accident; machine learning; resampling.

Abstract

Road accidents seriously threaten people's lives and property, so predicting and analysing them is of great significance. This research first compares the performance of four machine learning models in analysing the severity of traffic accidents: Random Forest, Naïve Bayes, Logistic Regression and Multi-Layer Perceptron. Naïve Bayes performs poorly, with low accuracy, while the other three models perform at a similar level. Multi-Layer Perceptron performs better on minority classes than Random Forest and Logistic Regression. Among all features, Random Forest focuses on geographical and time features, whereas Logistic Regression and Multi-Layer Perceptron focus on driving features, including lighting and road surface. A resampling method is then applied to the imbalanced data. After being trained on the resampled data, Random Forest and Logistic Regression perform better on minority classes, with higher precision and recall. In summary, this research compares the performance of machine learning models in predicting and analysing road accident severity, and addresses the challenge of imbalanced data with a resampling method.
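The resampling step and the per-class precision and recall mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which resampling method was used, so random oversampling is an assumption here. The function names, the severity labels ("minor", "serious", "fatal") and the toy data are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def per_class_precision_recall(y_true, y_pred):
    """Return {class: (precision, recall)}.
    A ratio with a zero denominator is reported as 0.0."""
    result = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        result[cls] = (precision, recall)
    return result


# Hypothetical imbalanced toy data: 7 minor, 2 serious, 1 fatal.
rows = [[i] for i in range(10)]
labels = ["minor"] * 7 + ["serious"] * 2 + ["fatal"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)

# Hypothetical predictions, used only to show the metric computation.
y_true = ["minor", "minor", "serious", "fatal"]
y_pred = ["minor", "serious", "serious", "minor"]
scores = per_class_precision_recall(y_true, y_pred)
```

In practice, resampling is normally applied to the training split only, so that the evaluation data keeps the original class distribution. Reporting precision and recall per class, instead of overall accuracy alone, is what makes changes on the minority classes visible.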

Published

12-08-2024

How to Cite

Wang, X. (2024) “Predicting and Analysing Road Accident Severity with Machine Learning Models and Resampling”, Transactions on Computer Science and Intelligent Systems Research, 5, pp. 928–934. doi:10.62051/p0xb2k56.