A Review of the Safety Survey of Intelligent Driving

Hao Wang

doi:10.62051/f94s5m17

Keywords:

Intelligent driving; object detection; YOLO; Transformer; R-CNN.

Abstract

With the rapid advancement of artificial intelligence and sensing technologies, intelligent driving systems have moved from being exclusive to high-end models to widespread adoption across diverse vehicle segments. This survey reviews three major families of object detection methods (traditional detectors, YOLO-based detectors, and Transformer-based detectors) and evaluates their respective strengths and limitations in autonomous driving contexts. We summarize representative improvements to R-CNN and Faster R-CNN that enhance small-object detection, occlusion handling, and anchor optimization, as well as YOLO variants that deliver real-time inference (up to 200 FPS) without sacrificing accuracy. Transformer-based detectors such as DETR, Deformable DETR, and Swin Transformer are shown to simplify pipelines, capture global context, and improve robustness against sparse or overlapping targets. Beyond algorithms, this paper identifies critical challenges in perception latency, model generalization, data privacy, legal compliance, and hardware constraints. Finally, this paper outlines future directions: lightweight end-to-end architectures via neural architecture search, self-supervised and federated learning for privacy-preserving adaptation, and heterogeneous edge-cloud hardware co-design. These directions aim to accelerate the safe, efficient, and scalable deployment of intelligent driving technologies.
