Comparative Study of Vision-Based and LiDAR-Based Object Detection Algorithms for Autonomous Driving
DOI:
https://doi.org/10.62051/aqwws043
Keywords:
Autonomous Driving; Object Detection; Convolutional Neural Network (CNN); Light Detection and Ranging (LiDAR).
Abstract
Autonomous vehicles rely on object recognition for motion perception and path planning. Since deep learning was integrated into object detection, numerous practical and innovative studies have emerged in this field. Considering both hardware cost and recognition accuracy, object detection algorithms can be divided into vision-based and Light Detection and Ranging (LiDAR)-based approaches. This paper reviews the classic literature and cutting-edge techniques for both, detailing how vision-based approaches apply computer vision in the You Only Look Once (YOLO) series and the Faster Region-based Convolutional Neural Network (Faster R-CNN). It also explains LiDAR-based approaches, focusing on classic models that fuse point clouds with images, including early fusion, deep fusion, and late fusion architectures. The advantages and limitations of the two approaches are compared; experiments on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset show that, among vision-based methods, single-stage detection algorithms perform well, while among LiDAR-based methods the Cascade Bi-directional Fusion algorithm stands out. The paper concludes with promising research directions, noting that continuing hardware advances and steady improvements in object detection algorithms are bringing fully autonomous driving closer.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.