Progress of Object Detection Based on Deep Learning
DOI:
https://doi.org/10.62051/tjd37347
Keywords:
Deep Learning; 2D Object Detection; Computer Vision
Abstract
Object detection is widely used in fundamental computer vision research and in application fields such as medicine, autonomous driving, and sensing and monitoring. Advances in high-performance computing have significantly increased hardware speed, while machine learning algorithms, especially those based on deep learning, have greatly improved detection speed and accuracy. These advances have also fueled challenges and competition for better modules and industrialization. In practical applications, however, engineers often have little understanding of the principles, effects, and performance of these algorithms, which seriously hinders the industrial application of object detection. Accordingly, this study presents object detection algorithms and summarizes the field from three aspects: datasets, algorithms, and performance, to provide a reference for researchers in related fields. In terms of algorithms, the paper mainly introduces anchor-free, single-stage, and two-stage approaches. Finally, existing problems and future research directions for object detection are discussed.
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.