Understanding Convolutional Neural Networks for Image Classification

Jinghan Xu

doi:10.62051/rwf1e012

Keywords:

Image classification, convolutional neural network, computer vision.

Abstract

As artificial intelligence spreads worldwide, AI-based image processing has changed substantially compared with earlier generations of mechanical algorithms. Compared with both classical machine learning classifiers and traditional image classification methods that do not involve machine learning, approaches based on Convolutional Neural Networks (CNNs) offer notable advantages: higher accuracy, stronger feature extraction, and broader cross-domain applicability. These advantages have opened viable paths for deploying CNNs across many fields. No algorithm is perfect, however. As research advances, new challenges continue to be identified and addressed, which steadily refines the CNN framework. Today, CNNs and deep CNNs (DCNNs) are widely integrated into popular AI applications. This paper provides a comprehensive review of classical CNN-based image classification models, examines the strengths and limitations of the various architectures, and explores their potential future applications and directions of development.
