Trends in Image Object Detection and Segmentation with Deep Neural Network on Natural and Earth Observations

Claudia Hoeser; Gongyi Zhang

doi:10.62051/ijcsit.v4n3.05

Keywords:

Deep learning, Neural networks, Convolutional neural networks, Image segmentation and detection

Abstract

Deep learning has had a profound impact across multiple scientific domains and is increasingly used as a versatile tool for new challenges in natural (Earth) observation. However, researchers in Earth observation often face significant entry barriers, primarily because the field is complex, evolves rapidly, and is heavily driven by innovations in computer vision. To help researchers overcome these barriers, this paper reviews the development of deep learning, with a particular focus on image segmentation and object detection using convolutional neural networks. The review begins with the key developments through which convolutional neural networks set new benchmarks in image recognition and continues through to recent innovations. It traces the interconnections among the most influential convolutional architectures and major milestones originating from computer vision, clarifying how these advances shape modern deep learning models. Additionally, we offer insights into the evolution of popular deep learning frameworks and summarize datasets commonly used in natural observation. By exploring well-performing architectures and evaluating their applications on Earth observation datasets, we assess how breakthroughs in computer vision influence future research in natural observation. The paper thus bridges the gap between theoretical advances in computer vision and their practical implementation in natural observation, equipping researchers with the knowledge needed to integrate deep learning effectively into their work.
