Advances in Weakly Supervised Object Detection: Leveraging Unlabeled Data for Enhanced Performance
DOI:
https://doi.org/10.62051/w7vsy682
Keywords:
Weakly supervised object detection model algorithm.
Abstract
Weakly supervised object detection is a growing area of computer vision, driven by interest in models that can identify and classify objects with minimal labeled data. This paper classifies contemporary, state-of-the-art deep learning models for weakly supervised object detection into four principal categories: Multi-Instance Learning (MIL), Class Activation Mapping (CAM), deep weakly supervised learning with attention mechanisms, and weakly supervised object detection with pseudo-labels. Each category takes a different approach to localizing objects under limited supervision and emphasizes a different aspect of learning from sparse or imprecise annotations. Our analysis examines the methods and theoretical foundations underlying these models and discusses their practical applications and performance metrics. We also trace how these techniques have evolved, highlighting their advances and their role in automated object detection in diverse and complex environments. This synthesis charts the current landscape of weakly supervised object detection and points toward future research directions in this rapidly evolving field.
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.