Song, Shuo. “Research on Cross-Modal Interaction Techniques Between Natural Language Processing and Computer Vision.” International Journal of Computer Science and Information Technology 7, no. 2 (September 27, 2025): 31–36. Accessed June 18, 2026. https://wepub.org/index.php/IJCSIT/article/view/5770.