MSCA++-UNet: Image Segmentation Model with Coordinate and Channel Attention Mechanisms
DOI: https://doi.org/10.62051/ijcsit.v8n4.06
Keywords: Retinal vessel segmentation, MSCA++-UNet, Multi-scale coordinate attention
Abstract
Retinal vessel segmentation in fundus images is critical for diagnosing ophthalmic and systemic diseases, yet it is challenged by intricate vessel structures, low image contrast, and the loss of fine detail during feature extraction. Existing U-Net-based deep learning methods suffer from insufficient use of channel features, incomplete single-dimension attention enhancement, and poor continuity on fine vessels, while lightweight models trade accuracy for efficiency, limiting clinical application. This paper proposes MSCA++-UNet, a novel segmentation model that integrates a Multi-Scale Coordinate and Channel Attention Plus Plus (MSCA++) dual module into the U-Net backbone. The MSCA++ module computes multi-scale coordinate attention and adaptive channel attention in parallel and fuses them early, synergistically capturing spatial positional dependencies and dynamic channel correlations; combined with the lightweight AttentionDoubleConv block and an optimized encoder-decoder fusion, it enhances edge preservation and feature discrimination with high computational efficiency. Extensive experiments on the DRIVE dataset (comparative, ablation, and quantitative/qualitative analyses) show that MSCA++-UNet achieves a Dice coefficient of 0.787, an mIoU of 79.7%, and a loss of 0.3624, outperforming the baseline and single-attention variants on key metrics. It improves the detection accuracy and continuity of fine vessels, suppresses background noise, and balances performance (2.92 M parameters, 7.48 G FLOPs) with the efficiency needed for edge-device deployment. This work verifies the effectiveness of the dual attention fusion strategy, and MSCA++-UNet provides a high-precision, clinically feasible solution for fundus image analysis, supporting computer-aided diagnosis and large-scale retinal screening.
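The abstract's core idea, two attention branches computed in parallel over the same feature map and fused early, can be illustrated with a minimal NumPy sketch. This is a conceptual toy based only on the description above: the function name `msca_like_attention` is hypothetical, all weighting is parameter-free pooling plus sigmoid gating, and the paper's learned multi-scale convolutions and AttentionDoubleConv block are omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def msca_like_attention(x):
    """Conceptual sketch of parallel coordinate + channel attention fusion.

    x: feature map of shape (C, H, W). Hypothetical simplification of the
    MSCA++ idea; the actual module uses learned multi-scale convolutions.
    """
    # Coordinate-attention branch: pool along each spatial axis separately,
    # so positional information along H and W is preserved.
    attn_h = sigmoid(x.mean(axis=2, keepdims=True))       # shape (C, H, 1)
    attn_w = sigmoid(x.mean(axis=1, keepdims=True))       # shape (C, 1, W)

    # Channel-attention branch: global average pooling per channel.
    attn_c = sigmoid(x.mean(axis=(1, 2), keepdims=True))  # shape (C, 1, 1)

    # Early fusion: both branches reweight the same input in parallel,
    # rather than being applied sequentially as in CBAM-style designs.
    return x * attn_h * attn_w * attn_c
```

Because every gate lies in (0, 1), the output is an elementwise damping of the input that is strongest where all three statistics agree the response is weak, which is the intuition behind suppressing background noise while keeping vessel positions.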
License
Copyright (c) 2026 International Journal of Computer Science and Information Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.