SA-UNetv2: Rethinking Spatial Attention U-Net for Retinal Vessel Segmentation

📅 2025-09-15
🤖 AI Summary
To address insufficient multi-scale feature fusion and severe foreground-background class imbalance in retinal vessel segmentation, this paper proposes SA-UNetv2, a lightweight U-Net variant. It applies cross-scale spatial attention to all skip connections for adaptive, weighted fusion of multi-level encoder and decoder features. To mitigate class imbalance explicitly, it trains with a combined loss: weighted binary cross-entropy (WBCE) plus a Matthews Correlation Coefficient (MCC)-based term. The model achieves state-of-the-art performance on the DRIVE and STARE benchmarks with only 0.26M parameters and a 1.2MB memory footprint, and it processes a single 592×592×3 image in one second on CPU, which makes it well suited to edge deployment.

📝 Abstract
Retinal vessel segmentation is essential for early diagnosis of diseases such as diabetic retinopathy, hypertension, and neurodegenerative disorders. Although SA-UNet introduces spatial attention in the bottleneck, it underuses attention in skip connections and does not address the severe foreground-background imbalance. We propose SA-UNetv2, a lightweight model that injects cross-scale spatial attention into all skip connections to strengthen multi-scale feature fusion and adopts a weighted Binary Cross-Entropy (BCE) plus Matthews Correlation Coefficient (MCC) loss to improve robustness to class imbalance. On the public DRIVE and STARE datasets, SA-UNetv2 achieves state-of-the-art performance with only 1.2MB of memory and 0.26M parameters (less than 50% of SA-UNet's) and 1 second of CPU inference per 592 × 592 × 3 image, demonstrating strong efficiency and deployability in resource-constrained, CPU-only settings.
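
The abstract names the two loss components but not how they are computed or combined, so the following is only a minimal, dependency-free sketch of one common way to build a weighted BCE plus MCC loss. The positive-class weight `pos_weight`, the mixing factor `mcc_weight`, the use of soft (probability-weighted) confusion counts, and the `1 - MCC` form are all assumptions made for illustration, not values or choices taken from the paper.

```python
import math


def weighted_bce(probs, targets, pos_weight, eps=1e-7):
    """Mean binary cross-entropy; the vessel (positive) term is scaled by pos_weight."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def soft_mcc_loss(probs, targets, eps=1e-7):
    """1 - MCC, with confusion-matrix counts accumulated from probabilities."""
    tp = sum(p * y for p, y in zip(probs, targets))
    fp = sum(p * (1 - y) for p, y in zip(probs, targets))
    fn = sum((1 - p) * y for p, y in zip(probs, targets))
    tn = sum((1 - p) * (1 - y) for p, y in zip(probs, targets))
    # eps keeps the ratio finite when one class is absent from the batch
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) + eps
    return 1.0 - (tp * tn - fp * fn) / denom


def combined_loss(probs, targets, pos_weight=5.0, mcc_weight=1.0):
    """Hypothetical combination: WBCE plus a weighted MCC term (weights are assumed)."""
    return weighted_bce(probs, targets, pos_weight) + mcc_weight * soft_mcc_loss(probs, targets)
```

Here `probs` and `targets` are flat lists of per-pixel vessel probabilities and 0/1 labels. The MCC term uses all four confusion-matrix entries, so a model that predicts only background scores an MCC near 0 (a loss near 1) instead of being rewarded for the large number of true negatives, which is the usual motivation for pairing it with a pixel-wise loss on imbalanced masks.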
Problem

Research questions and friction points this paper is trying to address.

Addressing underutilized attention in skip connections
Solving severe foreground-background class imbalance
Improving retinal vessel segmentation efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-scale spatial attention in skip connections
Weighted BCE plus MCC loss function
Lightweight model for efficient CPU inference
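
The summary does not describe how the spatial attention is made cross-scale, so the sketch below shows only a generic single-scale building block: a CBAM-style spatial attention gate (channel-wise mean and max pooling, a small convolution, a sigmoid) applied to one skip-connection feature map. The function name `spatial_attention_gate`, the nested-list tensor layout, and the kernel argument are hypothetical, and the code is written in plain Python for clarity, not speed. It does not attempt to reproduce SA-UNetv2's actual module.

```python
import math


def spatial_attention_gate(feat, kernel):
    """Gate a feature map with a spatial attention mask.

    feat:   C x H x W nested lists (one skip-connection feature map).
    kernel: 2 x k x k weights; kernel[0] acts on the channel-mean map,
            kernel[1] on the channel-max map (k odd, zero padding).
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    k = len(kernel[0])
    pad = k // 2

    # Pool across channels to get two H x W descriptor maps.
    mean_map = [[sum(feat[c][i][j] for c in range(C)) / C for j in range(W)]
                for i in range(H)]
    max_map = [[max(feat[c][i][j] for c in range(C)) for j in range(W)]
               for i in range(H)]
    pooled = [mean_map, max_map]

    # Convolve the two maps into one logit per pixel, then squash to (0, 1).
    gate = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            acc += kernel[m][di][dj] * pooled[m][y][x]
            gate[i][j] = 1.0 / (1.0 + math.exp(-acc))

    # Broadcast the single-channel gate over every channel of the input.
    return [[[feat[c][i][j] * gate[i][j] for j in range(W)] for i in range(H)]
            for c in range(C)]
```

The gate has a single channel, so its parameter count depends only on the kernel size and not on the width of the feature map. This is one reason attention of this kind can be added to every skip connection without much effect on model size.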