🤖 AI Summary
This paper addresses the challenges of weak generalization, loss of fine-grained anatomical detail, and insufficient global distribution modeling in semi-supervised medical image segmentation, which stem primarily from scarce labeled data. To this end, we propose a novel hybrid framework that synergistically integrates diffusion models with convolutional neural networks (CNNs). Methodologically: (1) we introduce a cross-pseudo-supervision mechanism between the diffusion and convolutional segmentation networks to improve the utilization of unlabeled data; (2) we design a high-frequency spectral-domain Mamba module to strengthen boundary delineation and local structural modeling; and (3) we incorporate contrastive learning to enable robust label propagation from labeled to unlabeled samples. Evaluated on three benchmark datasets (left atrium, brain tumor, and NIH pancreas), our method achieves state-of-the-art performance, with significant improvements in both segmentation accuracy and robustness under limited labeling budgets.
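The cross-pseudo-supervision idea in point (1) can be sketched as follows: on unlabeled data, each branch produces hard pseudo-labels that supervise the other branch. This is a minimal NumPy sketch assuming a CPS-style loss with hard (argmax) pseudo-labels; the names `logits_dm` and `logits_cnn` are illustrative and not taken from the paper's implementation.

```python
import numpy as np

def cross_pseudo_supervision_loss(logits_dm, logits_cnn):
    """CPS-style loss sketch: each branch is supervised by the other's
    hard pseudo-label. Shapes: (batch, classes, H, W). Illustrative only."""
    def softmax(x, axis=1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def cross_entropy(probs, labels):
        # mean negative log-likelihood of the pseudo-labels
        flat = probs.reshape(probs.shape[0], probs.shape[1], -1)
        idx = labels.reshape(labels.shape[0], -1)
        picked = np.take_along_axis(flat, idx[:, None, :], axis=1)
        return -np.log(picked + 1e-8).mean()

    pseudo_dm = logits_dm.argmax(axis=1)    # pseudo-labels from diffusion branch
    pseudo_cnn = logits_cnn.argmax(axis=1)  # pseudo-labels from CNN branch
    # each network learns from the other's predictions on unlabeled data
    return (cross_entropy(softmax(logits_cnn), pseudo_dm)
            + cross_entropy(softmax(logits_dm), pseudo_cnn))
```

In practice the pseudo-labels would be detached from the gradient graph and the loss weighted against the supervised term; this sketch only shows the mutual-supervision structure.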
📝 Abstract
Semi-supervised learning exploits insights from unlabeled data to improve model generalization, thereby reducing reliance on large labeled datasets. Most existing studies focus on a limited set of samples and fail to capture the overall data distribution. We contend that combining distributional information with detailed information is crucial for achieving more robust and accurate segmentation results. On the one hand, diffusion models (DMs), with their strong generative capabilities, learn the data distribution effectively; however, they struggle to capture fine details, producing generated images with misleading details. Combining a DM with a convolutional neural network (CNN) lets the former learn the data distribution while the latter corrects fine details. Yet capturing complete high-frequency details with CNNs requires substantial computational resources and is susceptible to local noise. On the other hand, since labeled and unlabeled data come from the same distribution, we argue that regions of unlabeled data that are similar to the overall class semantics of labeled data are likely to belong to the same class, while regions with minimal similarity are not. This work introduces Diff-CL, a semi-supervised medical image segmentation framework designed from the distribution perspective. First, we propose a cross-pseudo-supervision learning mechanism between diffusion and convolutional segmentation networks. Second, we design a high-frequency Mamba module to capture boundary and detail information globally. Third, we apply contrastive learning for label propagation from labeled to unlabeled data. Our method achieves state-of-the-art (SOTA) performance across three datasets: left atrium, brain tumor, and NIH pancreas.
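The distribution assumption behind the label-propagation step, that unlabeled regions resembling a class's overall semantics in labeled data likely share its label, can be sketched with class prototypes and cosine similarity. The prototype formulation and all names below are illustrative assumptions, not the paper's exact contrastive-learning method.

```python
import numpy as np

def propagate_labels(feat_labeled, labels, feat_unlabeled, tau=0.1):
    """Label-propagation sketch: class prototypes are mean features of
    labeled pixels; each unlabeled pixel takes the label of its most
    similar prototype. feat_* shapes: (num_pixels, feat_dim)."""
    classes = np.unique(labels)
    protos = np.stack([feat_labeled[labels == c].mean(axis=0) for c in classes])
    # cosine similarity between unlabeled features and class prototypes
    fu = feat_unlabeled / np.linalg.norm(feat_unlabeled, axis=1, keepdims=True)
    pr = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    sim = fu @ pr.T
    # temperature-scaled soft assignment, usable as a confidence score
    probs = np.exp(sim / tau)
    probs /= probs.sum(axis=1, keepdims=True)
    return classes[sim.argmax(axis=1)], probs
```

Low-similarity pixels (low maximum probability) would be down-weighted or excluded in training, matching the intuition that regions with minimal similarity to any labeled class are unlikely to share its label.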