🤖 AI Summary
This work addresses two problems in semi-supervised medical image segmentation: annotations are scarce, and conventional discriminative approaches impose no constraints on feature distributions, which tends to yield non-robust semantic representations. The authors propose SemiGDA, a generative dual-distribution alignment framework whose dual-encoder architecture models the feature distributions of images and masks and aligns them in the latent space. The framework combines a Dual-distribution Alignment Module (DAM) with Consistency-Driven Skip Adapters (CDSA), which fuse multi-scale features through skip connections and enforce fine-grained cross-branch semantic alignment via a consistency loss. The alignment is intended to strengthen semantic learning and scene adaptability. Experiments on multiple medical imaging datasets show that the method outperforms existing semi-supervised segmentation approaches.
📝 Abstract
Semi-supervised learning addresses label scarcity and high annotation costs in medical image segmentation by exploiting the latent information in unlabeled data to improve model performance. Traditional discriminative segmentation methods rely on segmentation masks and neglect feature-level distribution constraints, which limits robust semantic representation learning and adaptive modeling of unlabeled data when few labels are available. To address these limitations, we propose SemiGDA, a novel Generative Dual-distribution Alignment framework for semi-supervised medical image segmentation. SemiGDA overcomes the reliance of discriminative methods on large labeled datasets by aligning feature and semantic distributions, which boosts semantic learning and scene adaptability. Specifically, we propose a Dual-distribution Alignment Module (DAM), which employs two structurally distinct encoders to model the image and mask feature distributions and enforces their alignment in the latent space via distributional constraints, establishing structured feature consistency. Moreover, we design a Consistency-Driven Skip Adapter (CDSA) strategy, which introduces dual skip adapters (Image and Mask) to fuse multi-scale features via skip connections. Using a consistency loss, CDSA enhances cross-branch semantic alignment and reinforces fine-grained semantic consistency. Experimental results on diverse medical datasets show that our method outperforms other state-of-the-art semi-supervised segmentation methods. Code is released at: https://github.com/taozh2017/SemiGDA.
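
The abstract does not state the exact form of the distributional constraint in DAM or of the consistency loss in CDSA. As a rough illustration only, the sketch below assumes each encoder outputs a diagonal Gaussian (mean and log-variance) and aligns the two with a symmetric KL divergence, and it assumes the consistency loss is a mean squared error between the Image and Mask adapter features at each scale. All function names here are hypothetical and are not taken from the SemiGDA repository.

```python
import math

# Illustrative sketch only: the Gaussian latent assumption, the symmetric KL
# term, and the MSE consistency term are assumptions, not the paper's
# confirmed losses.

def gaussian_kl(mu_p, logvar_p, mu_q, logvar_q):
    """KL(p || q) between diagonal Gaussians, summed over latent dimensions."""
    total = 0.0
    for mp, lp, mq, lq in zip(mu_p, logvar_p, mu_q, logvar_q):
        total += 0.5 * (lq - lp + (math.exp(lp) + (mp - mq) ** 2) / math.exp(lq) - 1.0)
    return total

def alignment_loss(image_stats, mask_stats):
    """Symmetric KL between the image-branch and mask-branch latent Gaussians."""
    (mu_i, lv_i), (mu_m, lv_m) = image_stats, mask_stats
    forward = gaussian_kl(mu_i, lv_i, mu_m, lv_m)
    backward = gaussian_kl(mu_m, lv_m, mu_i, lv_i)
    return 0.5 * (forward + backward)

def consistency_loss(image_feats, mask_feats):
    """MSE between the two branches' adapter features, averaged over scales.

    Each argument is a list with one flat feature vector per scale.
    """
    per_scale = []
    for f_img, f_mask in zip(image_feats, mask_feats):
        squared = sum((a - b) ** 2 for a, b in zip(f_img, f_mask))
        per_scale.append(squared / len(f_img))
    return sum(per_scale) / len(per_scale)

# Toy values: a 2-dimensional latent and two feature scales.
image_latent = ([0.0, 1.0], [0.0, 0.0])   # (mean, log-variance)
mask_latent = ([1.0, 1.0], [0.0, 0.0])
total_loss = alignment_loss(image_latent, mask_latent) + consistency_loss(
    [[1.0, 2.0], [0.0]], [[1.0, 4.0], [1.0]]
)
```

In a real training loop these terms would be computed on tensors and added to the supervised segmentation loss with weighting coefficients; the weights and the choice of divergence are design decisions that the abstract leaves unspecified.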