AI Summary
Medical image segmentation is hindered by scarce annotated data, and existing diffusion models that generate synthetic image-mask pairs produce low-fidelity images with insufficient morphological detail. To address this, we propose Siamese-Diffusion, a dual-branch diffusion framework that introduces a noise consistency loss, described as the first to align the noise evolution trajectories of the image and mask branches in parameter space, enabling conditional joint modeling. During training, both branches are co-optimized; during inference, only the mask branch is sampled, which keeps sampling efficient and diverse. Decoupling training from inference in this way improves the morphological fidelity of the mask branch. With the synthetic data, downstream segmentation models gain +3.6% mDice and +4.4% mIoU (SANet on Polyps) and +1.52% mDice and +1.64% mIoU (UNet on ISIC2018), suggesting improved segmentation robustness and generalization.
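The training objective described above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the exact form of the noise consistency loss is not given here, so the mean-squared-error form, the function names (`mse`, `siamese_training_loss`), and the `weight` parameter are all assumptions made for illustration.

```python
def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def siamese_training_loss(eps_true, eps_mask, eps_image, weight=1.0):
    """Toy combined objective (assumed form, not the paper's exact loss).

    eps_true  : noise actually added to the clean image at this timestep
    eps_mask  : noise predicted by the mask-conditioned branch
    eps_image : noise predicted by the image-conditioned branch
    weight    : hypothetical weight on the consistency term
    """
    # Standard denoising terms: each branch regresses the true noise.
    denoise_mask = mse(eps_mask, eps_true)
    denoise_image = mse(eps_image, eps_true)
    # Assumed consistency term: pull the two branches' predictions together.
    # In an autograd framework the image-branch prediction might be detached
    # so that only the mask branch is pulled; that is also an assumption.
    consistency = mse(eps_mask, eps_image)
    return denoise_mask + denoise_image + weight * consistency


# Flattened toy predictions for a single training step.
loss = siamese_training_loss(
    eps_true=[0.0, 0.0], eps_mask=[1.0, 1.0], eps_image=[0.0, 0.0]
)
```

At sampling time only the mask-conditioned prediction would be used, so the consistency term affects training alone.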
Abstract
Deep learning has revolutionized medical image segmentation, yet its full potential remains constrained by the paucity of annotated datasets. While diffusion models have emerged as a promising approach for generating synthetic image-mask pairs to augment these datasets, they paradoxically suffer from the same data scarcity challenges they aim to mitigate. Traditional mask-only models frequently yield low-fidelity images due to their inability to adequately capture morphological intricacies, which can critically compromise the robustness and reliability of segmentation models. To alleviate this limitation, we introduce Siamese-Diffusion, a novel dual-component model comprising Mask-Diffusion and Image-Diffusion. During training, a Noise Consistency Loss is introduced between these components to enhance the morphological fidelity of Mask-Diffusion in the parameter space. During sampling, only Mask-Diffusion is used, ensuring diversity and scalability. Comprehensive experiments demonstrate the superiority of our method. Siamese-Diffusion boosts SANet's mDice and mIoU by 3.6% and 4.4% on the Polyps dataset, while UNet improves by 1.52% and 1.64% on ISIC2018. Code is available at GitHub.
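For readers unfamiliar with the reported metrics, the sketch below computes Dice and IoU for a single pair of binary masks using the standard definitions, Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B|. The function name `dice_and_iou`, the flat-list mask representation, and the convention of scoring two empty masks as 1.0 are assumptions for illustration. mDice and mIoU are means of these scores, and the exact averaging protocol used in the paper is not stated here.

```python
def dice_and_iou(pred, target):
    """Return (dice, iou) for two binary masks given as flat lists of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    pred_area = sum(pred)
    target_area = sum(target)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty; treating this as a perfect match is a
        # convention chosen for this sketch, not something from the paper.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# One overlapping pixel out of two predicted and two ground-truth pixels.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
```

For a single mask pair the two scores are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Gains reported in the two metrics can still differ, because this relation does not carry over to dataset-level means.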