🤖 AI Summary
Diffusion probabilistic models (DPMs) suffer from inefficient training and suboptimal solutions in medical lesion segmentation because attention is distributed unevenly across diffusion timesteps. To address this, we propose UniSegDiff, the first stage-wise diffusion framework supporting unified, multimodal, multi-organ segmentation. Its core contributions are: (1) a stage-wise diffusion architecture that decouples coarse-grained anatomical structure modeling from fine-grained lesion refinement; (2) a dynamic target-adjustment strategy that promotes balanced attention allocation across timesteps; and (3) integration of a pretrained feature encoder to improve cross-modal and cross-organ generalization. Extensive experiments across six organ types and multiple imaging modalities show that UniSegDiff significantly outperforms state-of-the-art methods, achieving sharper lesion boundaries, faster convergence, and higher segmentation accuracy.
📝 Abstract
The Diffusion Probabilistic Model (DPM) has demonstrated remarkable performance across a variety of generative tasks. The inherent randomness of diffusion models helps address issues such as blurring at the edges of medical images and labels, making DPMs a promising approach for lesion segmentation. However, we find that current training and inference strategies for diffusion models distribute attention unevenly across timesteps, leading to longer training times and suboptimal solutions. To address this, we propose UniSegDiff, a novel diffusion framework that performs lesion segmentation in a unified manner across multiple modalities and organs. The framework introduces staged training and inference: it dynamically adjusts the prediction targets at different stages, forcing the model to maintain high attention across all timesteps, and it achieves unified lesion segmentation by pre-training the segmentation feature-extraction network. We evaluate performance on six organs across various imaging modalities. Comprehensive experimental results demonstrate that UniSegDiff significantly outperforms previous state-of-the-art (SOTA) approaches. The code is available at https://github.com/HUYILONG-Z/UniSegDiff.
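The stage-wise idea described above, choosing a different prediction target depending on where a timestep falls in the diffusion schedule, can be sketched as follows. The abstract does not specify the number of stages or the concrete targets, so the three-way split and the noise/mask target choice below are illustrative assumptions, not the paper's actual configuration.

```python
def stage_for_timestep(t: int, total_steps: int = 1000) -> str:
    """Map a diffusion timestep to a coarse training stage (hypothetical 3-way split)."""
    ratio = t / total_steps
    if ratio > 2 / 3:
        return "high-noise"   # early diffusion stage: coarse anatomical structure
    if ratio > 1 / 3:
        return "mid-noise"    # transition stage
    return "low-noise"        # late stage: fine lesion refinement


def prediction_target(t: int, mask, noise, total_steps: int = 1000):
    """Pick the regression target for this timestep's stage.

    Hypothetical policy: regress the added noise at high-noise steps (where
    the clean mask is hard to recover) and the clean segmentation mask
    elsewhere, so every stage receives a useful training signal.
    """
    stage = stage_for_timestep(t, total_steps)
    return noise if stage == "high-noise" else mask
```

A trainer would then compute the per-step loss against `prediction_target(t, mask, noise)` for each sampled timestep; switching targets per stage is one way to keep the loss informative, and hence the model's attention high, across the full timestep range.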