🤖 AI Summary
In medical image segmentation, annotation variability arises from ambiguous imaging boundaries and differing clinical preferences among experts, hindering models' ability to reconcile population-level consensus with individual expert specificity. To address this, we propose DiffOSeg, a two-stage diffusion-model framework: Stage I generates a probabilistic consensus representation from multi-expert annotations, serving as a population-standard reference; Stage II incorporates an adaptive prompting mechanism to explicitly model and preserve each expert's discriminative preferences. This framework unifies consensus-driven and preference-driven segmentation within a single architecture, enabling both holistic clinical judgment and personalized prediction. Evaluated on LIDC-IDRI and NPC-170, our method consistently surpasses state-of-the-art approaches across all quantitative metrics, demonstrating superior accuracy, robustness to annotation variability, and clinical applicability.
📝 Abstract
Annotation variability remains a substantial challenge in medical image segmentation, stemming from ambiguous imaging boundaries and diverse clinical expertise. Traditional deep learning methods, which produce a single deterministic segmentation prediction, often fail to capture these annotator biases. Although recent studies have explored multi-rater segmentation, existing methods typically focus on a single perspective -- either generating a probabilistic "gold standard" consensus or preserving expert-specific preferences -- and thus struggle to provide an omni view that covers both. In this study, we propose DiffOSeg, a two-stage diffusion-based framework that aims to simultaneously achieve both consensus-driven segmentation (combining all experts' opinions) and preference-driven segmentation (reflecting each expert's individual assessment). Stage I establishes population consensus through a probabilistic consensus strategy, while Stage II captures expert-specific preferences via adaptive prompts. Demonstrated on two public datasets (LIDC-IDRI and NPC-170), our model outperforms existing state-of-the-art methods across all evaluated metrics. Source code is available at https://github.com/string-ellipses/DiffOSeg.
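To make the two-stage idea concrete, here is a minimal sketch, not the paper's actual implementation: Stage I's probabilistic consensus is approximated as the pixel-wise fraction of experts that mark each pixel, and Stage II's expert-specific conditioning is stubbed as a per-expert prompt vector (here a one-hot code; in the real model this would be a learned embedding conditioning the diffusion network). The function names `consensus_target` and `expert_prompt` are illustrative assumptions, not from the paper.

```python
import numpy as np

def consensus_target(expert_masks):
    """Stage-I style probabilistic consensus (assumed form):
    the fraction of experts that annotate each pixel as foreground.
    expert_masks: array of shape (E, H, W) with binary values."""
    return np.asarray(expert_masks, dtype=float).mean(axis=0)

def expert_prompt(expert_id, num_experts):
    """Stage-II style expert conditioning, stubbed as a one-hot code.
    In the actual framework this would be a learned adaptive prompt."""
    prompt = np.zeros(num_experts)
    prompt[expert_id] = 1.0
    return prompt

# Three experts annotating the same 2x2 image.
masks = np.array([
    [[0, 1], [1, 1]],   # expert 0
    [[0, 1], [0, 1]],   # expert 1
    [[1, 1], [1, 1]],   # expert 2
])
target = consensus_target(masks)   # pixel (0,0) -> 1/3, pixel (0,1) -> 1.0
prompt = expert_prompt(1, num_experts=3)  # selects expert 1's preference
```

The consensus map gives soft, uncertainty-aware supervision for the population-level stage, while the prompt vector lets a single network switch between expert-specific outputs without training one model per annotator.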