ProSona: Prompt-Guided Personalization for Multi-Expert Medical Image Segmentation

📅 2025-11-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
In medical image segmentation, expert annotations exhibit substantial inter-observer variability (e.g., in lung nodule delineation); existing methods either enforce a consensus or assign a separate branch per expert, offering no controllable personalization. This paper introduces ProSona, a two-stage personalized segmentation framework driven by natural-language prompts. It employs a probabilistic U-Net to generate diverse segmentation hypotheses, uses multi-level contrastive learning to align textual prompts with visual representations, and adds a prompt-guided latent-space projection mechanism for disentangled, interpretable modeling of expert-specific styles. Evaluated on the LIDC-IDRI and prostate MRI datasets, the method reduces the Generalized Energy Distance by 17% and improves mean Dice by more than one point over DPersona, making it the first framework to enable fine-grained, language-driven, style-controllable segmentation.

📝 Abstract
Automated medical image segmentation suffers from high inter-observer variability, particularly in tasks such as lung nodule delineation, where experts often disagree. Existing approaches either collapse this variability into a consensus mask or rely on separate model branches for each annotator. We introduce ProSona, a two-stage framework that learns a continuous latent space of annotation styles, enabling controllable personalization via natural language prompts. A probabilistic U-Net backbone captures diverse expert hypotheses, while a prompt-guided projection mechanism navigates this latent space to generate personalized segmentations. A multi-level contrastive objective aligns textual and visual representations, promoting disentangled and interpretable expert styles. Across the LIDC-IDRI lung nodule and multi-institutional prostate MRI datasets, ProSona reduces the Generalized Energy Distance by 17% and improves mean Dice by more than one point compared with DPersona. These results demonstrate that natural-language prompts can provide flexible, accurate, and interpretable control over personalized medical image segmentation. Our implementation is available online.
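The abstract's core mechanism, a probabilistic U-Net whose latent samples yield diverse segmentation hypotheses, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name, the flattened-feature representation, and the linear output head are all hypothetical stand-ins for the prior net and decoder of a real probabilistic U-Net.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hypotheses(features, mu, log_sigma, n_samples, w_out):
    """Toy stand-in for probabilistic U-Net sampling.

    features : (P, F) decoder features for P pixels
    mu, log_sigma : (D,) Gaussian prior over the latent style z
    w_out : (F + D, 1) linear head mapping [feature, z] to a logit

    Each latent draw z ~ N(mu, sigma^2) produces a different
    plausible segmentation hypothesis for the same image.
    """
    hyps = []
    for _ in range(n_samples):
        z = mu + np.exp(log_sigma) * rng.standard_normal(mu.shape)
        # Broadcast the sampled latent onto every pixel, then apply
        # a 1x1-conv-style linear head and a sigmoid.
        pix = np.concatenate([features, np.tile(z, (features.shape[0], 1))], axis=1)
        logits = pix @ w_out
        hyps.append(1.0 / (1.0 + np.exp(-logits)))
    return np.stack(hyps)  # (n_samples, P, 1)
```

Because every hypothesis shares the image features and differs only in the sampled z, the spread across samples reflects annotation-style uncertainty rather than image noise.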
Problem

Research questions and friction points this paper is trying to address.

Addresses high inter-observer variability in medical image segmentation
Enables personalized segmentation using natural language prompts
Learns continuous latent space of annotation styles for control
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learns continuous latent space of annotation styles
Uses prompt-guided projection for personalized segmentations
Aligns text and visual representations with contrastive objective
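The two ideas in the list above, projecting a prompt into the style latent space and aligning text with vision contrastively, can be sketched in a few lines of NumPy. The parameterization below (a softmax over learned style directions, and a symmetric InfoNCE-style loss) is an assumption for illustration only; the paper's actual projection mechanism and multi-level contrastive objective are more elaborate.

```python
import numpy as np

def prompt_guided_latent(text_emb, style_basis, mu):
    """Shift the latent prior mean toward prompt-selected styles.

    text_emb : (D,) embedding of the natural-language prompt
    style_basis : (K, D) learned style directions (hypothetical)
    mu : (D,) prior mean of the latent space

    Softmax similarity to each style direction weights a shift of
    the prior mean, yielding a personalized latent code.
    """
    scores = style_basis @ text_emb
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return mu + weights @ style_basis

def infonce(text_feats, vis_feats, tau=0.1):
    """Symmetric contrastive loss: matched text/vision rows attract,
    mismatched rows repel (InfoNCE over cosine similarities)."""
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    v = vis_feats / np.linalg.norm(vis_feats, axis=1, keepdims=True)
    logits = t @ v.T / tau
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(np.diag(probs)).mean()
```

Driving down the contrastive loss pulls each prompt embedding toward the visual features of the matching expert's masks, which is what makes the latent projection respond meaningfully to language.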
A. Elgebaly
Department of Applied Mathematics and Computer Science, DTU Compute, Denmark
N. Delopoulos
Department of Radiation Oncology, LMU University Hospital, LMU Munich, Germany
Juliane Hörner-Rieber
Department of Radiation Oncology, University Hospital Düsseldorf, Germany
C. Rippke
Department of Radiation Oncology, Heidelberg University Hospital, Germany
Sebastian Klüter
Department of Radiation Oncology, Heidelberg University Hospital, Germany
L. Boldrini
Università Cattolica del Sacro Cuore, Rome, Italy
L. Placidi
Università Cattolica del Sacro Cuore, Rome, Italy
Riccardo Dal Bello
Department of Radiation Oncology, University Hospital Zurich, Switzerland
N. Andratschke
Department of Radiation Oncology, University Hospital Zurich, Switzerland
M. Baumgartl
Department of Radiation Oncology, University Hospital Zurich, Switzerland
C. Belka
Department of Radiation Oncology, LMU University Hospital, LMU Munich, Germany
Christopher Kurz
Department of Radiation Oncology, LMU University Hospital, LMU Munich, Germany
Guillaume Landry
Department of Radiation Oncology, LMU University Hospital, LMU Munich, Germany
Shadi Albarqouni
Professor of Computational Medical Imaging Research @ Uni. Bonn | AI Group Leader @ HelmholtzAI
Machine Learning, Deep Learning, Federated Learning, Medical Image Analysis, Medical Image Computing