🤖 AI Summary
To address the generalization bottleneck in cross-domain few-shot segmentation (CD-FSS) caused by data scarcity and domain shift, this paper proposes the Composable Meta-Prompt (CMP) framework, tailored for the Segment Anything Model (SAM). To mitigate SAM’s reliance on handcrafted prompts and its limited cross-domain adaptability, the framework introduces three modules: Reference Complement and Transformation (RCT) for semantic expansion, Composable Meta-Prompt Generation (CMPG) for automatic prompt construction, and Frequency-Aware Interaction (FAI) for suppressing domain discrepancy. The framework avoids fine-tuning SAM’s backbone and achieves cross-domain transfer solely via lightweight prompt engineering. Evaluated on four CD-FSS benchmarks, it reaches 71.8% and 74.5% mIoU in the 1-shot and 5-shot settings, respectively, which the authors report as state-of-the-art. The work offers an efficient, generalizable, and interpretable approach to CD-FSS.
📝 Abstract
Cross-Domain Few-Shot Segmentation (CD-FSS) remains challenging due to limited data and domain shifts. Recent foundation models such as the Segment Anything Model (SAM) have shown remarkable zero-shot generalization in general segmentation tasks, making them a promising solution for few-shot scenarios. However, adapting SAM to CD-FSS faces two critical challenges: reliance on manual prompts and limited cross-domain ability. We therefore propose the Composable Meta-Prompt (CMP) framework, which introduces three key modules: (i) the Reference Complement and Transformation (RCT) module for semantic expansion, (ii) the Composable Meta-Prompt Generation (CMPG) module for automated meta-prompt synthesis, and (iii) the Frequency-Aware Interaction (FAI) module for domain discrepancy mitigation. Evaluations on four cross-domain datasets show that CMP achieves state-of-the-art performance, with 71.8% and 74.5% mIoU in the 1-shot and 5-shot scenarios, respectively.