BALD-SAM: Disagreement-based Active Prompting in Interactive Segmentation

📅 2026-03-11

🤖 AI Summary

Existing interactive segmentation methods rely on human judgment to assess mask quality and place prompts; they lack an automated mechanism for selecting the most informative locations. The authors propose active prompting, a framework that treats locations within an image as an unlabeled pool and adapts the Bayesian Active Learning by Disagreement (BALD) criterion to spatial prompt selection. The SAM backbone is kept frozen, and epistemic uncertainty is modeled only on a small learned prediction head using a Laplace approximation, which keeps uncertainty estimation tractable. Across 16 datasets spanning natural, medical, underwater, and seismic domains, the method ranks first or second on 14. It surpasses human prompting and, in several categories, oracle prompting, and it consistently outperforms one-shot baselines, particularly on thin and structurally complex objects.

📝 Abstract

The Segment Anything Model (SAM) has revolutionized interactive segmentation through spatial prompting. While existing work primarily focuses on automating prompts in various settings, real-world annotation workflows involve iterative refinement where annotators observe model outputs and strategically place prompts to resolve ambiguities. Current pipelines typically rely on the annotator's visual assessment of the predicted mask quality. We postulate that a principled approach for automated interactive prompting is to use a model-derived criterion to identify the most informative region for the next prompt. In this work, we establish active prompting: a spatial active learning approach where locations within images constitute an unlabeled pool and prompts serve as queries to prioritize information-rich regions, increasing the utility of each interaction. We further present BALD-SAM: a principled framework adapting Bayesian Active Learning by Disagreement (BALD) to spatial prompt selection by quantifying epistemic uncertainty. To do so, we freeze the entire model and apply Bayesian uncertainty modeling only to a small learned prediction head, making intractable uncertainty estimation practical for large multi-million parameter foundation models. Across 16 datasets spanning natural, medical, underwater, and seismic domains, BALD-SAM demonstrates strong cross-domain performance, ranking first or second on 14 of 16 benchmarks. We validate these gains through a comprehensive ablation suite covering 3 SAM backbones and 35 Laplace posterior configurations, amounting to 38 distinct ablation settings. Beyond strong average performance, BALD-SAM surpasses human prompting and, in several categories, even oracle prompting, while consistently outperforming one-shot baselines in final segmentation quality, particularly on thin and structurally complex objects.
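To make the selection criterion concrete, the following is a minimal sketch of a per-pixel BALD score, not the authors' implementation. It assumes a set of foreground-probability maps has already been produced by sampling the prediction head's posterior (for example, from a Laplace approximation); the function names `bald_map` and `next_prompt_location` and the toy array are hypothetical. The score is the standard BALD mutual information: the entropy of the mean prediction minus the mean entropy of the individual predictions.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli variable with success probability p."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def bald_map(probs):
    """Per-pixel BALD score.

    probs: array of shape (S, H, W) holding foreground probabilities from
           S posterior samples (assumed to be given; how they are drawn is
           outside this sketch).
    """
    mean_p = probs.mean(axis=0)
    predictive_entropy = binary_entropy(mean_p)            # total uncertainty
    expected_entropy = binary_entropy(probs).mean(axis=0)  # aleatoric part
    return predictive_entropy - expected_entropy           # epistemic part


def next_prompt_location(probs):
    """Return (row, col) of the highest-BALD pixel, plus the score map."""
    scores = bald_map(probs)
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return int(row), int(col), scores


# Toy data: 4 posterior samples over a 3x3 image.
probs = np.full((4, 3, 3), 0.9)
probs[:, 1, 2] = [0.05, 0.95, 0.05, 0.95]  # samples disagree here
probs[:, 0, 0] = 0.5                       # samples agree on being unsure

row, col, scores = next_prompt_location(probs)
```

In the toy data, pixel (1, 2) is selected because the samples contradict each other there, while pixel (0, 0) scores approximately zero: every sample is equally unsure, so the uncertainty is not attributable to disagreement. This distinction is what separates BALD from plain entropy-based selection.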

Problem

Research questions and friction points this paper is trying to address.

interactive segmentation

active prompting

spatial prompting

uncertainty quantification

annotation efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

active prompting

Bayesian Active Learning by Disagreement (BALD)

epistemic uncertainty