🤖 AI Summary
To address the limitations of the Segment Anything Model (SAM) in detail-preserving boundary delineation and segmentation accuracy for dichotomous image segmentation (DIS), this paper proposes DIS-SAM, a two-stage framework. In the first stage, an adapted SAM generates coarse segmentation masks; in the second stage, a lightweight refinement network sharpens the mask boundaries. This work represents the first systematic adaptation of SAM to the high-precision DIS task and introduces a ground-truth enrichment strategy, which modifies the original mask annotations, to improve training robustness. The method fully preserves SAM's promptable paradigm, supporting both point and box prompts, while maintaining real-time inference. Evaluated on the DIS benchmark, DIS-SAM improves the Fβ score by over 12% relative to the original SAM and significantly outperforms state-of-the-art methods, demonstrating its effectiveness and generalizability for fine-grained boundary segmentation.
📝 Abstract
The Segment Anything Model (SAM) represents a significant breakthrough in foundation models for computer vision, providing a large-scale image segmentation model. However, despite SAM's strong zero-shot performance, its segmentation masks lack fine-grained detail, particularly in accurately delineating object boundaries. It is therefore both interesting and valuable to explore whether SAM can be improved towards highly accurate object segmentation, a problem known as the dichotomous image segmentation (DIS) task. To address this issue, we propose DIS-SAM, which advances SAM towards DIS with extremely accurate details. DIS-SAM is a framework specifically tailored for highly accurate segmentation that maintains SAM's promptable design. It employs a two-stage approach, integrating SAM with a modified advanced network previously designed for the prompt-free DIS task. To better train DIS-SAM, we employ a ground-truth enrichment strategy that modifies the original mask annotations.
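To make the two-stage idea concrete, here is a minimal sketch of the coarse-then-refine inference flow with a box prompt. This is an illustration only: `coarse_segment` and `refine_mask` are hypothetical placeholders standing in for the adapted SAM and the refinement network described above, not the paper's actual models.

```python
import numpy as np

def coarse_segment(image, box_prompt):
    """Stage 1 placeholder: a promptable coarse segmenter.

    A real system would run SAM with the box prompt; here we
    simply fill the prompt box to produce a coarse mask.
    """
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box_prompt
    mask = np.zeros((h, w), dtype=np.float32)
    mask[y0:y1, x0:x1] = 1.0
    return mask

def refine_mask(image, coarse_mask, threshold=0.25):
    """Stage 2 placeholder: a lightweight boundary refiner.

    A real refinement network would sharpen boundaries using
    image detail; here we just modulate the coarse mask by
    normalized image intensity and re-binarize it.
    """
    gray = image.mean(axis=2) / 255.0
    refined = np.clip(coarse_mask * (0.5 + 0.5 * gray), 0.0, 1.0)
    return (refined > threshold).astype(np.float32)

def dis_sam_pipeline(image, box_prompt):
    # Coarse mask from the promptable stage, then refinement.
    coarse = coarse_segment(image, box_prompt)
    return refine_mask(image, coarse)

# Usage: a uniform gray test image with a box prompt.
image = np.full((64, 64, 3), 200, dtype=np.uint8)
mask = dis_sam_pipeline(image, (16, 16, 48, 48))
```

The point of the structure is that the second stage only has to correct boundaries, so it can stay lightweight while the promptable first stage carries the heavy zero-shot segmentation work.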