🤖 AI Summary
Existing prompt-free image segmentation methods (e.g., SAM) suffer from two key limitations: weak locality, i.e., no mechanism for autonomous region localization, and poor scalability, i.e., insufficient fine-grained modeling at high resolutions. To address these, we propose Grc-SAM, a coarse-to-fine, multi-granularity prompt-free segmentation framework. Its core innovations are an adaptive foreground localization mechanism and sparse local Swin-style attention, which together enable end-to-end inference from coarse response regions to fine-grained local refinement via high-response feature extraction and latent prompt embedding. Built on a vision transformer backbone, Grc-SAM eliminates reliance on hand-crafted prompts and supports accurate segmentation of high-resolution inputs. Extensive experiments show that Grc-SAM significantly outperforms state-of-the-art prompt-free methods across multiple benchmarks, achieving both higher segmentation accuracy and better resolution scalability.
📝 Abstract
Prompt-free image segmentation aims to generate accurate masks without manual guidance. Typical pre-trained models, notably the Segment Anything Model (SAM), generate prompts directly at a single granularity level. However, this approach has two limitations: (1) localizability: it lacks mechanisms for autonomous region localization; (2) scalability: fine-grained modeling is limited at high resolution. To address these challenges, we introduce Granular Computing-driven SAM (Grc-SAM), a coarse-to-fine framework motivated by Granular Computing (GrC). First, the coarse stage adaptively extracts high-response regions from features, achieving precise foreground localization and reducing reliance on external prompts. Second, the fine stage applies finer patch partitioning with sparse local Swin-style attention to enhance detail modeling and enable high-resolution segmentation. Third, refined masks are encoded as latent prompt embeddings for the SAM decoder, replacing handcrafted prompts with an automated reasoning process. By integrating multi-granularity attention, Grc-SAM bridges granular computing and vision transformers. Extensive experimental results demonstrate that Grc-SAM outperforms baseline methods in both accuracy and scalability, offering a distinctive granular-computing perspective on prompt-free segmentation.
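The coarse-to-fine flow described above can be illustrated with a minimal toy sketch. This is not the paper's implementation: the function names (`coarse_localize`, `fine_refine`) and the per-patch pooling used as a stand-in for sparse local Swin-style attention are hypothetical simplifications, shown only to make the two-stage idea concrete: first locate the high-response region, then re-partition it at a finer granularity to refine the mask.

```python
import numpy as np

def coarse_localize(response, thresh=0.5):
    """Coarse stage (toy): bounding box of high-response features.

    Hypothetical stand-in for Grc-SAM's adaptive foreground
    localization; returns (y0, y1, x0, x1) or None if nothing
    exceeds the threshold.
    """
    fg = response >= thresh
    ys, xs = np.nonzero(fg)
    if ys.size == 0:
        return None
    return ys.min(), ys.max() + 1, xs.min(), xs.max() + 1

def fine_refine(response, box, patch=2, thresh=0.5):
    """Fine stage (toy): crop the coarse region and partition it into
    finer patches. Per-patch max pooling stands in here for the
    sparse local attention used in the actual method.
    """
    y0, y1, x0, x1 = box
    crop = response[y0:y1, x0:x1]
    h = (crop.shape[0] // patch) * patch
    w = (crop.shape[1] // patch) * patch
    crop = crop[:h, :w]
    pooled = crop.reshape(h // patch, patch, w // patch, patch).max(axis=(1, 3))
    return pooled >= thresh  # refined per-patch mask

# Toy response map: a bright 4x4 square inside an 8x8 feature field.
resp = np.zeros((8, 8))
resp[2:6, 2:6] = 0.9

box = coarse_localize(resp)       # → (2, 6, 2, 6)
refined = fine_refine(resp, box)  # 2x2 boolean mask, all True
```

In the actual framework, the refined mask would then be encoded as a latent prompt embedding and passed to the SAM decoder instead of a handcrafted point or box prompt.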