CoCo-SAM3: Harnessing Concept Conflict in Open-Vocabulary Semantic Segmentation

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses inter-class conflict and intra-class drift in multi-class open-vocabulary semantic segmentation. Inter-class conflict arises because masks generated from separate class prompts do not share a comparable evidence scale; intra-class drift arises because synonymous expressions of the same concept activate inconsistent evidence. The authors propose a decoupled inference framework with two stages: intra-class enhancement followed by inter-class competition. First, evidence from synonymous prompts is aligned and aggregated to strengthen concept consistency; then, pixel-wise inter-class competition is performed on a unified evidence scale. The approach explicitly models concept conflict and improves the stability and accuracy of multi-class inference without additional training. Built on SAM3, the method achieves consistent gains across eight open-vocabulary semantic segmentation benchmarks.

📝 Abstract

SAM3 advances open-vocabulary semantic segmentation by introducing a prompt-driven mask generation paradigm. However, in multi-class open-vocabulary scenarios, masks generated independently from different category prompts lack a unified and inter-class comparable evidence scale, often resulting in overlapping coverage and unstable competition. Moreover, synonymous expressions of the same concept tend to activate inconsistent semantic and spatial evidence, leading to intra-class drift that exacerbates inter-class conflicts and compromises overall inference stability. To address these issues, we propose CoCo-SAM3 (Concept-Conflict SAM3), which explicitly decouples inference into intra-class enhancement and inter-class competition. Our method first aligns and aggregates evidence from synonymous prompts to strengthen concept consistency. It then performs inter-class competition on a unified comparable scale, enabling direct pixel-wise comparisons among all candidate classes. This mechanism stabilizes multi-class inference and effectively mitigates inter-class conflicts. Without requiring any additional training, CoCo-SAM3 achieves consistent improvements across eight open-vocabulary semantic segmentation benchmarks.
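The two-stage idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each text prompt has already produced a per-pixel score in [0, 1], uses a plain mean over synonymous prompts as a stand-in for the alignment and aggregation step, and uses a per-pixel argmax with a hypothetical `background_threshold` as a stand-in for inter-class competition. All function names are illustrative.

```python
def aggregate_synonyms(prompt_maps):
    """Intra-class step (assumed rule): average scores over one class's synonym prompts."""
    n = len(prompt_maps)
    height, width = len(prompt_maps[0]), len(prompt_maps[0][0])
    return [[sum(m[y][x] for m in prompt_maps) / n for x in range(width)]
            for y in range(height)]


def compete(class_maps, background_threshold=0.5):
    """Inter-class step (assumed rule): each pixel goes to the highest-scoring class.

    Pixels whose best score is below the threshold are labeled None (background).
    """
    names = list(class_maps)
    first = class_maps[names[0]]
    height, width = len(first), len(first[0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            best = max(names, key=lambda c: class_maps[c][y][x])
            row.append(best if class_maps[best][y][x] >= background_threshold else None)
        labels.append(row)
    return labels


def segment(scores_by_class, background_threshold=0.5):
    """Run intra-class aggregation, then inter-class competition."""
    class_maps = {c: aggregate_synonyms(maps) for c, maps in scores_by_class.items()}
    return compete(class_maps, background_threshold)


# Toy 1x3 "image": two synonym prompts for "cat", one prompt for "dog".
scores = {
    "cat": [[[0.9, 0.6, 0.1]], [[0.7, 0.6, 0.1]]],
    "dog": [[[0.2, 0.8, 0.3]]],
}
labels = segment(scores)
print(labels)
```

In the toy input, the middle pixel scores at least 0.5 for both classes, so thresholding each class mask independently would assign it to both. Comparing the aggregated scores directly assigns it to exactly one class, which is the kind of overlap the abstract attributes to masks generated independently per category prompt.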

Problem

Research questions and friction points this paper is trying to address.

open-vocabulary semantic segmentation

concept conflict

inter-class competition

intra-class drift

prompt-driven mask generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

open-vocabulary semantic segmentation

concept conflict

prompt-driven mask generation