Toward Faithful Segmentation Attribution via Benchmarking and Dual-Evidence Fusion

📅 2026-03-23

🤖 AI Summary

This work addresses a limitation of existing attribution methods for semantic segmentation: they are judged mostly by visual plausibility, with no systematic evaluation of causal faithfulness or off-target leakage. To bridge this gap, the authors introduce a reproducible benchmark for segmentation attribution covering four dimensions: intervention-based faithfulness, off-target leakage, perturbation robustness, and runtime. They further propose Dual-Evidence Attribution (DEA), a lightweight correction that fuses gradient evidence with region-level intervention signals through agreement-weighted fusion. Experiments on Pascal VOC and SBD across three pretrained backbones show that DEA consistently improves deletion-based faithfulness over gradient-only baselines while preserving strong robustness, at the cost of additional compute from intervention passes.
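To make the off-target leakage dimension concrete, here is a minimal sketch of one way such a score could be computed. It assumes leakage is the share of absolute attribution mass that falls outside the target region; the function name `off_target_leakage` and this exact definition are illustrative assumptions, not the benchmark's published metric.

```python
import numpy as np


def off_target_leakage(attribution, target_mask):
    """Share of absolute attribution mass lying outside the target mask.

    Assumed definition for illustration only; the benchmark may normalize
    or aggregate differently.
    """
    mass = np.abs(np.asarray(attribution, dtype=float))
    inside = np.asarray(target_mask, dtype=bool)
    total = mass.sum()
    if total == 0:
        return 0.0  # an all-zero map assigns no credit anywhere
    return float(mass[~inside].sum() / total)


# Toy 2x2 attribution map; the target object occupies the left column.
attr = np.array([[0.5, 0.1],
                 [0.3, 0.1]])
mask = np.array([[True, False],
                 [True, False]])
leak = off_target_leakage(attr, mask)
```

Under this assumed definition, a score of 0 means all credit stays inside the target region and a score of 1 means all of it lands elsewhere.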

📝 Abstract

Attribution maps for semantic segmentation are almost always judged by visual plausibility. Yet looking convincing does not guarantee that the highlighted pixels actually drive the model's prediction, nor that attribution credit stays within the target region. These questions require a dedicated evaluation protocol. We introduce a reproducible benchmark that tests intervention-based faithfulness, off-target leakage, perturbation robustness, and runtime on Pascal VOC and SBD across three pretrained backbones. To further demonstrate the benchmark, we propose Dual-Evidence Attribution (DEA), a lightweight correction that fuses gradient evidence with region-level intervention signals through agreement-weighted fusion. DEA increases emphasis where both sources agree and retains causal support when gradient responses are unstable. Across all completed runs, DEA consistently improves deletion-based faithfulness over gradient-only baselines and preserves strong robustness, at the cost of additional compute from intervention passes. The benchmark exposes a faithfulness-stability tradeoff among attribution families that is entirely hidden under visual evaluation, providing a foundation for principled method selection in segmentation explainability. Code is available at https://github.com/anmspro/DEA.
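The agreement-weighted fusion described above can be illustrated with a small sketch. The abstract does not give the formula, so everything below is an assumption: the helper names `normalize` and `fuse_dual_evidence`, the agreement measure (one minus the absolute difference of the normalized maps), and the fallback to intervention evidence where the two sources disagree. The authors' implementation is in the linked repository and may differ.

```python
import numpy as np


def normalize(x):
    """Scale absolute values into [0, 1] by the map's peak."""
    x = np.abs(np.asarray(x, dtype=float))
    peak = x.max()
    return x / peak if peak > 0 else x


def fuse_dual_evidence(grad_map, interv_map):
    """Hypothetical agreement-weighted fusion of two attribution maps.

    Where the normalized maps agree, the fused value is their mean.
    Where they disagree, the weight shifts toward the intervention map,
    so causal support is kept when the gradient signal is unreliable.
    """
    g = normalize(grad_map)
    r = normalize(interv_map)
    agreement = 1.0 - np.abs(g - r)  # 1 = identical, 0 = maximally different
    fused = agreement * 0.5 * (g + r) + (1.0 - agreement) * r
    return normalize(fused)


# Toy maps: pixel (0, 1) has zero gradient but strong intervention effect.
grad = np.array([[1.0, 0.0],
                 [0.5, 0.5]])
interv = np.array([[1.0, 1.0],
                   [0.5, 0.0]])
fused = fuse_dual_evidence(grad, interv)
```

In this toy case the pixel with no gradient response but a strong intervention effect keeps full weight, while the pixel supported only by the gradient is down-weighted.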

Problem

Research questions and friction points this paper is trying to address.

semantic segmentation

attribution

faithfulness

benchmarking

off-target leakage

Innovation

Methods, ideas, or system contributions that make the work stand out.

faithful attribution

semantic segmentation

intervention-based evaluation