🤖 AI Summary
Existing visual counterfactual explainers (VCEs) overemphasize sample quality or change minimality while neglecting core interpretability principles: fidelity, comprehensibility, and sufficiency. To address this, we propose an explanation-quality-driven counterfactual generation paradigm, the first to systematically integrate hermeneutic principles with generative mechanisms, realized as the Smooth Counterfactual Explorer (SCE). SCE jointly optimizes gradient smoothing, latent-space regularization, and semantic constraints to produce high-fidelity, human-understandable, and semantically grounded counterfactuals. We evaluate SCE across multiple dimensions on both synthetic and real-world image datasets. Experiments show that SCE improves explanation fidelity by 23.6% and human comprehensibility by 31.2% over state-of-the-art baselines, while rigorously satisfying counterfactual validity and minimality. Our approach establishes a more holistic, reliable, and principle-grounded framework for interpretable image classification.
📝 Abstract
Visual counterfactual explainers (VCEs) are a straightforward and promising approach to enhancing the transparency of image classifiers. VCEs complement other types of explanations, such as feature attribution, by revealing the specific data transformations to which a machine learning model responds most strongly. In this paper, we argue that existing VCEs focus too narrowly on optimizing sample quality or change minimality; they fail to consider the more holistic desiderata for an explanation, such as fidelity, understandability, and sufficiency. To address this shortcoming, we explore new mechanisms for counterfactual generation and investigate how they can help fulfill these desiderata. We combine these mechanisms into a novel 'smooth counterfactual explorer' (SCE) algorithm and demonstrate its effectiveness through systematic evaluations on synthetic and real data.
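To make the combination of mechanisms concrete, the sketch below shows what a joint counterfactual objective in the spirit of SCE might look like. This is an illustrative toy, not the paper's actual algorithm: the weights (`lam_smooth`, `lam_latent`), the total-variation proxy for gradient smoothing, and the L2 latent regularizer are all assumptions chosen for readability.

```python
import numpy as np

def toy_sce_loss(x, x_cf, z_cf, target_score, lam_smooth=0.1, lam_latent=0.01):
    """Toy combined objective (hypothetical weights and terms).

    validity : hinge loss pushing the classifier score toward the target class
    smooth   : total-variation penalty on the change map, a stand-in for
               the gradient-smoothing term
    latent   : L2 regularizer keeping the counterfactual's latent code small,
               a stand-in for latent-space regularization
    """
    delta = x_cf - x
    # Total variation of the change map: smooth edits are penalized less
    # than scattered, high-frequency ones.
    smooth = (np.abs(np.diff(delta, axis=0)).sum()
              + np.abs(np.diff(delta, axis=1)).sum())
    validity = max(0.0, 1.0 - target_score)
    latent = float(np.sum(z_cf ** 2))
    return validity + lam_smooth * smooth + lam_latent * latent
```

In this toy, a spatially uniform change to the image incurs zero smoothness penalty, while a checkerboard-like change of the same magnitude is penalized, which mirrors the intuition that smooth, localized edits are easier for humans to interpret.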