Cross-modal Counterfactual Explanations: Uncovering Decision Factors and Dataset Biases in Subjective Classification

📅 2025-12-21
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the weak interpretability of subjective image classifiers (e.g., for privacy assessment) and the difficulty of identifying dataset bias, this paper proposes DeX, a novel training-free, cross-modal, concept-driven counterfactual explanation framework. DeX combines cross-modal disentanglement with image-specific concept mining to generate natural-language counterfactual explanations. It introduces a decision-factor selection mechanism that jointly optimizes image similarity and decision confidence, and establishes an image-grounded, sparse, and comparable evaluation framework for explanations. Experiments show that DeX significantly outperforms existing methods: it precisely localizes the key visual factors behind subjective decisions, and it automatically detects and quantifies dataset biases, providing interpretable, actionable insights for fairness-aware model optimization.

๐Ÿ“ Abstract
Concept-driven counterfactuals explain decisions of classifiers by altering the model predictions through semantic changes. In this paper, we present a novel approach that leverages cross-modal decompositionality and image-specific concepts to create counterfactual scenarios expressed in natural language. We apply the proposed interpretability framework, termed Decompose and Explain (DeX), to the challenging domain of image privacy decisions, which are contextual and subjective. This application enables the quantification of the differential contributions of key scene elements to the model prediction. We identify relevant decision factors via a multi-criterion selection mechanism that considers both image similarity for minimal perturbations and decision confidence to prioritize impactful changes. This approach evaluates and compares diverse explanations, and assesses the interdependency and mutual influence among explanatory properties. By leveraging image-specific concepts, DeX generates image-grounded, sparse explanations, yielding significant improvements over the state of the art. Importantly, DeX operates as a training-free framework, offering high flexibility. Results show that DeX not only uncovers the principal contributing factors influencing subjective decisions, but also identifies underlying dataset biases allowing for targeted mitigation strategies to improve fairness.
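The multi-criterion selection mechanism described in the abstract, which weighs image similarity (favoring minimal perturbations) against decision confidence (favoring impactful changes), can be sketched as a simple ranking over candidate concept edits. The scoring function, the `alpha` trade-off weight, and all names below are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate counterfactual concept edit for one image."""
    concept: str            # natural-language edit, e.g. "remove license plate"
    similarity: float       # similarity to the original image, in [0, 1]
    flip_confidence: float  # classifier confidence for the flipped label, in [0, 1]


def select_decision_factors(candidates, alpha=0.5, top_k=3):
    """Rank candidates by a convex combination of image similarity
    (minimal perturbation) and flipped-label confidence (impactful
    change), then keep the top-k as the selected decision factors."""
    scored = sorted(
        candidates,
        key=lambda c: alpha * c.similarity + (1 - alpha) * c.flip_confidence,
        reverse=True,
    )
    return scored[:top_k]


# Example: a high-confidence flip can outrank a more similar but weaker edit.
cands = [
    Candidate("A", similarity=0.9, flip_confidence=0.2),
    Candidate("B", similarity=0.6, flip_confidence=0.9),
    Candidate("C", similarity=0.3, flip_confidence=0.4),
]
top = select_decision_factors(cands, alpha=0.5, top_k=2)
```

With equal weighting, candidate "B" (score 0.75) ranks above "A" (0.55); raising `alpha` would instead privilege minimally perturbed edits.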
Problem

Research questions and friction points this paper is trying to address.

Explains subjective classifier decisions via semantic changes
Quantifies scene elements' impact on image privacy predictions
Identifies dataset biases to enhance model fairness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-modal decompositionality for counterfactual natural language explanations
Multi-criterion selection for minimal perturbations and impactful changes
Training-free framework generating image-grounded sparse explanations
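The dataset-bias detection that the summary credits to DeX can be approximated by comparing how often mined concepts co-occur with each label: a concept far more frequent in one class than the other is a candidate bias. This is a minimal sketch under the assumption that per-image concepts are pooled per label; the function name and scoring are hypothetical, not the paper's method.

```python
from collections import Counter


def concept_bias_scores(private_concepts, public_concepts):
    """Score each concept by the difference in its relative frequency
    between the two label groups; large |score| flags concepts the
    dataset couples to one label (a potential bias to mitigate)."""
    p, q = Counter(private_concepts), Counter(public_concepts)
    n_p = max(sum(p.values()), 1)  # guard against empty groups
    n_q = max(sum(q.values()), 1)
    return {c: p[c] / n_p - q[c] / n_q for c in set(p) | set(q)}


# Example: concepts pooled from images labeled private vs. public.
scores = concept_bias_scores(
    ["face", "face", "text"],  # concepts mined from private-labeled images
    ["tree", "face"],          # concepts mined from public-labeled images
)
```

Here "face" scores positive (over-represented among private labels) and "tree" negative, so a targeted mitigation could rebalance or augment the under-represented pairings.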