🤖 AI Summary
Existing eXplainable AI (XAI) approaches lack a unified optimization objective and objective evaluation benchmarks, hindering the generation of optimal explanations. To address this, we propose DeepFaith, the first model-agnostic and domain-agnostic unified explanation framework. Methodologically, it synthesizes high-quality supervision signals from multiple explanation methods via deduplication and filtering, and trains a general-purpose explainer by optimizing a pattern consistency loss together with local correlation. Crucially, DeepFaith introduces the first theoretically grounded, multi-dimensional faithfulness objective, integrating fidelity, consistency, and local correlation, to define and optimize an explanation "ground truth." Evaluated across 12 tasks (6 models × 6 datasets), DeepFaith achieves the highest overall faithfulness on all 10 evaluated metrics compared to the baselines, demonstrating state-of-the-art faithfulness and strong cross-domain generalization.
📝 Abstract
Explainable AI (XAI) builds trust in complex systems through model attribution methods that reveal the decision rationale. However, in the absence of a unified optimal explanation, existing XAI methods lack a ground truth for objective evaluation and optimization. To address this issue, we propose the Deep architecture-based Faith explainer (DeepFaith), a domain-free and model-agnostic unified explanation framework viewed through the lens of faithfulness. By establishing a unified formulation for multiple widely used and well-validated faithfulness metrics, we derive an optimal explanation objective whose solution simultaneously achieves optimal faithfulness across these metrics, thereby providing a ground truth from a theoretical perspective. We design an explainer learning framework that leverages multiple existing explanation methods, applies deduplication and filtering to construct high-quality supervised explanation signals, and optimizes both a pattern consistency loss and local correlation to train a faithful explainer. Once trained, DeepFaith generates highly faithful explanations in a single forward pass, without accessing the model being explained. On 12 diverse explanation tasks spanning 6 models and 6 datasets, DeepFaith achieves the highest overall faithfulness across 10 metrics compared to all baseline methods, highlighting its effectiveness and cross-domain generalizability.
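The abstract describes three ingredients of explainer training: deduplicating and filtering teacher attributions, a pattern consistency loss, and a local correlation term. The paper's exact formulations are not reproduced here; the sketch below is one plausible instantiation under stated assumptions (cosine similarity for both the pattern consistency loss and the deduplication threshold, and Pearson correlation for the local correlation term — the function names and the `sim_thresh` parameter are illustrative, not from the paper):

```python
import numpy as np

def pattern_consistency_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """Cosine distance between the explainer's attribution and the target signal."""
    p = pred / (np.linalg.norm(pred) + 1e-8)
    t = target / (np.linalg.norm(target) + 1e-8)
    return 1.0 - float(p @ t)

def local_correlation_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """One minus the Pearson correlation, rewarding locally aligned rankings."""
    pc = pred - pred.mean()
    tc = target - target.mean()
    denom = np.linalg.norm(pc) * np.linalg.norm(tc) + 1e-8
    return 1.0 - float(pc @ tc) / denom

def aggregate_teachers(attributions: list[np.ndarray],
                       sim_thresh: float = 0.95) -> np.ndarray:
    """Deduplicate near-identical teacher attributions (cosine similarity
    above sim_thresh) and average the survivors into one supervision signal."""
    kept: list[np.ndarray] = []
    for a in attributions:
        a = a / (np.linalg.norm(a) + 1e-8)
        if all(float(a @ k) < sim_thresh for k in kept):
            kept.append(a)
    return np.mean(kept, axis=0)

# Combine the two terms into a single training loss for one sample;
# the weighting factor lam is an assumption, not a value from the paper.
def explainer_loss(pred, teacher_attributions, lam=1.0):
    target = aggregate_teachers(teacher_attributions)
    return pattern_consistency_loss(pred, target) + lam * local_correlation_loss(pred, target)
```

In this reading, the aggregated teacher signal plays the role of the supervised "explanation ground truth", and a single forward pass of the trained explainer replaces per-instance perturbation or gradient queries against the explained model.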