🤖 AI Summary
Existing eXplainable AI (XAI) approaches lack a unified optimization objective and objective evaluation benchmarks, hindering the generation of optimal explanations. To address this, we propose DeepFaith, the first model-agnostic and domain-agnostic unified explanation framework. Methodologically, it synthesizes high-quality supervision signals from multiple explanation methods via deduplication and filtering, and trains a general-purpose explainer by optimizing a pattern consistency loss together with local correlation. Crucially, DeepFaith introduces the first theoretically grounded, multi-dimensional faithfulness objective, integrating fidelity, consistency, and local correlation, to define and optimize an explanation "ground truth." Evaluated across 12 tasks (6 models × 6 datasets), DeepFaith achieves the highest overall faithfulness on all 10 evaluated metrics compared to the baselines, demonstrating state-of-the-art faithfulness and strong cross-domain generalization.
📝 Abstract
Explainable AI (XAI) builds trust in complex systems through model attribution methods that reveal the decision rationale. However, in the absence of a unified optimal explanation, existing XAI methods lack a ground truth for objective evaluation and optimization. To address this issue, we propose the Deep architecture-based Faith explainer (DeepFaith), a domain-free and model-agnostic unified explanation framework viewed through the lens of faithfulness. By establishing a unified formulation for multiple widely used and well-validated faithfulness metrics, we derive an optimal explanation objective whose solution simultaneously achieves optimal faithfulness across these metrics, thereby providing a ground truth from a theoretical perspective. We design an explainer learning framework that leverages multiple existing explanation methods, applies deduplication and filtering to construct high-quality supervised explanation signals, and optimizes both a pattern consistency loss and local correlation to train a faithful explainer. Once trained, DeepFaith generates highly faithful explanations in a single forward pass, without accessing the model being explained. On 12 diverse explanation tasks spanning 6 models and 6 datasets, DeepFaith achieves the highest overall faithfulness across 10 metrics compared to all baseline methods, highlighting its effectiveness and cross-domain generalizability.
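The abstract describes three ingredients of explainer training: deduplicating and filtering teacher attributions, a pattern consistency loss, and a local correlation term. The paper's exact formulations are not reproduced here; the sketch below is one plausible instantiation under stated assumptions (cosine similarity for both the pattern consistency loss and the deduplication threshold, and Pearson correlation for the local correlation term — the function names and the `sim_thresh` parameter are illustrative, not from the paper):

```python
import numpy as np

def pattern_consistency_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """Cosine distance between the explainer's attribution and the target signal."""
    p = pred / (np.linalg.norm(pred) + 1e-8)
    t = target / (np.linalg.norm(target) + 1e-8)
    return 1.0 - float(p @ t)

def local_correlation_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """One minus the Pearson correlation, rewarding locally aligned rankings."""
    pc = pred - pred.mean()
    tc = target - target.mean()
    denom = np.linalg.norm(pc) * np.linalg.norm(tc) + 1e-8
    return 1.0 - float(pc @ tc) / denom

def aggregate_teachers(attributions: list[np.ndarray],
                       sim_thresh: float = 0.95) -> np.ndarray:
    """Deduplicate near-identical teacher attributions (cosine similarity
    above sim_thresh) and average the survivors into one supervision signal."""
    kept: list[np.ndarray] = []
    for a in attributions:
        a = a / (np.linalg.norm(a) + 1e-8)
        if all(float(a @ k) < sim_thresh for k in kept):
            kept.append(a)
    return np.mean(kept, axis=0)

# Combine the two terms into a single training loss for one sample;
# the weighting factor lam is an assumption, not a value from the paper.
def explainer_loss(pred, teacher_attributions, lam=1.0):
    target = aggregate_teachers(teacher_attributions)
    return pattern_consistency_loss(pred, target) + lam * local_correlation_loss(pred, target)
```

In this reading, the aggregated teacher signal plays the role of the supervised "explanation ground truth", and a single forward pass of the trained explainer replaces per-instance perturbation or gradient queries against the explained model.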