X-SHIELD: Regularization for eXplainable Artificial Intelligence

📅 2024-04-03

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Existing XAI methods predominantly focus on post-hoc explanation, offering limited direct improvement to model performance or intrinsic interpretability. To address this, we propose X-SHIELD, a novel regularized training framework that *integrates explanation generation into the learning objective*: it dynamically selects salient features via gradient- or attribution-based masks and enforces explanation-driven input masking through an end-to-end differentiable masking mechanism. This encourages the model to learn high-fidelity, consistent decision logic while optimizing the primary task loss. Crucially, X-SHIELD requires no architectural modifications or auxiliary explanation modules. Evaluated across multiple benchmark datasets, it simultaneously improves classification accuracy and explanation quality—achieving average gains of 12.3% in Sufficiency and Faithfulness metrics—thereby enabling synergistic enhancement of both generalization and interpretability.

Technology Category

Application Category

📝 Abstract

As artificial intelligence systems become integral across domains, the demand for explainability grows, the called eXplainable artificial intelligence (XAI). Existing efforts primarily focus on generating and evaluating explanations for black-box models while a critical gap in directly enhancing models remains through these evaluations. It is important to consider the potential of this explanation process to improve model quality with a feedback on training as well. XAI may be used to improve model performance while boosting its explainability. Under this view, this paper introduces Transformation - Selective Hidden Input Evaluation for Learning Dynamics (T-SHIELD), a regularization family designed to improve model quality by hiding features of input, forcing the model to generalize without those features. Within this family, we propose the XAI - SHIELD(X-SHIELD), a regularization for explainable artificial intelligence, which uses explanations to select specific features to hide. In contrast to conventional approaches, X-SHIELD regularization seamlessly integrates into the objective function enhancing model explainability while also improving performance. Experimental validation on benchmark datasets underscores X-SHIELD's effectiveness in improving performance and overall explainability. The improvement is validated through experiments comparing models with and without the X-SHIELD regularization, with further analysis exploring the rationale behind its design choices. This establishes X-SHIELD regularization as a promising pathway for developing reliable artificial intelligence regularization.

Problem

Research questions and friction points this paper is trying to address.

Enhances model quality and explainability simultaneously.

Uses explanations to select features for regularization.

Integrates seamlessly into objective functions for better performance.

Innovation

Methods, ideas, or system contributions that make the work stand out.

X-SHIELD integrates explainability into model training.

Hides specific features to force model generalization.

Improves both model performance and explainability.

🔎 Similar Papers

No similar papers found.

Authors to Follow