🤖 AI Summary
This paper addresses the weak causality and limited stability of existing global explanation methods for deep neural networks (DNNs). To this end, it introduces the notion of *Intrinsic Causal Contribution (ICC)*, modeling DNNs as structural causal models (SCMs) and establishing an identifiable generative post-hoc explanation framework. Theoretically, the authors derive an equivalence between ICC and Sobol' sensitivity indices, moving beyond the conventional direct/indirect effect paradigm and enabling unbiased quantification of the intrinsic causal effects of input features. Methodologically, the approach combines causal intervention estimation with generative post-hoc inference. Experiments on synthetic and real-world datasets show that ICC yields more intuitive and more robust global attributions than state-of-the-art approaches such as SHAP and Integrated Gradients, providing a new foundation for causal interpretability in trustworthy AI.
📝 Abstract
Quantifying the causal influence of input features within neural networks has become a topic of increasing interest. Existing approaches typically assess direct, indirect, and total causal effects. This work treats NNs as structural causal models (SCMs) and extends the focus to intrinsic causal contributions (ICC). We propose an identifiable generative post-hoc framework for quantifying ICC, and we establish a relationship between ICC and Sobol' indices. Our experiments on synthetic and real-world datasets demonstrate that ICC generates more intuitive and reliable explanations than existing global explanation techniques.
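To make the ICC–Sobol' connection concrete, the sketch below estimates first-order Sobol' sensitivity indices, the variance-based quantities the abstract relates ICC to. This is an illustrative example, not the paper's method: it assumes independent Uniform(0,1) inputs, a simple hand-picked toy model, and uses the standard Saltelli pick-freeze Monte Carlo estimator.

```python
import numpy as np

def first_order_sobol(f, n_inputs, n_samples=100_000, seed=None):
    """Pick-freeze Monte Carlo estimate of first-order Sobol' indices
    S_i = Var(E[f | X_i]) / Var(f) for independent Uniform(0,1) inputs."""
    rng = np.random.default_rng(seed)
    A = rng.random((n_samples, n_inputs))  # two independent sample matrices
    B = rng.random((n_samples, n_inputs))
    fA, fB = f(A), f(B)
    total_var = np.var(np.concatenate([fA, fB]))
    S = np.empty(n_inputs)
    for i in range(n_inputs):
        AB = A.copy()
        AB[:, i] = B[:, i]  # replace only column i of A with B's column i
        # Saltelli-style estimator of Var(E[f | X_i]) / Var(f)
        S[i] = np.mean(fB * (f(AB) - fA)) / total_var
    return S

# Hypothetical toy model: X3 has no effect, so the analytic indices
# are S = (0.2, 0.8, 0.0) since Var(f) = 1/12 + 4/12 = 5/12.
def model(X):
    return X[:, 0] + 2.0 * X[:, 1]

S = first_order_sobol(model, 3, seed=0)
```

A feature with a high first-order index explains a large share of the output variance on its own; the paper's claim is that ICC admits an equivalence to such variance-based indices, which is what lends it the global, causal reading.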