🤖 AI Summary
Deep learning models are often distrusted and difficult to verify in high-stakes applications due to their causal opacity. To address this, we propose the Causal Concept Graph Model (Causal CGM), the first neural architecture that explicitly embeds concept-level causal graphs at the design level, enabling causal traceability throughout inference. Our method integrates structured concept modeling, differentiable symbolic reasoning, causal intervention, and counterfactual reasoning, supporting real-time human correction of intermediate reasoning steps while improving the reliability of post-correction explanations. Experiments demonstrate that Causal CGM matches state-of-the-art black-box models in generalization performance while substantially improving explanation credibility, causal intervention analysis, and the verifiability of fairness. By unifying predictive accuracy with causal transparency, Causal CGM establishes a new paradigm for trustworthy AI in high-risk settings.
📝 Abstract
Causal opacity denotes the difficulty in understanding the "hidden" causal structure underlying the decisions of deep neural network (DNN) models. This leads to the inability to rely on and verify state-of-the-art DNN-based systems, especially in high-stakes scenarios. For this reason, circumventing causal opacity in DNNs represents a key open challenge at the intersection of deep learning, interpretability, and causality. This work addresses this gap by introducing Causal Concept Graph Models (Causal CGMs), a class of interpretable models whose decision-making process is causally transparent by design. Our experiments show that Causal CGMs can: (i) match the generalisation performance of causally opaque models, (ii) enable human-in-the-loop corrections to mispredicted intermediate reasoning steps, boosting not just downstream accuracy after corrections but also the reliability of the explanations provided for specific instances, and (iii) support the analysis of interventional and counterfactual scenarios, thereby improving the model's causal interpretability and supporting the effective verification of its reliability and fairness.
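To make the core idea concrete, here is a minimal, hypothetical sketch (names, graph, and mechanisms are illustrative, not the paper's implementation): concepts form a DAG, each concept is computed from its causal parents, and a human correction acts as a do-style intervention that overrides one concept and re-propagates only to its causal descendants, leaving its ancestors untouched.

```python
# Toy concept-level causal graph. Each entry maps a concept name to
# (parent concepts, mechanism). The dict is written in topological order,
# which Python dicts preserve (insertion order, Python 3.7+).
GRAPH = {
    "has_wings": ([], lambda x, p: x["wings"] > 0.5),
    "lays_eggs": ([], lambda x, p: x["eggs"] > 0.5),
    "is_bird":   (["has_wings", "lays_eggs"],
                  lambda x, p: p["has_wings"] and p["lays_eggs"]),
    "can_fly":   (["is_bird"], lambda x, p: p["is_bird"]),
}

def predict(x, do=None):
    """Evaluate concepts in topological order.

    `do` maps concept names to intervened values: an intervention cuts the
    concept's incoming edges and fixes its value, so only its causal
    descendants are affected downstream.
    """
    do = do or {}
    values = {}
    for name, (parents, mechanism) in GRAPH.items():
        if name in do:
            values[name] = do[name]          # human correction / do-operation
        else:
            values[name] = mechanism(x, {q: values[q] for q in parents})
    return values

x = {"wings": 0.9, "eggs": 0.8}
print(predict(x))                            # every concept evaluates to True
print(predict(x, do={"is_bird": False}))     # can_fly flips; ancestors unchanged
```

In a real Causal CGM the mechanisms would be learned neural modules rather than hand-written rules, but the intervention semantics sketched here (override a node, recompute only its descendants) is what makes corrections and counterfactual queries traceable.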