Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning

📅 2025-02-15

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

Problem: Existing deep neural network (DNN)-based supervised causal learning (SCL) methods adopt a "Node-Edge" architecture that suffers from a systematic bias, one that cannot be mitigated by increasing model size or data size. Method: The paper proposes SiCL, which (i) predicts a skeleton matrix together with a v-tensor (a third-order tensor representing the v-structures), both of which are identifiable under the canonical Markov Equivalence Class (MEC) setting; (ii) uses a pairwise encoder module with a unidirectional attention layer to model both internal and external relationships of pairs of nodes; and (iii) thereby enables consistent estimators for causal discovery. Contribution/Results: Because it predicts only identifiable structures, SiCL avoids the identifiability limit that biases the Node-Edge architecture. Experiments on synthetic and real-world benchmarks show that SiCL significantly outperforms other DNN-based SCL approaches.


📝 Abstract

Causal discovery is a structured prediction task that aims to predict causal relations among variables from their data samples. Supervised Causal Learning (SCL) is an emerging paradigm in this field. Existing Deep Neural Network (DNN)-based methods commonly adopt the "Node-Edge" approach, in which the model first computes an embedding vector for each variable node, then uses these variable-wise representations to predict each directed causal edge concurrently and independently. In this paper, we first show that this architecture has a systematic bias that cannot be mitigated regardless of model size and data size. We then propose SiCL, a DNN-based SCL method that predicts a skeleton matrix together with a v-tensor (a third-order tensor representing the v-structures). According to Markov Equivalence Class (MEC) theory, both the skeleton and the v-structures are identifiable causal structures under the canonical MEC setting, so predictions about them do not suffer from the identifiability limit in causal discovery. SiCL can therefore avoid the systematic bias of the Node-Edge architecture and enable consistent estimators for causal discovery. Moreover, SiCL is equipped with a specially designed pairwise encoder module with a unidirectional attention layer to model both internal and external relationships of pairs of nodes. Experimental results on both synthetic and real-world benchmarks show that SiCL significantly outperforms other DNN-based SCL approaches.
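To make the two prediction targets concrete, the following is a minimal sketch of how a skeleton matrix and a v-tensor could be derived from a known DAG. It is not SiCL's code: the function name `skeleton_and_v_tensor` and the index convention `v_tensor[k, i, j]` (collider `k` with non-adjacent parents `i` and `j`) are assumptions chosen for illustration. The example also checks that two Markov-equivalent chains share both targets, while a collider over the same skeleton does not.

```python
import numpy as np


def skeleton_and_v_tensor(adj):
    """Return (skeleton, v_tensor) for a DAG adjacency matrix.

    adj[i, j] == 1 means a directed edge i -> j.
    The index layout of v_tensor is an illustrative assumption:
    v_tensor[k, i, j] is True when i -> k <- j and i, j are non-adjacent.
    """
    adj = np.asarray(adj, dtype=bool)
    n = adj.shape[0]
    # Skeleton: undirected adjacency, ignoring edge direction.
    skeleton = adj | adj.T
    v_tensor = np.zeros((n, n, n), dtype=bool)
    for k in range(n):
        parents = np.flatnonzero(adj[:, k])
        for i in parents:
            for j in parents:
                # A v-structure needs two distinct, non-adjacent parents.
                if i != j and not skeleton[i, j]:
                    v_tensor[k, i, j] = True
    return skeleton, v_tensor


# Chain 0 -> 1 -> 2 and its reversal 2 -> 1 -> 0 are Markov equivalent.
chain = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 0]])
reversed_chain = chain.T
# Collider 0 -> 1 <- 2 has the same skeleton but a different MEC.
collider = np.array([[0, 1, 0],
                     [0, 0, 0],
                     [0, 1, 0]])

sk_chain, v_chain = skeleton_and_v_tensor(chain)
sk_rev, v_rev = skeleton_and_v_tensor(reversed_chain)
sk_col, v_col = skeleton_and_v_tensor(collider)

print("chain and reversal share skeleton:", np.array_equal(sk_chain, sk_rev))
print("chain and reversal share v-tensor:", np.array_equal(v_chain, v_rev))
print("collider v-structure at node 1:", bool(v_col[1, 0, 2]))
```

The directed adjacency matrices of the two chains differ, yet their skeleton and v-tensor are identical, which is why these two targets are well defined even when edge directions are not.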

Problem

Research questions and friction points this paper is trying to address.

Mitigates bias in DNN-based causal learning

Identifies causal structures using skeleton and v-tensor

Improves accuracy in supervised causal discovery
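One way to see why independent edge prediction can be biased is a two-variable toy case. This sketch is not the paper's proof; it assumes that the data cannot distinguish X->Y from Y->X (they are Markov equivalent) and that the posterior over the two is uniform. Under that assumption, a model that outputs only per-edge marginals and treats edges independently places half of its probability mass on graphs that are not in the equivalence class, while the skeleton is determined with certainty.

```python
from itertools import product

# Assumed posterior over DAGs given the data: both directions equally likely.
# Each graph is written as a tuple of its directed edges.
posterior = {("X->Y",): 0.5, ("Y->X",): 0.5}
edges = ["X->Y", "Y->X"]

# Per-edge marginals: the most an independent edge predictor can represent.
marginal = {
    e: sum(p for graph, p in posterior.items() if e in graph) for e in edges
}

# Joint distribution implied by treating the edges as independent.
joint = {}
for present in product([0, 1], repeat=len(edges)):
    prob = 1.0
    for e, on in zip(edges, present):
        prob *= marginal[e] if on else 1.0 - marginal[e]
    graph = tuple(e for e, on in zip(edges, present) if on)
    joint[graph] = prob

# Mass on graphs outside the posterior support: the empty graph and the 2-cycle.
invalid_mass = sum(p for graph, p in joint.items() if graph not in posterior)

# The undirected adjacency X - Y holds in every graph of the posterior.
skeleton_prob = sum(p for graph, p in posterior.items() if len(graph) >= 1)

print("edge marginals:", marginal)
print("mass on invalid graphs:", invalid_mass)
print("P(X - Y adjacent):", skeleton_prob)
```

In this toy setting, making the model larger or adding data does not change the marginals, because the ambiguity comes from identifiability rather than from estimation error.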

Innovation

Methods, ideas, or system contributions that make the work stand out.

SiCL predicts skeleton matrix

SiCL uses v-tensor for v-structures

SiCL employs unidirectional attention layer
