Arrow: A Foundation Model for Causal Discovery

📅 2026-05-07

🤖 AI Summary
This work addresses zero-shot causal discovery from observational tabular data. It proposes Arrow, a foundation model that brings large-scale pretraining to causal discovery and uses a Transformer architecture to contextualize variables within and across observations. Arrow factorizes the causal graph into an undirected skeleton and a topological ordering, which guarantees acyclicity by construction, and predicts skeleton edge probabilities and node order scores. It is trained end to end on diverse synthetic data with ground-truth graphs, using a differentiable directed-edge composite likelihood induced by the skeleton-order factorization. Evaluated on synthetic, semi-synthetic, and real-world datasets, Arrow matches or outperforms existing methods at substantially lower inference cost, suggesting that pretrained causal foundation models can generalize to new datasets.
📝 Abstract
We introduce Arrow, a foundation model for zero-shot causal discovery on observational tabular data. Arrow factorizes a directed acyclic graph into an undirected skeleton and a topological order, guaranteeing acyclicity by construction. Given a new dataset, it uses a transformer-based architecture to contextualize variables within and across observations, then predicts skeleton edge probabilities and node order scores that together define a graph. Arrow is trained in a supervised fashion on synthetic datasets with ground-truth graphs, using an end-to-end differentiable directed edge composite likelihood induced by the skeleton-order factorization. The training distribution spans diverse graph families, functional forms, noise models, and dataset shapes. Across in- and out-of-distribution synthetic, semi-synthetic, and real datasets, Arrow matches or outperforms existing causal discovery methods at substantially lower inference cost than competitive alternatives. Our results demonstrate that large-scale pretraining on diverse synthetic data can yield zero-shot causal discovery models that are fast, accurate, and reusable on new datasets.
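To make the skeleton-order factorization concrete, here is a minimal sketch of how skeleton edge probabilities and node order scores could be turned into a directed graph. It is an illustration, not Arrow's actual decoding procedure: the function name `decode_dag`, the 0.5 threshold, the tie-breaking rule, and the convention that a lower order score means "earlier in the topological order" are all assumptions that the abstract does not specify.

```python
import numpy as np

def decode_dag(skeleton_prob, order_score, threshold=0.5):
    """Hypothetical decoder: orient thresholded skeleton edges by node order.

    skeleton_prob: (d, d) symmetric matrix of undirected edge probabilities.
    order_score:   (d,) scores; lower score = earlier in the order (assumed).
    Returns a (d, d) 0/1 adjacency matrix with adj[i, j] = 1 meaning i -> j.
    """
    p = np.asarray(skeleton_prob, dtype=float)
    s = np.asarray(order_score, dtype=float)
    d = s.shape[0]

    # Use only the upper triangle so the skeleton is symmetric by construction.
    p = np.triu(p, k=1)
    p = p + p.T

    # Convert scores to a strict ranking; stable sort breaks ties by node index.
    rank = np.empty(d, dtype=int)
    rank[np.argsort(s, kind="stable")] = np.arange(d)

    keep = p >= threshold                      # undirected skeleton
    earlier = rank[:, None] < rank[None, :]    # True where i precedes j
    return (keep & earlier).astype(int)        # every edge points forward in the order

def is_acyclic(adj):
    """A directed graph on d nodes is acyclic iff its adjacency matrix is nilpotent."""
    a = np.asarray(adj, dtype=int)
    return not np.linalg.matrix_power(a, a.shape[0]).any()

# Toy example with three variables (values are made up for illustration).
skeleton = [[0.0, 0.9, 0.2],
            [0.9, 0.0, 0.8],
            [0.2, 0.8, 0.0]]
scores = [0.1, 2.0, 1.0]
adj = decode_dag(skeleton, scores)
print(adj)
print(is_acyclic(adj))
```

Because every retained edge points from an earlier node to a later one in a strict ranking, no directed cycle can form regardless of the predicted values, which is the sense in which such a factorization yields acyclicity by construction.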
Problem

Research questions and friction points this paper is trying to address.

causal discovery
zero-shot learning
observational data
directed acyclic graph
foundation model
Innovation

Methods, ideas, or system contributions that make the work stand out.

causal discovery
foundation model
directed acyclic graph
zero-shot learning
transformer architecture