CauScale: Neural Causal Discovery at Scale

📅 2026-02-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the severe time- and space-efficiency bottlenecks that existing causal discovery methods face when scaling to large graphs. The authors propose CauScale, a neural architecture for causal discovery that combines a reduction unit for compressing data embeddings, tied attention weights, and a two-stream design in which a data stream processes observational data and a graph stream integrates statistical graph priors. CauScale scales inference to graphs with up to 1,000 nodes and trains on 500-node graphs, where prior work fails because of space limitations. Across test data with varying graph scales and causal mechanisms, it achieves 99.6% mean average precision (mAP) on in-distribution data and 84.4% mAP on out-of-distribution data, with inference speedups of 4 to 13,000 times over prior methods.
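For readers unfamiliar with the mAP figures quoted above, the following is a minimal sketch of how average precision over predicted edges is commonly computed. The source does not spell out the paper's exact evaluation protocol, so the averaging over test graphs and the function names `average_precision` and `mean_average_precision` are assumptions made for illustration.

```python
def average_precision(scores, labels):
    """Average precision for one graph.

    scores: predicted edge probabilities, one per candidate edge.
    labels: 1 if the candidate edge is in the true graph, else 0.
    """
    # Rank candidate edges from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            # Precision at the rank where a true edge is recovered.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(graphs):
    """graphs: list of (scores, labels) pairs, one per test graph (assumed)."""
    return sum(average_precision(s, y) for s, y in graphs) / len(graphs)
```

A model that ranks every true edge above every non-edge scores 1.0 on that graph, so a value near 99.6% indicates that almost all true edges are ranked ahead of spurious ones.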

📝 Abstract

Causal discovery is essential for advancing data-driven fields such as scientific AI and data analysis, yet existing approaches face significant time- and space-efficiency bottlenecks when scaling to large graphs. To address this challenge, we present CauScale, a neural architecture designed for efficient causal discovery that scales inference to graphs with up to 1,000 nodes. CauScale improves time efficiency via a reduction unit that compresses data embeddings and improves space efficiency by adopting tied attention weights to avoid maintaining axis-specific attention maps. To maintain high causal discovery accuracy, CauScale adopts a two-stream design: a data stream extracts relational evidence from high-dimensional observations, while a graph stream integrates statistical graph priors and preserves key structural signals. CauScale successfully scales to 500-node graphs during training, where prior work fails due to space limitations. Across testing data with varying graph scales and causal mechanisms, CauScale achieves 99.6% mAP on in-distribution data and 84.4% mAP on out-of-distribution data, while delivering inference speedups of 4 to 13,000 times over prior methods. Our project page is at https://github.com/OpenCausaLab/CauScale.
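To make the space saving from tied attention weights concrete, here is a minimal NumPy sketch of one common formulation of tied attention, in which query-key scores are summed over the sample axis so that a single node-by-node attention map is shared by all samples. This is an illustration under stated assumptions, not CauScale's actual layer: the function `tied_attention`, the tensor layout `(S, N, D)`, the single head, and the `sqrt(S * D)` scaling are all hypothetical choices that the source does not confirm.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def tied_attention(x, w_q, w_k, w_v):
    """Single-head tied attention over the node axis (assumed layout).

    x: array of shape (S, N, D) for S samples, N nodes, D features.
    w_q, w_k, w_v: projection matrices of shape (D, D).
    Returns the attended output (S, N, D) and the shared map (N, N).
    """
    s, n, d = x.shape
    q = x @ w_q
    k = x @ w_k
    v = x @ w_v
    # Sum scores over samples: one (N, N) map instead of S separate maps.
    scores = np.einsum("snd,smd->nm", q, k) / np.sqrt(s * d)
    attn = softmax(scores, axis=-1)
    # Apply the same attention map to the values of every sample.
    out = np.einsum("nm,smd->snd", attn, v)
    return out, attn
```

In this sketch the stored attention map has N × N entries rather than S × N × N, which is the kind of memory reduction that matters once N reaches the hundreds.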

Problem

Research questions and friction points this paper is trying to address.

causal discovery

scalability

large graphs

efficiency bottleneck

neural architecture

Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal Discovery

Scalable Neural Architecture

Tied Attention

Two-Stream Design

Efficient Inference
