Use What You Know: Causal Foundation Models with Partial Graphs

📅 2026-02-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing causal foundation models (CFMs): they cannot incorporate domain knowledge, which can lead to suboptimal predictions, particularly when only partial causal information, such as ancestral relationships, is available. The authors condition a CFM on causal information by injecting learnable biases into the attention mechanism, which lets the model use causal priors at any granularity, from a complete causal graph to partial structure. The approach keeps causal discovery and inference amortised in a single model while removing the need for a fully specified causal graph. In the reported experiments, the conditioned general-purpose CFM matches the performance of specialised models trained on specific causal structures, making data-driven causal inference more practical when some domain expertise is available.
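To make "ancestral relationships" concrete: an ancestral relation states only that one variable lies upstream of another, without naming the direct edges. The sketch below derives such relations from a fully known graph by transitive closure, purely to illustrate the concept. It is not taken from the paper; the function name `ancestral_matrix` and the toy graph are assumptions made for this example.

```python
import numpy as np

def ancestral_matrix(adj: np.ndarray) -> np.ndarray:
    """Return anc where anc[i, j] is True iff a directed path i -> ... -> j exists.

    adj[i, j] != 0 encodes a direct edge i -> j (illustrative convention).
    """
    reach = adj.astype(bool).copy()
    n = reach.shape[0]
    for k in range(n):
        # Warshall step: i reaches j if it already did, or if i reaches k and k reaches j.
        via_k = reach[:, k][:, None] & reach[k, :][None, :]
        reach |= via_k
    return reach

# Toy graph (assumed for illustration): X0 -> X1 -> X2 and X0 -> X3.
adj = np.zeros((4, 4), dtype=int)
adj[0, 1] = adj[1, 2] = adj[0, 3] = 1

anc = ancestral_matrix(adj)
print(anc.astype(int))
```

Here `anc[0, 2]` is true even though there is no direct edge from X0 to X2, which is exactly the kind of coarser knowledge a domain expert may have when the full graph is unknown.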

📝 Abstract

Estimating causal quantities traditionally relies on bespoke estimators tailored to specific assumptions. Recently proposed Causal Foundation Models (CFMs) promise a more unified approach by amortising causal discovery and inference in a single step. However, in their current state, they do not allow for the incorporation of any domain knowledge, which can lead to suboptimal predictions. We bridge this gap by introducing methods to condition CFMs on causal information, such as the causal graph or more readily available ancestral information. When access to complete causal graph information is too strict a requirement, our approach also effectively leverages partial causal information. We systematically evaluate conditioning strategies and find that injecting learnable biases into the attention mechanism is the most effective method to utilise full and partial causal information. Our experiments show that this conditioning allows a general-purpose CFM to match the performance of specialised models trained on specific causal structures. Overall, our approach addresses a central hurdle on the path towards all-in-one causal foundation models: the capability to answer causal queries in a data-driven manner while effectively leveraging any amount of domain expertise.
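The idea of injecting biases into the attention mechanism can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' architecture: the three-state relation encoding (`UNKNOWN`, `NO_EDGE`, `EDGE`), the `bias_table` lookup, and the function `biased_attention` are hypothetical names chosen for this example, and the bias values are fixed by hand where a real model would learn them.

```python
import numpy as np

# Hypothetical encoding of what is known about each ordered pair of variables.
UNKNOWN, NO_EDGE, EDGE = 0, 1, 2

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def biased_attention(q, k, v, relation, bias_table):
    """Scaled dot-product attention over variables with an additive pairwise bias.

    relation[i, j] holds an integer code; bias_table maps each code to a scalar
    that is added to the attention logit for the pair (i, j).
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = scores + bias_table[relation]  # integer-array lookup, shape (n, n)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
n_vars, d = 4, 8
q, k, v = (rng.normal(size=(n_vars, d)) for _ in range(3))

# Partial knowledge: most pairs unknown, one known edge, one known non-edge.
relation = np.full((n_vars, n_vars), UNKNOWN)
relation[0, 1] = EDGE
relation[2, 3] = NO_EDGE

# Stand-in for learnable parameters; UNKNOWN is left at zero bias.
bias_table = np.array([0.0, -4.0, 2.0])

out, w = biased_attention(q, k, v, relation, bias_table)
_, w_plain = biased_attention(q, k, v, relation, np.zeros(3))
```

Because unknown pairs receive a zero bias in this sketch, the mechanism reduces to ordinary attention when no causal information is supplied, and the same model can accept anything from no knowledge to a complete graph.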

Problem

Research questions and friction points this paper is trying to address.

Causal Foundation Models

domain knowledge

causal graph

partial causal information

causal inference

Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal Foundation Models

Partial Causal Graphs

Attention Mechanism