Learning Causal Orderings for In-Context Tabular Prediction

📅 2026-05-21

🤖 AI Summary
This work addresses the unreliability of in-context learning on tabular data under distribution shifts or interventions, which stems from overreliance on spurious correlations and the typical disconnect between causal discovery and predictive modeling. The authors propose TabOrder, the first framework to directly embed a learnable causal variable ordering into the prediction architecture. TabOrder infers a causal topological order via an unsupervised likelihood objective and restricts predictions to use only predecessor features in this causal sequence. It further introduces a causally constrained attention mechanism that jointly optimizes causal discovery, prediction, and missing value imputation. Theoretical analysis and experiments demonstrate that TabOrder accurately recovers causal orders, achieves strong performance in both prediction and imputation tasks, and yields interpretable insights on real-world biological intervention data.
📝 Abstract
In-context learning for tabular data sets strong predictive standards in observational settings; however, it relies primarily on correlational structure, which becomes unreliable under distribution shift or intervention. While established methods for discovering causal structure exist, they often focus on structure identifiability and are decoupled from the predictive architectures that could benefit from them. To bridge these perspectives, we study how to simultaneously infer causal structure, in the form of topological variable orderings, and enforce it in tabular prediction. Unlike standard architectures, our model TabOrder uses causal order-constrained attention, basing predictions only on features that precede a target under a learned causal order. Similar to causal discovery methods, TabOrder learns the optimal variable ordering in an unsupervised manner through a likelihood-based objective. We justify this choice under standard functional model classes and also study how sample missingness, a common challenge in tabular data, interacts with causal direction identification. Empirically, we confirm that TabOrder recovers accurate variable orderings while addressing prediction and imputation tasks, and that it gives insight into real-world biological data under intervention.
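To make the idea of order-constrained attention concrete, the following is a minimal NumPy sketch of how a variable ordering could be turned into an attention mask, so that each feature attends only to its predecessors. It is an illustration under stated assumptions, not TabOrder's implementation: the ordering is given as a fixed permutation rather than learned, and the function names `order_mask` and `masked_attention` are hypothetical.

```python
import numpy as np

def order_mask(order):
    """Boolean mask where mask[i, j] is True iff feature j precedes feature i.

    `order` lists feature indices from first (root-like) to last.
    Assumption: the ordering is fixed here; the paper learns it.
    """
    order = np.asarray(order)
    d = len(order)
    rank = np.empty(d, dtype=int)
    rank[order] = np.arange(d)  # rank[f] = position of feature f in the ordering
    return rank[None, :] < rank[:, None]

def masked_attention(scores, mask):
    """Row-wise softmax restricted to allowed entries.

    Rows with no allowed entry (the first feature in the ordering)
    receive all-zero weights instead of NaN.
    """
    masked = np.where(mask, scores, -np.inf)
    row_max = np.max(masked, axis=1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    weights = np.exp(masked - row_max)  # exp(-inf) evaluates to 0
    denom = weights.sum(axis=1, keepdims=True)
    return np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)

# Ordering: feature 2 first, then feature 0, then feature 1.
mask = order_mask([2, 0, 1])
weights = masked_attention(np.zeros((3, 3)), mask)
print(mask)
print(weights)
```

With uniform scores, feature 0 puts all its weight on feature 2, feature 1 splits its weight evenly between features 0 and 2, and feature 2, which has no predecessors, receives no attention weights at all.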
Problem

Research questions and friction points this paper is trying to address.

causal ordering
in-context learning
tabular data
distribution shift
intervention
Innovation

Methods, ideas, or system contributions that make the work stand out.

causal ordering
in-context learning
tabular prediction
causal attention
unsupervised structure learning
Sascha Xu
CISPA Helmholtz Center
Sarah Mameche
CISPA Helmholtz Center
Jilles Vreeken
CISPA Helmholtz Center for Information Security
Machine Learning, Causal Inference, Data Mining