🤖 AI Summary
This work addresses the challenge of causal discovery from a single discrete event sequence, a regime that suffers from the absence of repeated samples, high dimensionality, and long-range dependencies. The authors propose the first approach that leverages autoregressive density estimation for causal discovery in such settings, using a pretrained autoregressive model to efficiently estimate conditional mutual information and thereby infer the causal graph among event types. The method accommodates time-lagged causal effects, scales linearly with sequence length, and exploits GPU parallelism for computational efficiency. Notably, it guarantees causal identifiability even under model misspecification. Experimental results demonstrate robust performance relative to diverse baselines and across varying vocabulary sizes, with a successful real-world application to root cause analysis in vehicle diagnostics involving 29,100 distinct event types.
📝 Abstract
We study causal discovery from a single observed sequence of discrete events generated by a stochastic process, as encountered in vehicle logs, manufacturing systems, or patient trajectories. This regime is particularly challenging due to the absence of repeated samples, high dimensionality, and long-range temporal dependencies within the single observation. We introduce TRACE, a scalable framework that repurposes autoregressive models as pretrained density estimators for conditional mutual information estimation. TRACE infers the summary causal graph between event types in a sequence, scaling linearly with the event vocabulary and supporting delayed causal effects, while being fully parallelizable on GPUs. We establish its theoretical identifiability under imperfect autoregressive models. Experiments demonstrate robust performance against diverse baselines and across varying vocabulary sizes, including an application to root-cause analysis in vehicle diagnostics with over 29,100 event types.
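The core idea — scoring candidate causal edges by how much information a density estimator says one event type carries about subsequent events — can be illustrated with a deliberately simplified sketch. Everything below is hypothetical and not the paper's actual method: a Laplace-smoothed bigram model stands in for the pretrained autoregressive model, the history is truncated to a single step, and the edge score reduces to a KL divergence between the conditional and marginal next-event distributions rather than a full conditional mutual information.

```python
import math
import random
from collections import Counter, defaultdict

def fit_bigram(seq, vocab, alpha=1.0):
    """Laplace-smoothed conditionals p(x_t | x_{t-1}) -- a toy stand-in
    for a pretrained autoregressive density estimator."""
    pair = defaultdict(Counter)
    for prev, cur in zip(seq, seq[1:]):
        pair[prev][cur] += 1
    def cond_p(prev, cur):
        num = pair[prev][cur] + alpha
        den = sum(pair[prev].values()) + alpha * len(vocab)
        return num / den
    return cond_p

def fit_marginal(seq, vocab, alpha=1.0):
    """Smoothed marginal p(x_t) over the whole sequence."""
    counts = Counter(seq)
    n = len(seq)
    return lambda x: (counts[x] + alpha) / (n + alpha * len(vocab))

def edge_score(cause, vocab, cond_p, marg_p):
    """KL( p(next | prev=cause) || p(next) ): how much seeing the
    candidate cause shifts the next-event distribution. Plays the role
    of the (conditional) mutual information score in this toy setting."""
    return sum(cond_p(cause, x) * math.log(cond_p(cause, x) / marg_p(x))
               for x in vocab)

# Synthetic single sequence: event "A" tends to trigger "B"; "C" is noise.
random.seed(0)
vocab = ["A", "B", "C"]
seq, prev = [], "C"
for _ in range(5000):
    weights = [0.1, 0.8, 0.1] if prev == "A" else [0.3, 0.3, 0.4]
    prev = random.choices(vocab, weights=weights)[0]
    seq.append(prev)

cond_p = fit_bigram(seq, vocab)
marg_p = fit_marginal(seq, vocab)
# The true cause "A" should score well above the noise event "C".
score_A = edge_score("A", vocab, cond_p, marg_p)
score_C = edge_score("C", vocab, cond_p, marg_p)
```

Since the fitted model is only read through its conditional probabilities, the edge scores for all candidate causes can be computed independently, which is the same property that lets the actual framework evaluate event types in parallel on GPUs.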