🤖 AI Summary
This work addresses the challenge of causal discovery from a single discrete event sequence, a regime that suffers from the absence of repeated samples, high dimensionality, and long-range dependencies. The authors propose the first approach that leverages autoregressive density estimation for causal discovery in such settings, using a pretrained autoregressive model to efficiently estimate conditional mutual information and thereby infer the causal graph among event types. The method accommodates time-lagged causal effects, scales linearly with sequence length, and exploits GPU parallelism for computational efficiency. Notably, it guarantees causal identifiability even under model misspecification. Experimental results demonstrate robust performance relative to diverse baselines and across varying vocabulary sizes, with a successful real-world application to root cause analysis in vehicle diagnostics involving 29,100 distinct event types.
📝 Abstract
We study causal discovery from a single observed sequence of discrete events generated by a stochastic process, as encountered in vehicle logs, manufacturing systems, or patient trajectories. This regime is particularly challenging due to the absence of repeated samples, high dimensionality, and long-range temporal dependencies within the single observation. We introduce TRACE, a scalable framework that repurposes autoregressive models as pretrained density estimators for conditional mutual information estimation. TRACE infers the summary causal graph between event types in a sequence, scaling linearly with the event vocabulary and supporting delayed causal effects, while being fully parallelizable on GPUs. We establish its theoretical identifiability under imperfect autoregressive models. Experiments demonstrate robust performance against diverse baselines and across varying vocabulary sizes, including an application to root-cause analysis in vehicle diagnostics with over 29,100 event types.
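The core idea — scoring candidate causal edges by how much information a density estimator says one event type carries about subsequent events — can be illustrated with a deliberately simplified sketch. Everything below is hypothetical and not the paper's actual method: a Laplace-smoothed bigram model stands in for the pretrained autoregressive model, the history is truncated to a single step, and the edge score reduces to a KL divergence between the conditional and marginal next-event distributions rather than a full conditional mutual information.

```python
import math
import random
from collections import Counter, defaultdict

def fit_bigram(seq, vocab, alpha=1.0):
    """Laplace-smoothed conditionals p(x_t | x_{t-1}) -- a toy stand-in
    for a pretrained autoregressive density estimator."""
    pair = defaultdict(Counter)
    for prev, cur in zip(seq, seq[1:]):
        pair[prev][cur] += 1
    def cond_p(prev, cur):
        num = pair[prev][cur] + alpha
        den = sum(pair[prev].values()) + alpha * len(vocab)
        return num / den
    return cond_p

def fit_marginal(seq, vocab, alpha=1.0):
    """Smoothed marginal p(x_t) over the whole sequence."""
    counts = Counter(seq)
    n = len(seq)
    return lambda x: (counts[x] + alpha) / (n + alpha * len(vocab))

def edge_score(cause, vocab, cond_p, marg_p):
    """KL( p(next | prev=cause) || p(next) ): how much seeing the
    candidate cause shifts the next-event distribution. Plays the role
    of the (conditional) mutual information score in this toy setting."""
    return sum(cond_p(cause, x) * math.log(cond_p(cause, x) / marg_p(x))
               for x in vocab)

# Synthetic single sequence: event "A" tends to trigger "B"; "C" is noise.
random.seed(0)
vocab = ["A", "B", "C"]
seq, prev = [], "C"
for _ in range(5000):
    weights = [0.1, 0.8, 0.1] if prev == "A" else [0.3, 0.3, 0.4]
    prev = random.choices(vocab, weights=weights)[0]
    seq.append(prev)

cond_p = fit_bigram(seq, vocab)
marg_p = fit_marginal(seq, vocab)
# The true cause "A" should score well above the noise event "C".
score_A = edge_score("A", vocab, cond_p, marg_p)
score_C = edge_score("C", vocab, cond_p, marg_p)
```

Since the fitted model is only read through its conditional probabilities, the edge scores for all candidate causes can be computed independently, which is the same property that lets the actual framework evaluate event types in parallel on GPUs.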