🤖 AI Summary
This work addresses causal discovery from time series driven by a small number of root causes (e.g., financial data). The goal is to jointly infer (i) the directed acyclic graphs (DAGs) that encode instantaneous and lagged causal effects among nodes and (ii) the location and time of the underlying root causes, that is, the unknown nodes and time points at which they occur.
Method: We propose DAG-TFRC, a gradient-based, differentiable DAG learning framework built upon structural vector autoregression. It models instantaneous and lagged dependencies jointly and encodes the assumption of few root causes through sparsity-inducing optimization under continuous DAG constraints.
Contribution/Results: On synthetic data with few root causes, DAG-TFRC outperforms prior work in both accuracy and runtime and scales to thousands of nodes. On real S&P 500 data, it clusters stocks by sector and identifies major stock movements as root causes.
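The summary refers to continuous DAG constraints without naming one. As a minimal sketch, the block below shows one well-known differentiable acyclicity penalty, the polynomial form h(W) = tr((I + W∘W/d)^d) − d, which is zero exactly when the weighted graph W has no directed cycles. Whether DAG-TFRC uses this particular penalty is an assumption; the function name `acyclicity_penalty` is hypothetical.

```python
import numpy as np


def acyclicity_penalty(w: np.ndarray) -> float:
    """Polynomial acyclicity penalty: tr((I + W∘W/d)^d) - d.

    Assumption: this is one common continuous DAG constraint, not
    necessarily the one used by DAG-TFRC.
    """
    d = w.shape[0]
    # The elementwise square makes every edge weight non-negative, so
    # any directed cycle contributes a positive amount to the trace.
    m = np.eye(d) + (w * w) / d
    return float(np.trace(np.linalg.matrix_power(m, d)) - d)


# A strictly upper-triangular matrix is a DAG: the penalty vanishes.
dag = np.triu(np.ones((4, 4)), k=1)
# A two-node cycle (0 -> 1 -> 0) yields a strictly positive penalty.
cycle = np.array([[0.0, 1.0], [1.0, 0.0]])

print(acyclicity_penalty(dag))
print(acyclicity_penalty(cycle))
```

In gradient-based DAG learning, a penalty of this kind is typically added to the data-fit and sparsity terms so that the optimizer is pushed toward acyclic weight matrices.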
📝 Abstract
We introduce DAG-TFRC, a novel method for learning directed acyclic graphs (DAGs) from time series with few root causes. By this, we mean that the data are generated by a small number of events at certain, unknown nodes and time points under a structural vector autoregression model. For such data, we (i) learn the DAGs representing both the instantaneous and time-lagged dependencies between nodes, and (ii) discover the location and time of the root causes. For synthetic data with few root causes, DAG-TFRC shows superior performance in accuracy and runtime over prior work, scaling up to thousands of nodes. Experiments on simulated and real-world financial data demonstrate the viability of our sparse root cause assumption. On S&P 500 data, DAG-TFRC successfully clusters stocks by sectors and discovers major stock movements as root causes.
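To make the data assumption concrete, the following sketch simulates a structural vector autoregression driven by a handful of root causes. It assumes the model form X_t = X_t W_0 + Σ_k X_{t-k} W_k + C_t, where W_0 is an acyclic instantaneous graph, W_k are lagged graphs, and C is a mostly-zero matrix whose few nonzero entries are the root causes. The exact equation, the weight ranges, and the function name `simulate_svar_few_root_causes` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def simulate_svar_few_root_causes(n_nodes=5, n_steps=50, n_lags=2,
                                  n_root_causes=4, seed=0):
    """Simulate X_t = X_t W_0 + sum_k X_{t-k} W_k + C_t with sparse C.

    All parameter choices here are hypothetical, for illustration only.
    """
    rng = np.random.default_rng(seed)
    # Instantaneous graph: strictly upper triangular, hence acyclic.
    w0 = np.triu(rng.uniform(0.2, 0.5, (n_nodes, n_nodes)), k=1)
    # Lagged graphs: small weights keep the process from blowing up.
    w_lag = [rng.uniform(-0.1, 0.1, (n_nodes, n_nodes))
             for _ in range(n_lags)]
    # Root causes: a few nonzero entries at random (time, node) positions.
    c = np.zeros((n_steps, n_nodes))
    positions = rng.choice(n_steps * n_nodes, size=n_root_causes,
                           replace=False)
    c.flat[positions] = rng.uniform(1.0, 2.0, size=n_root_causes)
    # Solve the instantaneous part: X_t (I - W_0) = lagged terms + C_t.
    inv = np.linalg.inv(np.eye(n_nodes) - w0)
    x = np.zeros((n_steps, n_nodes))
    for t in range(n_steps):
        lagged = sum(x[t - k - 1] @ w_lag[k]
                     for k in range(n_lags) if t - k - 1 >= 0)
        x[t] = (lagged + c[t]) @ inv
    return x, c, w0, w_lag


x, c, w0, w_lag = simulate_svar_few_root_causes()
# Given the true graphs, the root causes are recovered as the residual.
t = 10
residual = x[t] - x[t] @ w0 - x[t - 1] @ w_lag[0] - x[t - 2] @ w_lag[1]
print(np.allclose(residual, c[t]))
```

The final lines illustrate why learning the graphs and locating the root causes go together: once W_0 and the W_k are known, the root causes are the residual of the structural equation, and the few-root-causes assumption says this residual should be sparse.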