🤖 AI Summary
Time-series causal discovery lacks benchmark datasets that combine ground-truth causal graphs with realistic temporal characteristics, in particular systematic modeling of nonstationarity (trends and seasonality), irregular sampling, and unobserved confounding. To address this gap, we propose TimeGraph, a comprehensive synthetic benchmark suite that generates linear and nonlinear time-series data via structural equation models. The framework explicitly incorporates stochastic trends, periodic components, sparse causal graphs, heterogeneous noise (Gaussian, heavy-tailed, heteroscedastic), and controllable latent confounders, and it varies causal graph density independently of the noise distribution. Each causal graph comes in two versions, with and without confounding, for rigorous evaluation. Experiments on state-of-the-art algorithms (PCMCI+, LPCMCI, FGES) reveal substantial performance degradation under nonstationary and confounded settings. All code, data-generation scripts, and standardized evaluation protocols are publicly released.
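To make the generation recipe concrete, here is a minimal sketch of a lagged structural equation model with a sparse random graph, a stochastic trend, a seasonal component, selectable noise, and an optional latent confounder. It is not the released TimeGraph generator: the function name `simulate_sem`, every parameter, the coefficient ranges, the additive way trend and seasonality enter the observations, and the AR(1) confounder are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_sem(n_steps=500, n_vars=4, max_lag=2, edge_prob=0.3, period=24,
                 nonlinear=False, noise="gaussian", confounded=False, seed=0):
    """Hypothetical sketch of a lagged SEM with trend, seasonality and confounding.

    Returns (data, graph). data has shape (n_steps, n_vars). graph[i, j, k] is
    True when variable i at lag k + 1 is a direct cause of variable j.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_steps)

    # Sparse ground-truth graph over lagged parents.
    graph = rng.random((n_vars, n_vars, max_lag)) < edge_prob
    # Scale coefficients so the incoming weights of each variable sum to < 1,
    # which keeps the autoregressive core from diverging (illustrative choice).
    coeffs = rng.uniform(0.2, 0.8, size=graph.shape) * graph / (n_vars * max_lag)

    # Noise families named in the summary; the parameters are assumptions.
    if noise == "gaussian":
        eps = rng.normal(size=(n_steps, n_vars))
    elif noise == "heavy_tailed":
        eps = rng.standard_t(df=3, size=(n_steps, n_vars))
    elif noise == "heteroscedastic":
        scale = 1.0 + 0.5 * np.sin(2 * np.pi * t / 50)  # time-varying std
        eps = rng.normal(size=(n_steps, n_vars)) * scale[:, None]
    else:
        raise ValueError(f"unknown noise type: {noise}")

    # Stochastic trend (random walk) and a periodic component per variable.
    trend = np.cumsum(rng.normal(scale=0.05, size=(n_steps, n_vars)), axis=0)
    phase = rng.uniform(0, 2 * np.pi, size=n_vars)
    season = 0.5 * np.sin(2 * np.pi * t[:, None] / period + phase)

    # Optional latent confounder: an unobserved AR(1) series driving two
    # variables. It is drawn last so both versions share graph and noise.
    conf = np.zeros((n_steps, n_vars))
    if confounded:
        shocks = rng.normal(size=n_steps)
        u = np.zeros(n_steps)
        for s in range(1, n_steps):
            u[s] = 0.7 * u[s - 1] + shocks[s]
        targets = rng.choice(n_vars, size=2, replace=False)
        conf[:, targets] = 0.8 * u[:, None]

    # Causal core: each variable depends on its lagged parents plus noise.
    x = np.zeros((n_steps, n_vars))
    for s in range(n_steps):
        for k in range(max_lag):
            if s - k - 1 < 0:
                continue
            past = np.tanh(x[s - k - 1]) if nonlinear else x[s - k - 1]
            x[s] += past @ coeffs[:, :, k]
        x[s] += eps[s] + conf[s]

    # Observed series: causal core plus nonstationary components.
    return x + trend + season, graph
```

Calling the function twice with the same seed, once with `confounded=False` and once with `confounded=True`, yields the same ground-truth graph with and without the hidden common cause, which mirrors the paired-version design described above.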
📝 Abstract
Robust causal discovery in time series depends on reliable benchmark datasets with known ground-truth causal relationships. However, such datasets remain scarce, and existing synthetic alternatives often overlook critical temporal properties of real-world data, including nonstationarity driven by trends and seasonality, irregular sampling intervals, and unobserved confounders. To address these challenges, we introduce TimeGraph, a comprehensive suite of synthetic time-series benchmark datasets that systematically incorporates both linear and nonlinear dependencies while modeling key temporal characteristics such as trends, seasonal effects, and heterogeneous noise patterns. Each dataset is accompanied by a fully specified causal graph, with graph density and noise distribution varied across datasets, and is provided in two versions: one with unobserved confounders and one without. This design offers extensive coverage of real-world complexity while preserving methodological neutrality. We further demonstrate the utility of TimeGraph through systematic evaluations of state-of-the-art causal discovery algorithms, including PCMCI+, LPCMCI, and FGES, across a diverse array of configurations and metrics. Our experiments reveal significant variations in algorithmic performance under realistic temporal conditions, underscoring the need for robust synthetic benchmarks for the fair and transparent assessment of causal discovery methods. The complete TimeGraph suite, including dataset generation scripts, evaluation metrics, and recommended experimental protocols, is freely available to facilitate reproducible research and foster community-driven advances in time-series causal discovery.
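Because every dataset ships with its ground-truth graph, an algorithm's output can be scored edge by edge. The abstract does not name the metrics used, so the sketch below assumes common choices (precision, recall, F1, and a plain Hamming distance over lagged adjacency entries). The function name `edge_metrics` and the boolean array layout are hypothetical and not part of the TimeGraph release.

```python
import numpy as np


def edge_metrics(true_graph, est_graph):
    """Compare an estimated adjacency array against the ground truth.

    Both inputs are boolean arrays of identical shape, where a True entry
    marks a directed (possibly lagged) edge. This layout is an assumption.
    """
    true_graph = np.asarray(true_graph, dtype=bool)
    est_graph = np.asarray(est_graph, dtype=bool)
    if true_graph.shape != est_graph.shape:
        raise ValueError("graphs must have the same shape")

    tp = int(np.sum(true_graph & est_graph))    # edges correctly recovered
    fp = int(np.sum(~true_graph & est_graph))   # spurious edges
    fn = int(np.sum(true_graph & ~est_graph))   # missed edges

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    # Number of adjacency entries on which the two graphs disagree.
    return {"precision": precision, "recall": recall, "f1": f1,
            "hamming": fp + fn}
```

Scoring the same algorithm on the confounded and unconfounded versions of one graph then isolates how much of the error comes from the latent variable, since the true edges are identical in both cases.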