🤖 AI Summary
To address the high computational complexity and poor scalability of causal discovery in high-dimensional time series, this paper proposes a hierarchical causal graph construction framework with linear time complexity. Methodologically, it introduces (1) a novel dynamics-based community detection approach grounded in information imbalance optimization to automatically identify strongly intra-coupled variable groups; (2) a community-level causal ordering mechanism enabling efficient modeling of inter-community dependencies; and (3) the first scalable causal discovery paradigm supporting both discrete- and continuous-time dynamical systems. Evaluated on synthetic and benchmark datasets up to 80 dimensions, the method achieves significantly higher causal structure recovery accuracy than state-of-the-art approaches. Computationally, it reduces runtime by over two orders of magnitude compared to standard algorithms such as PC and GES, and—critically—achieves the first linear scaling of causal discovery complexity with respect to the number of variables.
📝 Abstract
Understanding which parts of a dynamical system cause each other is extremely relevant in fundamental and applied sciences. However, inferring causal links from observational data, namely without direct manipulations of the system, is still computationally challenging, especially if the data are high-dimensional. In this study we introduce a framework for constructing causal graphs from high-dimensional time series, whose computational cost scales linearly with the number of variables. The approach is based on the automatic identification of dynamical communities, groups of variables which mutually influence each other and can therefore be described as a single node in a causal graph. These communities are efficiently identified by optimizing the Information Imbalance, a statistical quantity that assigns a weight to each putative causal variable based on its information content relative to a target variable. The communities are then ordered starting from the fully autonomous ones, whose evolution is independent from all the others, to those that are progressively dependent on other communities, building in this manner a community causal graph. We demonstrate the computational efficiency and the accuracy of our approach on time-discrete and time-continuous dynamical systems including up to 80 variables.