🤖 AI Summary
To address the high computational cost of kernelized generalized score functions in large-scale causal discovery, namely cubic time complexity $O(n^3)$ and quadratic space complexity $O(n^2)$, this paper proposes, for the first time, a linear-complexity $O(n)$ approximate kernelized scoring function. Our method integrates low-rank kernel matrix approximation, composite matrix operation reduction, and a data-adaptive random sampling strategy accommodating heterogeneous data types, collectively reducing computational overhead. Experiments on synthetic and real-world benchmarks demonstrate up to three orders of magnitude speedup, substantial memory footprint reduction, and competitive causal graph recovery accuracy relative to state-of-the-art methods. Notably, the approach scales effectively to datasets with millions of samples, establishing a new, efficient, and robust paradigm for scalable causal structure learning.
📄 Abstract
Score-based causal discovery methods can effectively identify causal relationships by evaluating candidate graphs and selecting the one with the highest score. One popular class of scores is kernel-based generalized score functions, which can adapt to a wide range of scenarios and work well in practice because they circumvent assumptions about causal mechanisms and data distributions. Despite these advantages, kernel-based generalized score functions pose serious computational challenges in time and space, with a time complexity of $O(n^3)$ and a memory complexity of $O(n^2)$, where $n$ is the sample size. In this paper, we propose an approximate kernel-based generalized score function with $O(n)$ time and space complexities by using a low-rank technique, designing a set of rules to handle the complex composite matrix operations required to calculate the score, and developing sampling algorithms tailored to different data types so that diverse data can be handled efficiently. Our extensive causal discovery experiments on both synthetic and real-world data demonstrate that, compared to the state-of-the-art method, our method not only significantly reduces computational costs but also achieves comparable accuracy, especially on large datasets.