🤖 AI Summary
In data-scarce settings, existing DAG ensemble methods for causal discovery rely solely on first-order edge-wise confidence scores, neglecting higher-order structural dependencies among edges (e.g., edge pairs, triplets).
Method: We propose the first higher-order edge-structure-aware causal graph ensemble framework, integrating bagging with bootstrap sampling and designing a novel aggregation algorithm that explicitly captures statistical dependencies between edges. This enables robust DAG estimation under low-sample-size and high-dimensional conditions.
Contribution/Results: Theoretically, we establish an interpretable probabilistic model for higher-order structural confidence. Empirically, our method significantly outperforms state-of-the-art approaches across multiple synthetic benchmarks—improving average structural Hamming distance accuracy by 23.6% in challenging small-sample (n < 50) and high-dimensional (p > 100) regimes—while maintaining computational efficiency.
📝 Abstract
Causal discovery combines data with knowledge provided by experts to learn the DAG representing the causal relationships between a given set of variables. When data are scarce, bagging is used to measure our confidence in an average DAG obtained by aggregating bootstrapped DAGs. However, the aggregation step has received little attention from the specialized literature: the average DAG is constructed using only the confidence in the individual edges of the bootstrapped DAGs, thus disregarding complex higher-order edge structures. In this paper, we introduce a novel theoretical framework based on higher-order structures and describe a new DAG aggregation algorithm. We perform a simulation study, discussing the advantages and limitations of the proposed approach. Our proposal is both computationally efficient and effective, outperforming state-of-the-art solutions, especially in low sample size regimes and under high dimensionality settings.