A Knowledge-Informed Pretrained Model for Causal Discovery

📅 2026-03-21
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing causal discovery methods often rely on strong assumptions, require interventional data, or make little use of domain knowledge, which limits their practical deployment. This work proposes a pretraining framework that incorporates weak, coarse-grained prior domain knowledge in a principled manner. The method combines a dual-source encoder-decoder architecture with a curriculum learning strategy to jointly model observational data and prior knowledge, adapting to varying prior strengths, graph densities, and variable scales. Evaluated on in-distribution, out-of-distribution, and real-world datasets, the approach reports consistent improvements over existing baselines, along with robustness and practical applicability.

📝 Abstract
Causal discovery has been widely studied, yet many existing methods rely on strong assumptions or fall into two extremes: either depending on costly interventional signals or partial ground truth as strong priors, or adopting purely data-driven paradigms with limited guidance, which hinders practical deployment. Motivated by real-world scenarios where only coarse domain knowledge is available, we propose a knowledge-informed pretrained model for causal discovery that integrates weak prior knowledge as a principled middle ground. Our model adopts a dual-source encoder-decoder architecture to process observational data in a knowledge-informed way. We design a diverse pretraining dataset and a curriculum learning strategy that smoothly adapts the model to varying prior strengths across mechanisms, graph densities, and variable scales. Extensive experiments on in-distribution, out-of-distribution, and real-world datasets demonstrate consistent improvements over existing baselines, with strong robustness and practical applicability.
Problem

Research questions and friction points this paper is trying to address.

causal discovery
weak prior knowledge
data-driven paradigm
interventional signals
domain knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

knowledge-informed
pretrained model
causal discovery
weak prior
curriculum learning