🤖 AI Summary
This work addresses a challenge in causal effect estimation: because the true causal graph is unknown, adjustment sets are often misspecified, which yields prediction intervals that either under-cover or are overly conservative. The authors propose CausalGuard, a framework that draws candidate graphs from a large language model–derived edge prior, prunes them with conditional-independence tests, and reweights them by the Bayesian Information Criterion to form a weighted doubly robust pseudo-outcome. Conformal calibration of this aggregated pseudo-outcome gives distribution-free, finite-sample marginal coverage for the pseudo-outcome itself; its conditional mean converges to the true Conditional Average Treatment Effect only under additional identification, overlap, and stability assumptions. Across five benchmark datasets, the method attains mean coverage above the nominal 90% level for the directly evaluable target and reduces interval width when graph-agnostic conformal baselines require large padding. In stress tests it suppresses invalid collider adjustment and remains stable under misspecified priors, provided the retained candidate set is data-supported.
📝 Abstract
Estimating treatment effects from observational data requires choosing an adjustment set, but valid adjustment depends on an unknown causal graph. Graph misspecification can cause under-coverage, while graph-agnostic conformal wrappers may regain nominal coverage only through large padding. We introduce CausalGuard, a structure-weighted conformal framework that calibrates after aggregating graph-conditional doubly robust pseudo-outcomes. Candidate DAGs are proposed from an LLM-derived edge prior, pruned by conditional-independence tests, and reweighted by Bayesian Information Criterion. A composite nonconformity score then calibrates the posterior-weighted pseudo-outcome. CausalGuard provides distribution-free finite-sample marginal coverage for this aggregated pseudo-outcome; under causal identification, overlap, conditional-mean nuisance stability, and concentration on target-aligned valid adjustment strategies, its conditional mean converges to the true Conditional Average Treatment Effect. Across five benchmarks, CausalGuard attains mean coverage above the nominal 90% level for the directly evaluable target and reduces width when graph-agnostic conformal baselines require large padding. Stress tests show that CausalGuard suppresses invalid collider adjustment and remains stable under misspecified priors when the retained candidate set is data-supported.
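The pipeline described in the abstract (weight candidate graphs, aggregate graph-conditional doubly robust pseudo-outcomes, then calibrate) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names are hypothetical, the weights use the common exp(-BIC/2) approximation, the pseudo-outcome is the standard AIPW form, and a plain absolute-residual score stands in for the paper's composite nonconformity score, which the abstract does not specify.

```python
import math

def bic_weights(bic_scores):
    """Map per-graph BIC scores (lower is better) to normalized weights.

    Assumption: weight proportional to exp(-BIC / 2); the minimum is
    subtracted first for numerical stability.
    """
    best = min(bic_scores)
    raw = [math.exp(-0.5 * (b - best)) for b in bic_scores]
    total = sum(raw)
    return [r / total for r in raw]

def aipw_pseudo_outcome(y, t, e, mu0, mu1):
    """Standard doubly robust (AIPW) pseudo-outcome for one unit.

    y: outcome, t: treatment in {0, 1}, e: propensity estimate,
    mu0 / mu1: outcome-model predictions under control / treatment.
    """
    return (mu1 - mu0) + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)

def aggregate(pseudo_by_graph, weights):
    """Weighted average of graph-conditional pseudo-outcomes, per unit."""
    n = len(pseudo_by_graph[0])
    return [
        sum(w * pg[i] for w, pg in zip(weights, pseudo_by_graph))
        for i in range(n)
    ]

def conformal_halfwidth(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest
    calibration score, or infinity if the calibration set is too small."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]

if __name__ == "__main__":
    # Toy calibration data: two candidate graphs, three calibration units.
    weights = bic_weights([100.0, 104.0])
    pseudo = aggregate([[1.0, 2.0, 0.5], [1.4, 1.8, 0.9]], weights)
    cate_hat = [1.1, 1.7, 0.8]  # hypothetical fitted CATE predictions
    scores = [abs(p - c) for p, c in zip(pseudo, cate_hat)]
    q = conformal_halfwidth(scores, alpha=0.5)
    print("interval for a new unit with prediction 1.0:", (1.0 - q, 1.0 + q))
```

The order of operations is the point of the sketch: the pseudo-outcomes are aggregated across graphs first and the calibration is applied once to the aggregate, so the resulting interval covers the aggregated pseudo-outcome rather than any single graph's estimate.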