Effect-Level Validation for Causal Discovery

📅 2026-02-09

🤖 AI Summary

In feedback-driven systems with strong self-selection, conventional causal discovery often fails to deliver reliable effect estimates because it relies heavily on the accuracy of the recovered graph structure. This work proposes an effect-centric, admissibility-first validation framework that treats causal graphs as testable structural hypotheses and evaluates them by identifiability, stability, and falsification. Moving beyond the prevailing practice of equating graph recovery accuracy with methodological validity, the framework emphasizes effect consistency for a specific causal query and reports that effect estimates can converge across distinct graph structures. Experiments on real-world game telemetry show that algorithms whose outputs satisfy identifiability conditions yield robust, consistent effect estimates, whereas those suffering from endpoint ambiguity produce threshold-sensitive or attenuated effects.
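To make the idea of effect estimates agreeing across different graph structures concrete, here is a minimal sketch on synthetic data. It is not the paper's method or data: the linear model, the coefficients, the variable names (`z`, `t`, `m`, `y`), and the helper `ols_coef` are all assumptions chosen for illustration. The data are generated with no direct treatment-to-outcome edge, and the total effect is estimated under two different graph hypotheses.

```python
import numpy as np

def ols_coef(y, *regressors):
    """OLS coefficients of y on the given regressors (intercept dropped)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]

# Synthetic linear data (all coefficients are illustrative assumptions).
rng = np.random.default_rng(0)
n = 50_000
z = rng.normal(size=n)                       # confounder driving self-selection
t = 0.8 * z + rng.normal(size=n)             # treatment
m = 0.5 * t + rng.normal(size=n)             # mediator
y = 0.6 * m + 0.7 * z + rng.normal(size=n)   # outcome, no direct t -> y edge

true_effect = 0.5 * 0.6  # total effect of t on y, carried entirely by m

# Hypothesis A: graph contains a direct t -> y edge; adjust for z (backdoor).
effect_a = ols_coef(y, t, z)[0]

# Hypothesis B: graph has only t -> m -> y; multiply the two path coefficients.
t_to_m = ols_coef(m, t, z)[0]
m_to_y = ols_coef(y, m, t, z)[0]
effect_b = t_to_m * m_to_y

print(f"true={true_effect:.3f}  A={effect_a:.3f}  B={effect_b:.3f}")
```

Under these assumptions both hypotheses identify the same total effect, so the two estimates should land close to each other even though the graphs differ on whether a direct edge exists.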

📝 Abstract

Causal discovery is increasingly applied to large-scale telemetry data to estimate the effects of user-facing interventions, yet its reliability for decision-making in feedback-driven systems with strong self-selection remains unclear. In this paper, we propose an effect-centric, admissibility-first framework that treats discovered graphs as structural hypotheses and evaluates them by identifiability, stability, and falsification rather than by graph recovery accuracy alone. Empirically, we study the effect of early exposure to competitive gameplay on short-term retention using real-world game telemetry. We find that many statistically plausible discovery outputs do not admit point-identified causal queries once minimal temporal and semantic constraints are enforced, highlighting identifiability as a critical bottleneck for decision support. When identification is possible, several algorithm families converge to similar, decision-consistent effect estimates despite producing substantially different graph structures, including cases where the direct treatment-outcome edge is absent and the effect is preserved through indirect causal pathways. These converging estimates survive placebo, subsampling, and sensitivity refutation. In contrast, other methods exhibit sporadic admissibility and threshold-sensitive or attenuated effects due to endpoint ambiguity. These results suggest that graph-level metrics alone are inadequate proxies for causal reliability for a given target query. Therefore, trustworthy causal conclusions in telemetry-driven systems require prioritizing admissibility and effect-level validation over causal structural recovery alone.
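The placebo and subsampling refutations mentioned in the abstract can be sketched as follows. This is an illustrative example on synthetic data, not the authors' pipeline: the data-generating process, the effect size of 0.4, the linear backdoor estimator, and the helper names `adjusted_effect` and `subsample_estimates` are assumptions. Sensitivity analysis is omitted for brevity.

```python
import numpy as np

def adjusted_effect(y, t, z):
    """Coefficient of t in an OLS regression of y on t and z.

    Acts as a backdoor-adjusted effect estimate under an assumed linear model.
    """
    X = np.column_stack([np.ones(len(y)), t, z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

# Synthetic data with self-selection into treatment (illustrative assumptions).
rng = np.random.default_rng(1)
n = 20_000
z = rng.normal(size=n)                                 # confounder
t = (0.8 * z + rng.normal(size=n) > 0).astype(float)   # self-selected exposure
y = 0.4 * t + 0.7 * z + rng.normal(size=n)             # outcome

estimate = adjusted_effect(y, t, z)

# Placebo refutation: shuffling the treatment breaks any causal link,
# so the re-estimated effect should be close to zero.
placebo = np.array(
    [adjusted_effect(y, rng.permutation(t), z) for _ in range(200)]
)

# Subsampling refutation: estimates on random half-samples should stay
# close to the full-sample estimate.
def subsample_estimates(frac=0.5, reps=200):
    k = int(frac * n)
    out = []
    for _ in range(reps):
        idx = rng.choice(n, size=k, replace=False)
        out.append(adjusted_effect(y[idx], t[idx], z[idx]))
    return np.array(out)

subs = subsample_estimates()

print(f"estimate={estimate:.3f}")
print(f"placebo mean={placebo.mean():.3f}")
print(f"subsample mean={subs.mean():.3f}, std={subs.std():.3f}")
```

An estimate that passes these checks shows a placebo mean near zero and a tight subsample distribution centered on the full-sample value; a large placebo effect or a wide subsample spread would count as evidence against the structural hypothesis behind the estimate.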

Problem

Research questions and friction points this paper is trying to address.

causal discovery

effect-level validation

identifiability

telemetry data

decision support

Innovation

Methods, ideas, or system contributions that make the work stand out.

effect-level validation

causal discovery

identifiability