Score-based Generative Modeling for Conditional Independence Testing

📅 2025-05-29

📈 Citations: 0

✨ Influential: 0

career value

176K/year

🤖 AI Summary

Addressing the challenges of conditional independence (CI) testing in high-dimensional settings—namely, statistical difficulty, inaccurate modeling, and training instability in generative approaches—this paper proposes the first CI testing framework based on score-based generative modeling. Our method introduces sliced conditional score matching and Langevin dynamics-based conditional sampling to enable stable and accurate modeling of complex conditional distributions. We theoretically derive an error bound for the test statistic and, for the first time in generative CI testing, rigorously control the Type I error rate. Furthermore, we integrate goodness-of-fit validation to enhance interpretability and statistical power. Extensive experiments on synthetic and real-world datasets demonstrate that our approach significantly outperforms state-of-the-art methods: it strictly maintains the nominal Type I error level while achieving average power gains of 12.7%–34.5%.

Technology Category

Application Category

📝 Abstract

Determining conditional independence (CI) relationships between random variables is a fundamental yet challenging task in machine learning and statistics, especially in high-dimensional settings. Existing generative model-based CI testing methods, such as those utilizing generative adversarial networks (GANs), often struggle with undesirable modeling of conditional distributions and training instability, resulting in subpar performance. To address these issues, we propose a novel CI testing method via score-based generative modeling, which achieves precise Type I error control and strong testing power. Concretely, we first employ a sliced conditional score matching scheme to accurately estimate conditional score and use Langevin dynamics conditional sampling to generate null hypothesis samples, ensuring precise Type I error control. Then, we incorporate a goodness-of-fit stage into the method to verify generated samples and enhance interpretability in practice. We theoretically establish the error bound of conditional distributions modeled by score-based generative models and prove the validity of our CI tests. Extensive experiments on both synthetic and real-world datasets show that our method significantly outperforms existing state-of-the-art methods, providing a promising way to revitalize generative model-based CI testing.

Problem

Research questions and friction points this paper is trying to address.

Testing conditional independence in high-dimensional data

Overcoming limitations of GAN-based CI testing methods

Ensuring precise Type I error control and power

Innovation

Methods, ideas, or system contributions that make the work stand out.

Score-based generative modeling for CI testing

Sliced conditional score matching for accuracy

Langevin dynamics for Type I error control

🔎 Similar Papers

PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation