A Conditional Independence Test in the Presence of Discretization

📅 2024-04-26

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Discretizing observations distorts the conditional independence (CI) relationships among the underlying continuous variables, so conventional CI tests, such as kernel-based tests, can produce erroneous inferences when applied to the discretized data. To address this, the paper proposes a bridge-equation framework that recovers, from discretized observations alone, the parameters reflecting the statistical structure of the latent continuous variables, which makes CI testing on those latent variables possible. The contributions are threefold: (i) an analysis of how discretized observations relate to the latent continuous variables; (ii) a test statistic whose asymptotic distribution under the null hypothesis of conditional independence is derived; and (iii) theoretical results supporting the validity of the test. Experiments on synthetic and real-world datasets are reported to show improvements over existing tests, with the aim of making causal discovery and Bayesian network learning more reliable under discretization.

📝 Abstract

Testing conditional independence has many applications, such as Bayesian network learning and causal discovery, and a variety of test methods have been proposed. However, existing methods generally cannot work when only discretized observations are available. Specifically, suppose $X_1$, $\tilde{X}_2$ and $X_3$ are the observed variables, where $\tilde{X}_2$ is a discretization of the latent variable $X_2$. Applying existing test methods to the observations of $X_1$, $\tilde{X}_2$ and $X_3$ can lead to a false conclusion about the underlying conditional independence of the variables $X_1$, $X_2$ and $X_3$. Motivated by this, we propose a conditional independence test specifically designed to accommodate such discretization. To achieve this, we design bridge equations that recover the parameters reflecting the statistical information of the underlying latent continuous variables. We also derive an appropriate test statistic and its asymptotic distribution under the null hypothesis of conditional independence. Both theoretical results and empirical validation are provided, demonstrating the effectiveness of our test.
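
The distortion described in the abstract can be reproduced with a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes a linear Gaussian model in which $X_1$ and $X_3$ are independent given $X_2$, assumes a binary discretization of $X_2$ at zero, and uses partial correlation as a stand-in for a generic CI measure. The function name `partial_corr` and all coefficients are hypothetical choices.

```python
import numpy as np

def partial_corr(a, b, c):
    """Partial correlation of a and b given c, from least-squares residuals on c."""
    design = np.column_stack([np.ones(len(c)), c])
    res_a = a - design @ np.linalg.lstsq(design, a, rcond=None)[0]
    res_b = b - design @ np.linalg.lstsq(design, b, rcond=None)[0]
    return float(np.corrcoef(res_a, res_b)[0, 1])

rng = np.random.default_rng(0)
n = 20_000

# Assumed model: X2 is a common cause, so X1 and X3 are independent given X2.
x2 = rng.standard_normal(n)
x1 = x2 + 0.5 * rng.standard_normal(n)
x3 = x2 + 0.5 * rng.standard_normal(n)

# Assumed discretization: only the sign of X2 is observed.
x2_tilde = (x2 > 0).astype(float)

pc_latent = partial_corr(x1, x3, x2)          # conditioning on the latent X2
pc_discrete = partial_corr(x1, x3, x2_tilde)  # conditioning on the discretized X2

print(f"given X2:       {pc_latent:.3f}")
print(f"given X2 tilde: {pc_discrete:.3f}")
```

Conditioning on the latent $X_2$ gives a partial correlation close to zero, while conditioning on $\tilde{X}_2$ leaves a clearly nonzero value, because the binary variable removes only part of the dependence that $X_2$ induces between $X_1$ and $X_3$. A test applied to the discretized data would therefore tend to reject an independence that holds for the latent variables.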

Problem

Research questions and friction points this paper is trying to address.

Testing conditional independence with discretized observations.

Existing methods can reach false conclusions when a latent continuous variable is observed only in discretized form.

Proposed test recovers latent continuous variable information.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops conditional independence test for discretized data.

Uses bridge equations to recover latent variable parameters.

Derives test statistic with asymptotic distribution analysis.
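
To give a feel for what an equation linking discretized observations to a latent parameter can look like, here is a minimal sketch. It is not the paper's construction: it assumes a standard bivariate Gaussian pair with both variables binarized at zero, and it uses the classical identity $P(X_1 > 0, X_2 > 0) = \tfrac{1}{4} + \arcsin(\rho) / (2\pi)$, which can be inverted to estimate the latent correlation $\rho$ from binary data alone. The function name `latent_corr_from_signs` is hypothetical.

```python
import numpy as np

def latent_corr_from_signs(b1, b2):
    """Invert p = 1/4 + arcsin(rho) / (2*pi), where p estimates P(b1 = 1, b2 = 1)."""
    p_hat = np.mean((b1 == 1) & (b2 == 1))
    return float(np.sin(2.0 * np.pi * (p_hat - 0.25)))

rng = np.random.default_rng(1)
n = 50_000
rho_true = 0.6  # assumed latent correlation for the simulation

cov = np.array([[1.0, rho_true], [rho_true, 1.0]])
latent = rng.multivariate_normal(np.zeros(2), cov, size=n)

# Only the binarized versions are treated as observed.
b1 = (latent[:, 0] > 0).astype(int)
b2 = (latent[:, 1] > 0).astype(int)

rho_bridge = latent_corr_from_signs(b1, b2)   # estimate of the latent correlation
rho_naive = float(np.corrcoef(b1, b2)[0, 1])  # correlation of the binary data itself

print(f"recovered latent correlation: {rho_bridge:.3f}")
print(f"naive binary correlation:     {rho_naive:.3f}")
```

Under these assumptions the inverted identity lands near the latent correlation, whereas the correlation computed directly on the binary variables is noticeably smaller. A test statistic built on the recovered parameter would still need its own sampling distribution, which is the part the paper derives.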

🔎 Similar Papers

Practical Kernel Tests of Conditional Independence