🤖 AI Summary
Discretization of observations distorts conditional independence (CI) relationships among underlying continuous variables, leading conventional CI tests—such as kernel-based or constraint-based methods—to produce erroneous inferences. To address this, we propose a novel bridging-equation framework that implicitly recovers the statistical structure of latent continuous variables solely from discretized data, enabling accurate CI testing. Our key contributions are threefold: (i) the first systematic identifiability analysis of the relationship between discretization mechanisms and latent continuous variables; (ii) construction of a new test statistic with a well-characterized asymptotic distribution; and (iii) theoretical guarantees on consistency and finite-sample Type-I error control. Extensive experiments on synthetic and real-world datasets demonstrate substantial improvements over state-of-the-art methods, significantly enhancing the robustness and reliability of causal discovery and Bayesian network learning under discretization.
📝 Abstract
Testing conditional independence has many applications, such as in Bayesian network learning and causal discovery, and a variety of test methods have been proposed. However, existing methods generally cannot work when only discretized observations are available. Specifically, suppose $X_1$, $\tilde{X}_2$ and $X_3$ are observed variables, where $\tilde{X}_2$ is a discretization of the latent variable $X_2$. Applying existing test methods to the observations of $X_1$, $\tilde{X}_2$ and $X_3$ can lead to a false conclusion about the underlying conditional independence of the variables $X_1$, $X_2$ and $X_3$. Motivated by this, we propose a conditional independence test specifically designed to accommodate the presence of such discretization. To achieve this, we design bridge equations to recover the parameter reflecting the statistical information of the underlying latent continuous variables. We also derive an appropriate test statistic and its asymptotic distribution under the null hypothesis of conditional independence. Both theoretical results and empirical validation are provided, demonstrating the effectiveness of our test method.
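The failure mode described in the abstract can be illustrated with a small simulation. The sketch below is not the proposed bridge-equation test; it only shows the problem that motivates it, using partial correlation as a stand-in CI measure under an assumed linear-Gaussian model. The coefficients, the sample size, the threshold at zero, and the helper name `partial_corr` are all illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing out z (with intercept)."""
    design = np.column_stack([np.ones(len(z)), z])
    # Residuals of x and y from an ordinary least-squares fit on z.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


rng = np.random.default_rng(0)
n = 20_000  # illustrative sample size

# Assumed toy model: X2 is a common cause of X1 and X3,
# so X1 and X3 are conditionally independent given X2.
x2 = rng.standard_normal(n)
x1 = 0.8 * x2 + 0.6 * rng.standard_normal(n)
x3 = 0.8 * x2 + 0.6 * rng.standard_normal(n)

# Only a binary discretization of X2 is observed (threshold at 0 is an assumption).
x2_tilde = (x2 > 0).astype(float)

pc_latent = partial_corr(x1, x3, x2)            # conditioning on the latent X2
pc_discrete = partial_corr(x1, x3, x2_tilde)    # conditioning on the discretized X2

print(f"partial corr given X2:       {pc_latent:.3f}")
print(f"partial corr given tilde X2: {pc_discrete:.3f}")
```

In this setup, conditioning on the latent $X_2$ should give a partial correlation near zero, while conditioning on $\tilde{X}_2$ leaves a clearly positive residual dependence, because the binary variable only partially blocks the path through $X_2$. A test applied naively to $(X_1, \tilde{X}_2, X_3)$ would therefore tend to reject a conditional independence that actually holds for $(X_1, X_2, X_3)$.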