🤖 AI Summary
This study addresses the approximation bias that arises when continuous variables are binned in causal functional estimation, which can distort causal quantities such as mediation effects even when identification is correct. The paper shows that this bias stems from the change to the target functional itself rather than from statistical estimation error, and a Taylor expansion under smoothness assumptions shows it to be first order in the bin width. To mitigate it, the authors propose a bias-reduced functional that evaluates the outcome regression at within-bin conditional means, which eliminates the leading term and leaves a second-order approximation error. They derive plug-in and one-step estimators for the corrected functional. Simulations show that the approach substantially reduces bias even under coarse binning and yields confidence intervals whose empirical coverage is close to the nominal level.
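The role of the within-bin conditional mean can be seen from a generic Taylor argument. The sketch below is an illustration under assumed notation, not necessarily the paper's own derivation: $g$ stands for a twice continuously differentiable regression function, $I_k$ for a bin of width $h$, $r_k \in I_k$ for an arbitrary reference point, and $\bar m_k = E[M \mid M \in I_k]$ for the within-bin conditional mean.

```latex
% Second-order Taylor expansion of g around a reference point r_k in bin I_k,
% averaged over the within-bin distribution of M (xi lies between M and r_k):
\begin{align*}
E[g(M) \mid M \in I_k]
  &= g(r_k) + g'(r_k)\,(\bar m_k - r_k)
     + \tfrac{1}{2}\, E\!\left[ g''(\xi)\,(M - r_k)^2 \mid M \in I_k \right].
\end{align*}
% For a generic r_k the linear term is O(h), since |\bar m_k - r_k| \le h.
% Choosing r_k = \bar m_k makes the linear term vanish, so that
\begin{align*}
\left| E[g(M) \mid M \in I_k] - g(\bar m_k) \right|
  &\le \tfrac{1}{2} \sup_{m \in I_k} |g''(m)| \,\operatorname{Var}(M \mid M \in I_k)
   \le \frac{h^2}{8} \sup_{m \in I_k} |g''(m)| .
\end{align*}
```

The last inequality uses the fact that a random variable supported on an interval of width $h$ has variance at most $h^2/4$.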
📝 Abstract
A class of causal effect functionals requires integration over conditional densities of continuous variables, as in mediation effects and nonparametric identification in causal graphical models. Estimating such densities and evaluating the resulting integrals can be statistically and computationally demanding. A common workaround is to discretize the variable and replace integrals with finite sums. Although convenient, discretization alters the population-level functional and can induce non-negligible approximation bias, even under correct identification. Under smoothness conditions, we show that this coarsening bias is first order in the bin width and arises at the level of the target functional, distinct from statistical estimation error. We propose a simple bias-reduced functional that evaluates the outcome regression at within-bin conditional means, eliminating the leading term and yielding a second-order approximation error. We derive plug-in and one-step estimators for the bias-reduced functional. Simulations demonstrate substantial bias reduction and near-nominal confidence interval coverage, even under coarse binning. Our results provide a simple framework for controlling the impact of variable discretization on parameter approximation and estimation.
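The gap between the discretized and the bias-reduced functional can be illustrated at the population level with a toy mediation-type example. The sketch below is an assumption-laden illustration, not the paper's estimator or simulation design: the outcome regression `mu`, the exponentially tilted mediator densities, and the helper `functionals` are all hypothetical choices. It compares the continuous target $\int \mu(m)\, p(m \mid A=0)\, dm$ with a binned sum that plugs in bin-level outcome means from the $A=1$ arm, and with a version that evaluates `mu` at the within-bin conditional means of $M$ given $A=0$. It shows only the qualitative reduction in approximation error and does not attempt to reproduce the paper's rates.

```python
import numpy as np


def mu(m):
    # Hypothetical outcome regression E[Y | M=m, A=1]: smooth and mildly curved.
    return m + 0.5 * m**2


def density(m, rate):
    # Hypothetical unnormalised mediator density on [0, 1];
    # rate < 0 tilts mass towards 0, rate > 0 towards 1.
    return np.exp(rate * m)


def functionals(n_bins, n_grid=200_000):
    """Return (target, naive, corrected) for equal-width bins on [0, 1]."""
    # Fine midpoint grid used as quadrature nodes for all integrals.
    m = (np.arange(n_grid) + 0.5) / n_grid
    p0 = density(m, -3.0)
    p0 = p0 / p0.sum()          # quadrature weights for M | A=0
    p1 = density(m, 3.0)
    p1 = p1 / p1.sum()          # quadrature weights for M | A=1
    bins = np.minimum((m * n_bins).astype(int), n_bins - 1)

    # Continuous target: integral of mu(m) against the density of M | A=0.
    target = np.sum(mu(m) * p0)

    # Bin probabilities under each arm.
    mass0 = np.bincount(bins, weights=p0, minlength=n_bins)
    mass1 = np.bincount(bins, weights=p1, minlength=n_bins)

    # Discretized functional: bin-level outcome mean in the A=1 arm,
    # weighted by bin probabilities in the A=0 arm.
    y_bin1 = np.bincount(bins, weights=mu(m) * p1, minlength=n_bins) / mass1
    naive = np.sum(y_bin1 * mass0)

    # Bias-reduced version: evaluate mu at E[M | bin, A=0].
    m_bar0 = np.bincount(bins, weights=m * p0, minlength=n_bins) / mass0
    corrected = np.sum(mu(m_bar0) * mass0)

    return target, naive, corrected


if __name__ == "__main__":
    for k in (2, 4, 8, 16):
        target, naive, corrected = functionals(k)
        print(f"bins={k:2d}  naive error={naive - target:+.5f}  "
              f"corrected error={corrected - target:+.5f}")
```

Everything here is computed by numerical integration, so the errors printed are pure approximation errors of the functional, with no sampling noise. In practice the regression and the within-bin means would have to be estimated, which is where plug-in and one-step estimators come in.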