🤖 AI Summary
This study addresses confounding bias, estimation error, and instability in causal effect estimation under high-dimensional or small-sample settings. The authors propose Disentangled Double Machine Learning (DDML), which combines causal role disentanglement with residual dependence orthogonalization. DDML first separates covariates into confounders, treatment-specific factors, and outcome-specific factors so that the nuisance functions can be estimated more reliably. It then orthogonalizes the resulting residuals to reduce the dependence introduced by nuisance estimation errors. Experiments on synthetic, semi-synthetic, and real-world datasets show that DDML outperforms 13 state-of-the-art baseline methods in both MAE and RMSE.
📝 Abstract
Confounding bias is a key challenge in causal effect estimation from observational data. Double Machine Learning (DML) addresses this issue by estimating treatment and outcome nuisance functions, constructing treatment and outcome residuals, and estimating causal effects from the residuals. However, DML often produces biased and unstable estimates in high-dimensional or finite-sample scenarios. One reason is that DML estimates nuisance functions using all covariates without disentangling distinct latent factors, resulting in unreliable nuisance function estimation. Another is that imprecise nuisance estimation introduces residual dependence between the treatment residual and the remaining outcome error, undermining the accuracy of causal effect estimates. To address these issues, we propose Disentangled Double Machine Learning (DDML), a novel algorithm that integrates two key strategies. First, a causal role disentanglement strategy decomposes covariates into confounders, treatment-specific factors, and outcome-specific factors to enable reliable nuisance function estimation. Second, a residual dependence orthogonalization strategy mitigates the residual dependence caused by nuisance estimation errors to improve the precision of causal effect estimates. Experimental results on synthetic, semi-synthetic, and real-world datasets demonstrate that DDML significantly outperforms 13 state-of-the-art baseline algorithms in both MAE and RMSE.
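For readers unfamiliar with the baseline that DDML builds on, the following is a minimal sketch of the standard DML partialling-out procedure described in the abstract: estimate the two nuisance functions, form the treatment and outcome residuals, and regress one residual on the other. It does not implement DDML's disentanglement or residual orthogonalization strategies, which the abstract does not specify in detail. The linear nuisance models, the two-fold cross-fitting, the synthetic data, and the helper names `fit_linear` and `dml_partial_out` are all illustrative assumptions.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with an intercept; returns a prediction function.
    (Assumption: a linear model stands in for an arbitrary ML nuisance learner.)"""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ coef


def dml_partial_out(X, t, y, n_folds=2, seed=0):
    """Standard DML (partialling-out) with cross-fitting.
    Hypothetical helper for illustration, not the paper's DDML algorithm."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    t_res = np.empty(len(y))
    y_res = np.empty(len(y))
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Nuisance functions are fit on the training folds only,
        # then residuals are computed on the held-out fold.
        t_res[test] = t[test] - fit_linear(X[train], t[train])(X[test])
        y_res[test] = y[test] - fit_linear(X[train], y[train])(X[test])
    # Causal effect: regression of the outcome residual on the treatment residual.
    return float(t_res @ y_res / (t_res @ t_res))


# Synthetic data (assumed for illustration): X[:, 0] confounds t and y,
# X[:, 1] affects only the treatment, X[:, 2] affects only the outcome.
rng = np.random.default_rng(42)
n, p = 2000, 5
X = rng.normal(size=(n, p))
t = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
y = 2.0 * t + X[:, 0] - X[:, 2] + rng.normal(size=n)  # true effect is 2.0

theta_hat = dml_partial_out(X, t, y)
naive = float(t @ y / (t @ t))  # ignores confounding entirely
print(f"DML estimate: {theta_hat:.3f}, naive estimate: {naive:.3f}")
```

In this toy setup the three covariate roles are known by construction, yet the baseline feeds all covariates into both nuisance models. That is the practice the abstract identifies as a source of unreliable nuisance estimates when the dimension is high or the sample is small.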