Bayesian Double Machine Learning for Causal Inference

📅 2025-08-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
In high-dimensional partially linear models, machine-learning regularization induces confounding bias in causal estimates, a phenomenon known as regularization-induced confounding. To address this, we propose the first fully Bayesian double machine learning (DML) framework. Our method modifies a standard Bayesian multivariate regression model to identify the causal effect from the reduced-form covariance matrix, integrating Bayesian modeling with DML through a fully generative probabilistic model that adheres to the likelihood principle and, unlike the naive estimator, does not implicitly assume away selection on observables. We establish theoretical guarantees: asymptotic unbiasedness, normality, and semiparametric efficiency, justified by a Bernstein–von Mises theorem. Simulation studies show that our approach substantially reduces mean squared error relative to existing Bayesian and frequentist methods while improving confidence interval coverage and narrowing interval width, indicating greater robustness and statistical efficiency.

📝 Abstract
This paper proposes a simple, novel, and fully Bayesian approach for causal inference in partially linear models with high-dimensional control variables. Off-the-shelf machine learning methods can introduce bias in the causal parameter known as regularization-induced confounding. To address this, we propose a Bayesian Double Machine Learning (BDML) method, which modifies a standard Bayesian multivariate regression model and recovers the causal effect of interest from the reduced-form covariance matrix. Our BDML is related to the burgeoning frequentist literature on DML while addressing its limitations in finite-sample inference. Moreover, the BDML is based on a fully generative probability model in the DML context, adhering to the likelihood principle. We show that in high-dimensional setups the naive estimator implicitly assumes no selection on observables, unlike our BDML. The BDML exhibits lower asymptotic bias and achieves asymptotic normality and semiparametric efficiency as established by a Bernstein-von Mises theorem, thereby ensuring robustness to misspecification. In simulations, our BDML achieves lower RMSE, better frequentist coverage, and shorter confidence interval width than alternatives from the literature, both Bayesian and frequentist.
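To make the setup concrete, here is a minimal sketch of the standard frequentist DML baseline that the paper builds on and compares against: cross-fitted nuisance estimation in a partially linear model, followed by a residual-on-residual regression. This is an illustration of generic DML with a simulated DGP and ridge as a stand-in nuisance learner, not the paper's Bayesian estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 20

# Simulated partially linear model: Y = theta*D + g(X) + e,  D = m(X) + v
X = rng.normal(size=(n, p))
g = X @ rng.normal(scale=0.5, size=p)   # outcome nuisance g(X)
m = X @ rng.normal(scale=0.5, size=p)   # treatment nuisance m(X)
theta = 2.0                             # true causal effect
D = m + rng.normal(size=n)
Y = theta * D + g + rng.normal(size=n)

def ridge_fit_predict(Xtr, ytr, Xte, lam=1.0):
    # Ridge regression as a stand-in for an arbitrary ML nuisance learner
    k = Xtr.shape[1]
    beta = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(k), Xtr.T @ ytr)
    return Xte @ beta

# Cross-fitting: fit nuisances on one fold, residualize the other
folds = np.array_split(rng.permutation(n), 2)
Y_res, D_res = np.empty(n), np.empty(n)
for k in range(2):
    te, tr = folds[k], folds[1 - k]
    Y_res[te] = Y[te] - ridge_fit_predict(X[tr], Y[tr], X[te])
    D_res[te] = D[te] - ridge_fit_predict(X[tr], D[tr], X[te])

# Neyman-orthogonal score: regress outcome residuals on treatment residuals
theta_hat = (D_res @ Y_res) / (D_res @ D_res)
print(theta_hat)  # should land near the true theta in this simulation
```

Cross-fitting is what removes the regularization-induced confounding in the frequentist approach; the paper's point is that this delivers only asymptotic guarantees, whereas BDML provides a generative model for finite-sample Bayesian inference.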
Problem

Research questions and friction points this paper is trying to address.

Regularization-induced confounding: off-the-shelf ML methods bias the causal parameter in high-dimensional partially linear models
Frequentist DML lacks a fully generative, likelihood-based probability model
Finite-sample inference: existing DML guarantees are asymptotic only
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian Double Machine Learning (BDML) recovering the causal effect from the reduced-form covariance matrix
Fully generative probability model adhering to the likelihood principle
Lower asymptotic bias with robustness to misspecification via a Bernstein-von Mises theorem
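The "reduced-form covariance matrix" idea can be sketched numerically: in the partially linear model with treatment error v and outcome error e, the reduced-form residuals of (D, Y) given X have covariance matrix [[s_v^2, theta*s_v^2], [theta*s_v^2, theta^2*s_v^2 + s_e^2]], so theta is identified as the off-diagonal entry divided by Var(v). The plug-in version below is only an illustration of that identification under independent structural errors; the paper's BDML instead places a prior over this model and does posterior inference.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 2000, 5
theta = 1.5

# Partially linear DGP with independent structural errors v and e
X = rng.normal(size=(n, p))
v = rng.normal(size=n)
e = rng.normal(size=n)
D = X @ np.ones(p) + v                       # treatment equation
Y = theta * D + 0.5 * (X @ np.ones(p)) + e   # outcome equation

# Reduced form: regress both D and Y on X, collect the residuals
B, *_ = np.linalg.lstsq(X, np.column_stack([D, Y]), rcond=None)
R = np.column_stack([D, Y]) - X @ B

# Reduced-form residual covariance; theta = Cov(D,Y | X) / Var(D | X)
Sigma = np.cov(R, rowvar=False)
theta_hat = Sigma[0, 1] / Sigma[0, 0]
print(theta_hat)
```

This makes explicit why the covariance matrix, rather than a single regression coefficient, carries the causal information that the Bayesian model parameterizes.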
Francis J. DiTraglia
Department of Economics, University of Oxford
Laura Liu
University of Pittsburgh
Econometrics · Panel Data · Macroeconometrics