Estimation of causal dose-response functions under data fusion

📅 2025-10-21
🤖 AI Summary
Problem: Estimating causal dose-response functions over the full exposure range is difficult with a single data source, which may be insufficient to estimate responses precisely at every exposure level. Method: We propose a data fusion framework that draws on multiple sources partially aligned with the target distribution. Its core is a Neyman-orthogonal loss function tailored to data fusion, paired with a stochastic approximation that retains orthogonality while reducing computational cost. Applying kernel ridge regression with this approximation yields closed-form estimators. Contribution/Results: We show theoretically that incorporating additional data sources tightens the finite-sample regret bound and improves worst-case performance, as confirmed by comparison with a minimax lower bound. Simulation studies show improved estimation accuracy under data fusion, highlighting its potential for estimating non-smooth parameters such as causal dose-response functions.

📝 Abstract
Estimating the causal dose-response function is challenging, particularly when data from a single source are insufficient to estimate responses precisely across all exposure levels. To overcome this limitation, we propose a data fusion framework that leverages multiple data sources that are partially aligned with the target distribution. Specifically, we derive a Neyman-orthogonal loss function tailored for estimating the dose-response function within data fusion settings. To improve computational efficiency, we propose a stochastic approximation that retains orthogonality. We apply kernel ridge regression with this approximation, which provides closed-form estimators. Our theoretical analysis demonstrates that incorporating additional data sources yields tighter finite-sample regret bounds and improved worst-case performance, as confirmed via minimax lower bound comparison. Simulation studies validate the practical advantages of our approach, showing improved estimation accuracy when employing data fusion. This study highlights the potential of data fusion for estimating non-smooth parameters such as causal dose-response functions.
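To make the "closed-form estimators" remark concrete, the sketch below fits plain kernel ridge regression to a one-dimensional dose, first on one source and then on two pooled sources with different exposure ranges. It is an illustration under simplifying assumptions, not the paper's estimator: it uses naive pooling on raw outcomes, with no Neyman-orthogonal loss, no stochastic approximation, and no confounding adjustment. The Gaussian kernel, the bandwidth, the penalty `lam`, the sine-shaped true curve, and the two simulated sources are all hypothetical choices made for the example.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=0.5):
    """Gaussian (RBF) kernel matrix between 1-D dose vectors a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / bandwidth) ** 2)


def krr_fit(doses, outcomes, lam=1e-3, bandwidth=0.5):
    """Closed-form kernel ridge regression: alpha = (K + n*lam*I)^{-1} y."""
    n = doses.shape[0]
    K = gaussian_kernel(doses, doses, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), outcomes)


def krr_predict(alpha, train_doses, new_doses, bandwidth=0.5):
    """Evaluate the fitted curve at new dose levels."""
    return gaussian_kernel(new_doses, train_doses, bandwidth) @ alpha


rng = np.random.default_rng(0)
true_curve = np.sin  # hypothetical dose-response curve, for illustration only

# Hypothetical sources: source 1 observes doses in [0, 2], source 2 in [1.5, 4].
a1 = rng.uniform(0.0, 2.0, 200)
a2 = rng.uniform(1.5, 4.0, 200)
y1 = true_curve(a1) + 0.1 * rng.standard_normal(200)
y2 = true_curve(a2) + 0.1 * rng.standard_normal(200)

grid = np.linspace(0.0, 4.0, 41)  # full exposure range of interest

# Single-source fit: no data above dose 2, so the curve is unreliable there.
alpha_single = krr_fit(a1, y1)
pred_single = krr_predict(alpha_single, a1, grid)

# Pooled fit: the second source fills in the upper exposure range.
a_pool = np.concatenate([a1, a2])
y_pool = np.concatenate([y1, y2])
alpha_pool = krr_fit(a_pool, y_pool)
pred_pool = krr_predict(alpha_pool, a_pool, grid)

mse_single = np.mean((pred_single - true_curve(grid)) ** 2)
mse_pool = np.mean((pred_pool - true_curve(grid)) ** 2)
print(f"single-source MSE: {mse_single:.4f}, pooled MSE: {mse_pool:.4f}")
```

In this toy setup the pooled fit is more accurate over the full grid simply because the second source covers exposure levels the first never observes. The paper's setting is harder: sources are only partially aligned with the target distribution, which is why it replaces naive pooling with an orthogonal loss.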
Problem

Research questions and friction points this paper is trying to address.

Estimating causal dose-response functions with insufficient single-source data
Developing a data fusion framework that uses multiple partially aligned sources
Improving estimation accuracy and computational efficiency via orthogonal methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data fusion framework leverages multiple partially aligned sources
Neyman-orthogonal loss function tailored for dose-response estimation
Kernel ridge regression with stochastic approximation for efficiency