๐ค AI Summary
This paper addresses the challenge of estimating causal effects of high-dimensional continuous treatments under latent confounding. Methodologically, it proposes a proxy causal learning framework that avoids explicit density ratio estimation by leveraging proxy variables for both treatment and outcome. Specifically, it constructs a regularized estimator via kernel ridge regression, jointly imposing proxy constraints in a functional spaceโthereby enabling the first efficient implementation of the density-ratio-directed paradigm and yielding closed-form solutions for both the dose-response and conditional dose-response curves. Theoretically, the estimator is consistent and naturally accommodates high-dimensional continuous treatments. Empirical evaluations on synthetic and real-world datasets demonstrate that the method achieves superior or comparable estimation accuracy and stability relative to state-of-the-art approaches, significantly enhancing the practicality and reliability of causal inference for continuous treatments under latent confounding.
๐ Abstract
We address the setting of Proxy Causal Learning (PCL), which has the goal of estimating causal effects from observed data in the presence of hidden confounding. Proxy methods accomplish this task using two proxy variables related to the latent confounder: a treatment proxy (related to the treatment) and an outcome proxy (related to the outcome). Two approaches have been proposed to perform causal effect estimation given proxy variables; however only one of these has found mainstream acceptance, since the other was understood to require density ratio estimation - a challenging task in high dimensions. In the present work, we propose a practical and effective implementation of the second approach, which bypasses explicit density ratio estimation and is suitable for continuous and high-dimensional treatments. We employ kernel ridge regression to derive estimators, resulting in simple closed-form solutions for dose-response and conditional dose-response curves, along with consistency guarantees. Our methods empirically demonstrate superior or comparable performance to existing frameworks on synthetic and real-world datasets.