Bridging Binarization: Causal Inference with Dichotomized Continuous Exposures

📅 2024-05-11

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

Dichotomizing a continuous exposure is common practice in causal effect estimation, yet its statistical validity and causal interpretation remain poorly understood. Method: The paper shows that the binarized average treatment effect (ATE) equals the difference in average outcomes under two modified treatment policies that impose cut-offs on the exposure. This equivalence makes explicit the assumption that the policies preserve “relative self-selection”, on which a causal interpretation of the binarized ATE rests. The paper also proposes a new target parameter that takes the status-quo world into account and, the authors argue, addresses more relevant causal questions than the traditional binarized ATE. Results: A simulation study illustrates the implications of these assumptions for data analysis and demonstrates how to implement estimators of the discussed parameters correctly. The work offers both a theoretical basis and practical guidance for binarizing continuous exposures in causal inference.


📝 Abstract

The average treatment effect (ATE) is a common parameter estimated in causal inference literature, but it is only defined for binary treatments. Thus, despite concerns raised by some researchers, many studies seeking to estimate the causal effect of a continuous treatment create a new binary treatment variable by dichotomizing the continuous values into two categories. In this paper, we affirm binarization as a statistically valid method for answering causal questions about continuous treatments by showing the equivalence between the binarized ATE and the difference in the average outcomes of two specific modified treatment policies. These policies impose cut-offs corresponding to the binarized treatment variable and assume preservation of relative self-selection. Relative self-selection is the ratio of the probability density of an individual having an exposure equal to one value of the continuous treatment variable versus another. The policies assume that, for any two values of the treatment variable with non-zero probability density after the cut-off, this ratio will remain unchanged. Through this equivalence, we clarify the assumptions underlying binarization and discuss how to properly interpret the resulting estimator. Additionally, we introduce a new target parameter that can be computed after binarization that considers the status-quo world. We argue that this parameter addresses more relevant causal questions than the traditional binarized ATE parameter. Finally, we present a simulation study to illustrate the implications of these assumptions when analyzing data and to demonstrate how to correctly implement estimators of the parameters discussed.
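The equivalence described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's simulation study: the data-generating process, the cut-off of 0.5, the single binary confounder `w`, and the helper names `outcome_mean`, `stratified_mean`, and `policy_mean` are all assumptions chosen for illustration. It compares a covariate-standardized binarized ATE with the contrast between two policies that redraw each exposure from the observed exposures on one side of the cut-off within the same covariate stratum, which leaves density ratios among the retained values unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
cutoff = 0.5  # hypothetical binarization threshold


def outcome_mean(a, w):
    # Hypothetical structural mean of Y given exposure a and confounder w.
    return 1.0 * a + 2.0 * w + 0.5 * a * w


# Observed (status-quo) data: w affects both the exposure and the outcome.
w = rng.binomial(1, 0.5, size=n)
a = rng.normal(loc=w, scale=1.0)
y = outcome_mean(a, w) + rng.normal(size=n)
high = a > cutoff  # binarized exposure


def stratified_mean(values, mask):
    # Standardize over w: sum over levels of P(W = level) * mean(values | mask, W = level).
    return sum(
        np.mean(w == level) * values[mask & (w == level)].mean()
        for level in (0, 1)
    )


# Binarized ATE, adjusted for w by stratification.
binarized_ate = stratified_mean(y, high) - stratified_mean(y, ~high)


def policy_mean(mask):
    # Redraw every exposure from the observed exposures on one side of the
    # cut-off within the same w stratum (the empirical truncated distribution),
    # then regenerate outcomes from the assumed structural model.
    a_new = np.empty(n)
    for level in (0, 1):
        idx = w == level
        pool = a[mask & idx]
        a_new[idx] = rng.choice(pool, size=idx.sum(), replace=True)
    y_new = outcome_mean(a_new, w) + rng.normal(size=n)
    return y_new.mean()


policy_contrast = policy_mean(high) - policy_mean(~high)

# Unadjusted comparison of the two binarized groups, for reference.
naive_diff = y[high].mean() - y[~high].mean()

print(f"binarized ATE (standardized): {binarized_ate:.3f}")
print(f"policy contrast:              {policy_contrast:.3f}")
print(f"naive difference:             {naive_diff:.3f}")
```

Under these assumptions the first two quantities should agree up to sampling noise, while the unadjusted difference is pulled away from them by confounding through `w`.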

Problem

Research questions and friction points this paper is trying to address.

Addressing causal inference with dichotomized continuous exposures

Clarifying assumptions underlying binarization method validity

Introducing improved target parameter for relevant causal questions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Equivalence between binarized ATE and modified treatment policies

Introducing a new target parameter benchmarked against the status-quo world

Preserving relative self-selection ratios through cutoff policies
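One way to write the relative self-selection condition and the resulting equivalence is sketched below. The notation is an assumption and not taken from the paper: A is the continuous exposure, W the covariates, c the cut-off, g(a | w) the observed conditional density of A, and g^{d_1}, g^{d_0} the exposure densities under the policies that keep exposures above and at or below the cut-off. The last line additionally assumes the usual identification conditions (no unmeasured confounding given W, and positive probability of both exposure groups in every stratum of W).

```latex
\begin{align}
% Ratios of densities are preserved among values retained by the policy:
\frac{g^{d_1}(a_1 \mid w)}{g^{d_1}(a_2 \mid w)}
  &= \frac{g(a_1 \mid w)}{g(a_2 \mid w)}
  \quad \text{for all } a_1, a_2 > c \text{ with } g(a_1 \mid w),\, g(a_2 \mid w) > 0 \\
% Together with g^{d_1} integrating to one over a > c, this gives a truncated density:
\Longrightarrow \quad g^{d_1}(a \mid w)
  &= \frac{\mathbb{1}(a > c)\, g(a \mid w)}{P(A > c \mid W = w)},
  \qquad
  g^{d_0}(a \mid w) = \frac{\mathbb{1}(a \le c)\, g(a \mid w)}{P(A \le c \mid W = w)} \\
% Difference in mean outcomes under the two policies equals the binarized ATE:
E\!\left[Y^{d_1}\right] - E\!\left[Y^{d_0}\right]
  &= E_W\!\big[\, E(Y \mid A > c, W) - E(Y \mid A \le c, W) \,\big]
\end{align}
```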

🔎 Similar Papers

Targeting Relative Risk Heterogeneity with Causal Forests