Sharp variance estimator and causal bootstrap in stratified randomized experiments

📅 2024-01-30
📈 Citations: 1
Influential: 0
🤖 AI Summary
In stratified randomized experiments, small sample sizes or skewed outcomes can make the Neyman-type variance estimator overly conservative and cause the normal approximation to fail, undermining the reliability of inference for the average treatment effect based on the weighted difference-in-means estimator. To address this, we propose a sharp variance estimator and two causal bootstrap methods. The first, based on rank-preserving imputation, is proven to achieve a second-order refinement over the normal approximation; the second, based on constant-treatment-effect imputation, also applies to paired experiments. The analysis is randomization-based, grounded in finite-population asymptotic theory, and relies solely on the randomness of treatment assignment. The methods are implemented in the open-source R package *CausalBootstrap*. Numerical studies and two real data applications demonstrate the advantages of the proposed methods in finite samples, particularly with small samples and skewed outcomes.

📝 Abstract
Randomized experiments are the gold standard for estimating treatment effects, and randomization serves as a reasoned basis for inference. In widely used stratified randomized experiments, randomization-based finite-population asymptotic theory enables valid inference for the average treatment effect, relying on normal approximation and a Neyman-type conservative variance estimator. However, when the sample size is small or the outcomes are skewed, the Neyman-type variance estimator may become overly conservative, and the normal approximation can fail. To address these issues, we propose a sharp variance estimator and two causal bootstrap methods to more accurately approximate the sampling distribution of the weighted difference-in-means estimator in stratified randomized experiments. The first causal bootstrap procedure is based on rank-preserving imputation, and we prove its second-order refinement over normal approximation. The second causal bootstrap procedure is based on constant-treatment-effect imputation and is further applicable in paired experiments. In contrast to traditional bootstrap methods, where randomness originates from hypothetical super-population sampling, our analysis for the proposed causal bootstrap is randomization-based, relying solely on the randomness of treatment assignment in randomized experiments. Numerical studies and two real data applications demonstrate advantages of our proposed methods in finite samples. The R package CausalBootstrap implementing our method is publicly available.
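To make the baseline concrete, the following is a minimal Python sketch of the weighted difference-in-means estimator together with a standard Neyman-type conservative variance estimator for a stratified experiment. It is an illustration only, not the sharp estimator proposed in the paper and not the API of the CausalBootstrap R package. The function name `stratified_estimate` and the input layout (one pair of treated and control outcome lists per stratum) are assumptions made for this example, and strata are assumed to be weighted by their share of the total sample.

```python
from statistics import mean, variance


def stratified_estimate(strata):
    """Weighted difference-in-means and a Neyman-type variance estimate.

    `strata` is a hypothetical input format: a list of
    (treated_outcomes, control_outcomes) pairs, one pair per stratum.
    Each arm needs at least two units so the sample variance exists.
    """
    n = sum(len(t) + len(c) for t, c in strata)
    tau_hat = 0.0
    var_hat = 0.0
    for treated, control in strata:
        # Stratum weight: share of all units that fall in this stratum.
        w = (len(treated) + len(control)) / n
        tau_hat += w * (mean(treated) - mean(control))
        # Conservative variance: sum of per-arm sample variances over arm sizes.
        var_hat += w ** 2 * (
            variance(treated) / len(treated) + variance(control) / len(control)
        )
    return tau_hat, var_hat


example = [
    ([3.0, 5.0], [1.0, 2.0]),
    ([10.0, 12.0, 14.0], [8.0, 9.0, 10.0]),
]
tau_hat, var_hat = stratified_estimate(example)
```

Note that `statistics.variance` raises an error when an arm contains a single unit, which is exactly the situation in paired experiments; this is one reason a different approach is needed there.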
Problem

Research questions and friction points this paper is trying to address.

Improve variance estimation in stratified experiments with small samples or skewed outcomes
Develop causal bootstrap methods that approximate the sampling distribution more accurately
Address conservativeness and normal approximation failure in Neyman-type estimators
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sharp variance estimator for stratified experiments
Rank-preserving imputation causal bootstrap
Constant-treatment-effect imputation causal bootstrap, also applicable to paired experiments
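The idea behind the constant-treatment-effect imputation bootstrap can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's exact procedure: missing potential outcomes are imputed by shifting observed outcomes by a single pooled estimate, treatment is then re-randomized within each stratum with the number of treated units held fixed, and the estimator is recomputed on each draw. The function names, the input layout, and the use of an unstudentized statistic are all assumptions made for this example.

```python
import random
from statistics import mean


def weighted_diff_in_means(strata):
    """Weighted difference-in-means over (treated, control) pairs per stratum."""
    n = sum(len(t) + len(c) for t, c in strata)
    return sum((len(t) + len(c)) / n * (mean(t) - mean(c)) for t, c in strata)


def constant_effect_bootstrap(strata, n_boot=2000, seed=0):
    """Randomization-based bootstrap draws under constant-effect imputation."""
    rng = random.Random(seed)
    tau_hat = weighted_diff_in_means(strata)

    # Impute both potential outcomes (Y(1), Y(0)) for every unit,
    # assuming the same effect tau_hat for all units (illustrative assumption).
    imputed = []
    for treated, control in strata:
        units = [(y, y - tau_hat) for y in treated]
        units += [(y + tau_hat, y) for y in control]
        imputed.append((units, len(treated)))

    draws = []
    for _ in range(n_boot):
        resampled = []
        for units, n_treated in imputed:
            # Re-randomize treatment within the stratum, keeping n_treated fixed.
            chosen = set(rng.sample(range(len(units)), n_treated))
            t = [units[i][0] for i in chosen]
            c = [units[i][1] for i in range(len(units)) if i not in chosen]
            resampled.append((t, c))
        draws.append(weighted_diff_in_means(resampled))
    return tau_hat, draws


example = [
    ([3.0, 5.0], [1.0, 2.0]),
    ([10.0, 12.0, 14.0], [8.0, 9.0, 10.0]),
]
tau_hat, draws = constant_effect_bootstrap(example)
```

The only source of randomness in the draws is the re-assignment of treatment, which mirrors the randomization-based perspective described in the abstract; no units are resampled from a hypothetical super-population. Because only arm means are used, this sketch also runs when each arm of a stratum holds a single unit, as in a paired design.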
Haoyang Yu
Department of Statistics and Data Science, Tsinghua University
Ke Zhu
Department of Statistics, North Carolina State University; Department of Biostatistics and Bioinformatics, Duke University
Hanzhong Liu
Tsinghua University
high dimensional statistics, causal inference