🤖 AI Summary
This paper investigates the conditions under which adding previously omitted covariates to a regression induces significance reversal—i.e., renders a previously insignificant coefficient statistically significant. Through analytical derivation and an algebraic decomposition of the t-statistic, it characterizes the dual impact of an added covariate on both the point estimate and the standard error. The analysis shows that reducing the standard error alone is insufficient for reversal; the covariate must also shift the point estimate itself. The paper derives, for the first time, closed-form lower bounds on the minimum correlations required among the covariate, the outcome, and the treatment variable, and formally proves that "pure variance compression" cannot reverse significance. Moreover, it establishes theoretical limits on p-value manipulation and provides a unified algebraic framework explaining common empirical phenomena in regression analyses, such as sign flips and significance reversals.
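The "dual impact" mentioned above follows from standard OLS algebra; a sketch in our own notation (not necessarily the paper's): for a treatment $d$ with estimated coefficient $\hat\beta$ and an added covariate $z$,

```latex
t = \frac{\hat\beta}{\widehat{\mathrm{se}}(\hat\beta)},
\qquad
\widehat{\mathrm{se}}(\hat\beta)
  = \sqrt{\frac{\hat\sigma^2}{\mathrm{SST}_d \,\bigl(1 - R^2_{d\mid z}\bigr)}},
```

where $\hat\sigma^2$ is the residual variance, $\mathrm{SST}_d = \sum_i (d_i - \bar d)^2$, and $R^2_{d\mid z}$ is the $R^2$ from regressing $d$ on $z$. Including $z$ can shrink $\hat\sigma^2$ (compressing the standard error) but also inflates the $1/(1 - R^2_{d\mid z})$ factor, and by the Frisch–Waugh–Lovell theorem $\hat\beta$ itself moves whenever $z$ is correlated with both $d$ and $y$ — so the t-statistic responds through both channels at once.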
📝 Abstract
We study the conditions under which the addition of variables to a regression equation can turn a previously statistically insignificant result into a significant one. Specifically, we characterize the minimum strength of association these variables must have, either with both the dependent and independent variables or with the dependent variable alone, to elevate the observed t-statistic above a specified significance threshold. Interestingly, we show that it is remarkably difficult to overturn a statistically insignificant result solely by reducing the standard error; in practice, included variables must also alter the point estimate to achieve such reversals. Our results can be used for sensitivity analysis and for bounding the extent of p-hacking, and they may also offer algebraic explanations for patterns of reversals seen in empirical research, such as those documented by Lenz and Sahn (2021).
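As an illustration of the reversal mechanism the abstract describes, here is a minimal simulation sketch (the data-generating process, variable names, sample size, and seed are our own illustrative choices, not taken from the paper): a covariate `z` correlated with both the treatment `d` and the outcome `y` masks the treatment effect when omitted, so the short regression looks insignificant while the long regression recovers a significant, shifted estimate.

```python
import numpy as np

def ols_t(y, X):
    """Plain OLS via least squares; returns coefficients and t-statistics."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)                       # residual variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X))) # classical std. errors
    return beta, beta / se

rng = np.random.default_rng(42)
n = 5000
z = rng.normal(size=n)                       # covariate, correlated with d and y
d = 0.5 * z + rng.normal(size=n)             # "treatment"
y = 0.2 * d - 0.5 * z + rng.normal(size=n)   # z masks the effect of d

# In this DGP the short-regression coefficient on d is exactly 0 in the
# population: cov(y, d) = 0.2 * var(d) - 0.5 * cov(z, d) = 0.25 - 0.25 = 0.
ones = np.ones(n)
_, t_short = ols_t(y, np.column_stack([ones, d]))          # z omitted
b_long, t_long = ols_t(y, np.column_stack([ones, d, z]))   # z included

print(f"t on d, short regression: {t_short[1]:.2f}")  # insignificant in the population
print(f"t on d, long regression:  {t_long[1]:.2f}")   # effect of 0.2 recovered
```

Note that the reversal here works exactly as the abstract suggests: including `z` shifts the point estimate on `d` (from roughly zero to roughly 0.2), rather than merely compressing the standard error.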