🤖 AI Summary
This study addresses a limitation of prior research, which has established only correlations between code coverage and defect introduction without adequately controlling for confounding factors. Working with mature, real-world JavaScript/TypeScript open-source projects, we treat code coverage as a continuous exposure variable, construct a causal directed acyclic graph to identify confounders, and use generalized propensity scores combined with doubly robust regression to estimate both the average treatment effect and the dose-response relationship between coverage and defect introduction. The analysis is designed to detect nonlinear causal effects, such as thresholds or diminishing marginal returns, and to provide evidence grounded in causal inference for optimizing testing strategies.
📝 Abstract
Context: Code coverage is widely used as a software quality assurance measure. However, its effect, and specifically the advisable dose, are disputed in both the research and engineering communities. Prior work reports only correlational associations, leaving results vulnerable to confounding factors. Objective: We aim to quantify the causal effect of code coverage (exposure) on bug introduction (outcome) in the context of mature JavaScript and TypeScript open-source projects, addressing both the overall effect and how it varies across coverage levels. Method: We construct a causal directed acyclic graph to identify confounders within the software engineering process, modeling key variables from the source code, issue and review systems, and continuous integration. Using generalized propensity score adjustment, we apply doubly robust regression-based causal inference for continuous exposure to a novel dataset of bug-introducing and non-bug-introducing changes. We estimate the average treatment effect and the dose-response relationship to examine potential non-linear patterns (e.g., thresholds or diminishing returns) within the projects of our dataset.
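
To make the idea of generalized propensity score (GPS) adjustment for a continuous exposure concrete, here is a minimal sketch on synthetic data. It is not the study's estimator: it shows only stabilized inverse-GPS weighting under an assumed Gaussian exposure model, without the doubly robust outcome-model step. The single confounder, the continuous outcome (the study's outcome is whether a change introduces a bug), the effect sizes, and all function names are hypothetical and chosen purely for illustration.

```python
import numpy as np


def normal_pdf(z, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def stabilized_gps_weights(exposure, confounders):
    """Stabilized weights f(t) / f(t | x), assuming a Gaussian exposure model."""
    X = np.column_stack([np.ones(len(exposure)), confounders])
    beta, *_ = np.linalg.lstsq(X, exposure, rcond=None)  # OLS fit of t on x
    fitted = X @ beta
    resid_sd = np.std(exposure - fitted)
    gps = normal_pdf(exposure, fitted, resid_sd)  # f(t | x), the GPS
    marginal = normal_pdf(exposure, exposure.mean(), exposure.std())  # f(t)
    return marginal / gps


def weighted_slope(exposure, outcome, weights):
    """Slope of a weighted least-squares line of outcome on exposure."""
    design = np.column_stack([np.ones(len(exposure)), exposure])
    sw = np.sqrt(weights)  # scaling rows by sqrt(w) yields weighted least squares
    coef, *_ = np.linalg.lstsq(design * sw[:, None], outcome * sw, rcond=None)
    return coef[1]


def simulate(n=20000, seed=0):
    """Synthetic data with one confounder; the true exposure effect is -1."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                       # hypothetical confounder
    t = 0.5 * x + rng.normal(size=n)             # standardized coverage (exposure)
    y = -1.0 * t + 2.0 * x + rng.normal(size=n)  # defect-risk score (outcome)
    return x, t, y


x, t, y = simulate()
naive = weighted_slope(t, y, np.ones_like(t))             # ignores confounding
adjusted = weighted_slope(t, y, stabilized_gps_weights(t, x))
print(f"naive slope: {naive:.2f}, GPS-weighted slope: {adjusted:.2f}")
```

In this simulation the confounder raises both the exposure and the outcome, so the unadjusted slope understates the protective effect, while the weighted fit moves toward the true value of -1. Replacing the straight line with a spline or polynomial in the exposure would turn the same weighted fit into a dose-response curve that can expose thresholds or diminishing returns.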