Generalized Jeffreys's approximate objective Bayes factor: Model-selection consistency, finite-sample accuracy, and statistical evidence in 71,126 clinical trial findings

📅 2025-10-11
🤖 AI Summary
To address widespread misuse of p-values and misinterpretation of statistical significance, this paper defines eJAB, a generalized form of Jeffreys's approximate objective Bayes factor that is computed in closed form from only the p-value, sample size, and parameter dimension and gives an interpretable measure of evidence strength. The authors establish conditions under which eJAB is model-selection consistent and verify them for ten statistical tests, including t-tests and F-tests; 12 simulation studies assess finite-sample accuracy by comparing eJAB with Bayes factors computed by Markov chain Monte Carlo. An analysis of 71,126 results from ClinicalTrials.gov catalogs 4,088 candidate type I errors (findings with p-value ≤ α for which eJAB nevertheless favors the null) and 487 instances of the Jeffreys–Lindley paradox. An estimated 75% of clinical trial plans set α ≥ 0.05 as the target evidence threshold, and an estimated 35.5% of results significant at α = 0.05 carry evidence no stronger than anecdotal (eJAB < 3). eJAB is presented as a broadly applicable, easy-to-compute, objective Bayesian alternative to p-values.

📝 Abstract
Concerns about the misuse and misinterpretation of p-values and statistical significance have motivated alternatives for quantifying evidence. We define a generalized form of Jeffreys's approximate objective Bayes factor (eJAB), a one-line calculation that is a function of the p-value, sample size, and parameter dimension. We establish conditions under which eJAB is model-selection consistent and verify them for ten statistical tests. We assess finite-sample accuracy by comparing eJAB with Bayes factors computed by Markov chain Monte Carlo in 12 simulation studies. We then apply eJAB to 71,126 results from ClinicalTrials.gov (CTG) and find that the proportion of findings with $\text{p-value} \le \alpha$ yet $\text{eJAB}_{01} > 1$ (favoring the null) closely tracks the significance level $\alpha$, suggesting that such contradictions point to type I errors. We catalog 4,088 such candidate type I errors and provide details for 131 with reported $\text{p-value} \le 0.01$. We also identify 487 instances of the Jeffreys-Lindley paradox. Finally, we estimate that 75% (6%) of clinical trial plans from CTG set $\alpha \ge 0.05$ as the target evidence threshold, and that 35.5% (0.22%) of results significant at $\alpha = 0.05$ correspond to evidence that is no stronger than anecdotal under eJAB.
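The generalized eJAB formula itself is not reproduced on this page. As a rough illustration of how a Bayes factor can be read off a p-value and a sample size, the sketch below uses the simpler, widely cited Jeffreys-style approximation JAB_01 ≈ sqrt(n) · exp(−W/2) for a single tested parameter, where W is the Wald statistic recovered from a two-sided p-value. This is an assumption made for illustration, not the paper's generalized definition; the function name `jab01_one_dim` is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def jab01_one_dim(p_value: float, n: int) -> float:
    """Approximate Bayes factor for H0 over H1 with one tested parameter.

    Assumption: the simple form sqrt(n) * exp(-W / 2), not the paper's
    generalized eJAB, whose exact formula is not given on this page.
    """
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    if n < 1:
        raise ValueError("n must be a positive sample size")
    # Two-sided p-value -> |z| -> Wald statistic W = z^2 (chi-square, 1 df).
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    wald = z * z
    return sqrt(n) * exp(-0.5 * wald)


if __name__ == "__main__":
    # Values above 1 favor the null; values below 1 favor the alternative.
    for p, n in [(0.05, 100), (0.05, 10_000), (0.001, 100)]:
        print(f"p={p}, n={n}: JAB01 = {jab01_one_dim(p, n):.3f}")
```

Under this approximation, p = 0.05 with n = 100 already gives a value above 1, which is the kind of contradiction between significance and evidence that the abstract describes. Holding p fixed and increasing n pushes the value further toward the null.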
Problem

Research questions and friction points this paper is trying to address.

Developing a generalized Bayes factor alternative to p-values
Establishing model selection consistency for statistical tests
Assessing evidence reliability in clinical trial findings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generalized Jeffreys's approximate objective Bayes factor
One-line calculation using p-value, sample size, dimension
Model-selection consistency verified for ten statistical tests
Puneet Velidi
Department of Mathematics and Statistics, University of Victoria
Zhengxiao Wei
Department of Mathematics and Statistics, University of Victoria
Shreena Nisha Kalaria
Deeley Research Centre, British Columbia Cancer Agency
Yimeng Liu
University of California, Santa Barbara
Human-Computer Interaction, Human-AI Interaction, Human-Centered AI
Céline M. Laumont
Deeley Research Centre, British Columbia Cancer Agency
Brad H. Nelson
Deeley Research Centre, British Columbia Cancer Agency
Farouk S. Nathoo
Department of Mathematics and Statistics, University of Victoria