🤖 AI Summary
Calculating sample sizes for clinical trials requires specialized statistical expertise, which poses a significant practical barrier for researchers who are not statisticians. To address this, we propose PowerGPT, the first AI system to integrate large language models (LLMs) with a domain-specific statistical engine. Working directly from natural language input, it automatically identifies the appropriate statistical test, parses the hypothesis specification, and estimates the sample size for diverse procedures (e.g., t-tests, ANOVA, survival analysis). Its key innovation is an interpretable statistical decision chain that bridges the semantic gap between clinical domain knowledge and formal statistical methodology. In a randomized trial, PowerGPT achieved a 99.3% task completion rate and 94.1% accuracy in sample size estimation, and reduced average completion time from 9.3 to 4.0 minutes. These results demonstrate substantial improvements in the accessibility, efficiency, and reliability of clinical trial design.
📝 Abstract
Sample size calculations for power analysis are critical for clinical research and trial design, yet their complexity and reliance on statistical expertise create barriers for many researchers. We introduce PowerGPT, an AI-powered system that integrates large language models (LLMs) with statistical engines to automate test selection and sample size estimation in trial design. In a randomized trial evaluating its effectiveness, PowerGPT significantly improved task completion rates (99.3% vs. 88.9% for test selection, 99.3% vs. 77.8% for sample size calculation) and accuracy (94.1% vs. 55.4% in sample size estimation, p < 0.001), while reducing average completion time (4.0 vs. 9.3 minutes, p < 0.001). These gains were consistent across various statistical tests and benefited both statisticians and non-statisticians, helping to bridge expertise gaps. Already being deployed across multiple institutions, PowerGPT represents a scalable AI-driven approach that enhances accessibility, efficiency, and accuracy in statistical power analysis for clinical research.
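To make concrete the kind of calculation being automated, here is a minimal sketch of a sample size estimate for a two-sided, two-sample comparison of means. It is not PowerGPT's statistical engine, which the abstract does not describe. It assumes the textbook normal approximation n = 2((z_{1-α/2} + z_{1-β}) / d)² per group, where d is the standardized effect size (Cohen's d), and the function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group sample size for a two-sided, two-sample test of means.

    Uses the normal approximation (an assumption of this sketch); an exact
    t-based calculation typically gives a slightly larger answer.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the two-sided test
    z_beta = z.inv_cdf(power)             # quantile corresponding to the target power
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return math.ceil(n)                   # round up to a whole participant


if __name__ == "__main__":
    # Medium effect (d = 0.5), 5% significance level, 80% power.
    print(n_per_group(0.5))  # 63
```

Even this simple case requires the user to pick the right test, a standardized effect size, a significance level, and a target power, which is the expertise gap the abstract refers to.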