How many patients could we save with LLM priors?

📅 2025-09-04

🤖 AI Summary

Problem: Safety assessment of adverse events in multicenter clinical trials requires large sample sizes and suffers from low statistical efficiency. Method: We propose a hierarchical Bayesian modeling framework that incorporates prior knowledge from large language models (LLMs). Rather than generating synthetic data, we elicit parametric priors from a pre-trained LLM and place them directly on the hyperparameters of the hierarchical model, so that external clinical expertise enters the safety model through its priors. Contribution/Results: Using temperature sensitivity analysis and cross-validation on real-world clinical trial data, we show that LLM-derived priors consistently improve predictive performance over traditional meta-analytical approaches. This points toward more efficient, expert-informed trial design: fewer patients may be needed to reach a robust safety assessment, with potential benefits for drug safety monitoring and regulatory decision-making.

📝 Abstract

Imagine a world where clinical trials need far fewer patients to achieve the same statistical power, thanks to the knowledge encoded in large language models (LLMs). We present a novel framework for hierarchical Bayesian modeling of adverse events in multi-center clinical trials, leveraging LLM-informed prior distributions. Unlike data augmentation approaches that generate synthetic data points, our methodology directly obtains parametric priors from the model. Our approach systematically elicits informative priors for hyperparameters in hierarchical Bayesian models using a pre-trained LLM, enabling the incorporation of external clinical expertise directly into Bayesian safety modeling. Through comprehensive temperature sensitivity analysis and rigorous cross-validation on real-world clinical trial data, we demonstrate that LLM-derived priors consistently improve predictive performance compared to traditional meta-analytical approaches. This methodology paves the way for more efficient and expert-informed clinical trial design, enabling substantial reductions in the number of patients required to achieve robust safety assessment and with the potential to transform drug safety monitoring and regulatory decision making.
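To make the link between an informative prior and patient numbers concrete, here is a minimal sketch under strong simplifying assumptions. It is not the paper's hierarchical model: it uses a single-site conjugate Beta-Binomial model for one adverse event rate, and the prior Beta(4, 36), the 10% event rate, and the target posterior standard deviation of 0.02 are all hypothetical values chosen for illustration.

```python
from math import sqrt


def beta_posterior(a, b, events, n):
    """Conjugate update: Beta(a, b) prior plus `events` adverse events in n patients."""
    return a + events, b + (n - events)


def beta_sd(a, b):
    """Standard deviation of a Beta(a, b) distribution."""
    total = a + b
    return sqrt(a * b / (total ** 2 * (total + 1)))


def patients_needed(a, b, true_rate, target_sd, max_n=100_000):
    """Smallest n whose posterior sd reaches target_sd.

    Assumes the observed event count equals its expectation,
    round(n * true_rate), which is an idealisation.
    """
    for n in range(max_n + 1):
        events = round(n * true_rate)
        if beta_sd(*beta_posterior(a, b, events, n)) <= target_sd:
            return n
    raise ValueError("target_sd not reached within max_n patients")


# Flat prior versus a hypothetical informative prior centred on a 10% rate.
flat = patients_needed(1, 1, true_rate=0.10, target_sd=0.02)
informed = patients_needed(4, 36, true_rate=0.10, target_sd=0.02)
print(f"flat prior: {flat} patients, informative prior: {informed} patients")
```

In this toy setting the saving is on the order of the prior's effective sample size, a + b, which acts like that many pseudo-patients. The saving only holds when the prior is centred near the true rate; a badly centred prior can bias the posterior instead.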

Problem

Research questions and friction points this paper is trying to address.

Reducing patient numbers needed in clinical trials using LLM knowledge

Improving adverse event modeling with LLM-informed Bayesian priors

Enhancing predictive performance over traditional meta-analytical approaches

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-informed hierarchical Bayesian priors

Parametric priors from model, not synthetic data

Systematic elicitation using pre-trained LLMs
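The elicitation idea in the list above can be sketched as follows. This is only an illustration: no real LLM is called, the replies in `REPLIES` are made-up strings, and the reply format `Beta(a, b)`, the helper names, and the choice of the prior mean as the sensitivity statistic are all assumptions rather than the paper's protocol.

```python
import re
from statistics import mean, pstdev

# Hypothetical raw replies keyed by sampling temperature (no LLM is queried).
REPLIES = {
    0.1: "Beta(4, 36)",
    0.5: "Beta(5, 40)",
    1.0: "Beta(3, 22)",
}

# Matches a parametric prior written as "Beta(a, b)" anywhere in a reply.
PRIOR_PATTERN = re.compile(r"Beta\(\s*([0-9.]+)\s*,\s*([0-9.]+)\s*\)")


def parse_beta(reply):
    """Extract the (a, b) parameters of a Beta prior from a text reply."""
    match = PRIOR_PATTERN.search(reply)
    if match is None:
        raise ValueError(f"no Beta(a, b) prior found in: {reply!r}")
    a, b = float(match.group(1)), float(match.group(2))
    if a <= 0 or b <= 0:
        raise ValueError(f"Beta parameters must be positive, got ({a}, {b})")
    return a, b


def temperature_sensitivity(replies):
    """Parse one prior per temperature and summarise how the prior mean moves."""
    priors = {temp: parse_beta(text) for temp, text in replies.items()}
    prior_means = [a / (a + b) for a, b in priors.values()]
    return priors, mean(prior_means), pstdev(prior_means)


priors, average_mean, spread = temperature_sensitivity(REPLIES)
print(priors, round(average_mean, 4), round(spread, 4))
```

A small spread across temperatures would suggest the elicited prior is stable with respect to sampling randomness; a large spread would be a reason to widen the prior or to distrust it.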
