AI Summary
This study addresses the challenge of labeling large-scale clinical trial outcomes automatically and promptly, with the goal of improving predictive model performance and accelerating drug development. We construct the CTO knowledge base, covering around 125K trials, and propose the first multimodal, time-series dynamic label generation framework integrating news sentiment, stock price volatility, and literature semantics. We design a continuous-labeling paradigm aligned with the pharmaceutical R&D lifecycle and quantitatively characterize distributional shift in trial data from 2020 to 2024, the first such empirical analysis. Leveraging LLM-based semantic parsing, cross-temporal trial alignment, and an expert-curated annotation protocol, our method achieves an F1 score of 94 on Phase 3 trial labels and 91 across all phases. We publicly release a fully reproducible, updatable knowledge base and annotation dataset (https://chufangao.github.io/CTOD), enabling real-time updates and serving as a benchmark for downstream evaluation.
Abstract
Background: The global cost of drug discovery and development exceeds $200 billion annually, with clinical trial outcomes playing a critical role in the regulatory approval of new drugs and impacting patient outcomes. Despite their significance, large-scale, high-quality clinical trial outcome data are not readily available to the public, limiting advances in trial outcome predictive modeling. Methods: We introduce the Clinical Trial Outcome (CTO) knowledge base, a fully reproducible, large-scale (around 125K drug and biologics trials), open-source resource of clinical trial information, including large language model (LLM) interpretations of publications, trials matched across phases, sentiment analysis of news, stock prices of trial sponsors, and other trial-related metrics. From this knowledge base, we additionally manually annotated a set of recent clinical trials from 2020 to 2024. Results: We evaluated the quality of our knowledge base by generating trial outcome labels that show strong agreement with previously published expert annotations, achieving an F1 score of 94 for Phase 3 trials and 91 across all phases. Additionally, we benchmarked a suite of standard machine learning models on our manually annotated set, highlighting the distribution shift of recent trials and the need for continuously updated labeling methods. Conclusions: By analyzing CTO's performance on recent trials, we showed a need for recent, high-quality trial outcome labels. We release our knowledge base and labels to the public at https://chufangao.github.io/CTOD, which will also be regularly updated to support ongoing research in clinical trial outcomes, offering insights that could optimize the drug development process.
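The agreement figures in the Results are F1 scores between generated labels and expert annotations, reported per phase. The sketch below illustrates how such a per-phase comparison could be computed. It assumes binary labels with 1 meaning a successful trial and treats success as the positive class; the abstract does not specify the exact F1 variant used, and the function names and toy records are hypothetical, not drawn from the CTO data.

```python
from collections import defaultdict


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 with `positive` as the positive class (an assumption here)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_by_phase(records):
    """records: iterable of (phase, expert_label, generated_label) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for phase, expert, generated in records:
        grouped[phase][0].append(expert)
        grouped[phase][1].append(generated)
    return {phase: f1_score(t, p) for phase, (t, p) in grouped.items()}


# Hypothetical toy records, for illustration only.
toy_records = [
    ("Phase 3", 1, 1),
    ("Phase 3", 1, 1),
    ("Phase 3", 0, 1),
    ("Phase 3", 0, 0),
    ("Phase 2", 1, 0),
    ("Phase 2", 1, 1),
    ("Phase 2", 0, 0),
]
scores = f1_by_phase(toy_records)
for phase, score in sorted(scores.items()):
    print(f"{phase}: F1 = {score:.3f}")
```

Grouping by phase before scoring keeps a large phase from dominating the result, which is one plausible reason to report Phase 3 separately from the all-phase figure.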