TxGemma: Efficient and Agentic LLMs for Therapeutics

📅 2025-04-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
High costs and failure rates in drug discovery call for AI models that combine predictive accuracy with scientific reasoning. Method: The paper introduces TxGemma, a suite of efficient generalist therapeutic LLMs (2B/9B/27B parameters) fine-tuned from Gemma-2 on a comprehensive dataset spanning small molecules, proteins, nucleic acids, diseases, and cell lines; conversational variants let scientists interact in natural language and obtain mechanistic reasoning for predictions. Building on TxGemma, Agentic-Tx is an agentic system powered by Gemini 2.5 that reasons, uses tools, manages diverse workflows, and acquires external domain knowledge. Contribution/Results: Across 66 therapeutic development tasks, TxGemma is superior or comparable to the state-of-the-art (SOTA) generalist model on 64 (superior on 45) and to SOTA specialist models on 50 (superior on 26), while requiring less training data than base LLMs when fine-tuned on downstream tasks such as clinical trial adverse event prediction. Agentic-Tx achieves a 52.3% relative improvement over o3-mini (high) on the Humanity's Last Exam Chemistry&Biology benchmark.

📝 Abstract
Therapeutic development is a costly and high-risk endeavor that is often plagued by high failure rates. To address this, we introduce TxGemma, a suite of efficient, generalist large language models (LLMs) capable of therapeutic property prediction as well as interactive reasoning and explainability. Unlike task-specific models, TxGemma synthesizes information from diverse sources, enabling broad application across the therapeutic development pipeline. The suite includes 2B, 9B, and 27B parameter models, fine-tuned from Gemma-2 on a comprehensive dataset of small molecules, proteins, nucleic acids, diseases, and cell lines. Across 66 therapeutic development tasks, TxGemma achieved superior or comparable performance to the state-of-the-art generalist model on 64 (superior on 45), and against state-of-the-art specialist models on 50 (superior on 26). Fine-tuning TxGemma models on therapeutic downstream tasks, such as clinical trial adverse event prediction, requires less training data than fine-tuning base LLMs, making TxGemma suitable for data-limited applications. Beyond these predictive capabilities, TxGemma features conversational models that bridge the gap between general LLMs and specialized property predictors. These allow scientists to interact in natural language, provide mechanistic reasoning for predictions based on molecular structure, and engage in scientific discussions. Building on this, we further introduce Agentic-Tx, a generalist therapeutic agentic system powered by Gemini 2.5 that reasons, acts, manages diverse workflows, and acquires external domain knowledge. Agentic-Tx surpasses prior leading models on the Humanity's Last Exam benchmark (Chemistry&Biology) with 52.3% relative improvement over o3-mini (high) and 26.7% over o3-mini (high) on GPQA (Chemistry) and excels with improvements of 6.3% (ChemBench-Preference) and 2.4% (ChemBench-Mini) over o3-mini (high).
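To make the property-prediction workflow concrete, here is a minimal sketch of how a prompt for a TxGemma-style predictive model might be composed for a small-molecule classification query. The template wording, task description, and answer format below are illustrative assumptions, not the paper's exact prompt specification.

```python
# Hypothetical sketch: building an instruction-style prompt for a
# therapeutic property-prediction query (template is an assumption,
# not the paper's verified format).

def build_property_prompt(task: str, smiles: str, question: str) -> str:
    """Compose a prompt asking a generalist therapeutic LLM to predict
    a property of a small molecule given as a SMILES string."""
    return (
        "Instructions: Answer the following question about drug properties.\n"
        f"Context: {task}\n"
        f"Question: {question}\n"
        f"Drug SMILES: {smiles}\n"
        "Answer:"
    )

prompt = build_property_prompt(
    task="Blood-brain barrier penetration (binary classification)",
    smiles="CC(=O)OC1=CC=CC=C1C(=O)O",  # aspirin
    question="Does this molecule cross the blood-brain barrier? (A) No (B) Yes",
)
print(prompt)
```

The resulting string would be passed to the model's generate call; for the conversational TxGemma variants described above, a follow-up turn could then ask the model to explain its prediction mechanistically.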
Problem

Research questions and friction points this paper is trying to address.

Addresses high failure rates in costly therapeutic development
Predicts therapeutic properties with interactive reasoning
Reduces training data needs for clinical applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient generalist LLMs for therapeutic prediction
Interactive reasoning with conversational models
Agentic system for diverse therapeutic workflows