🤖 AI Summary
In drug discovery, AI tools are scattered across isolated platforms with incompatible interfaces and scripting environments, which makes workflows inefficient and repetitive. To address this, we propose FROGENT, an end-to-end agent that integrates dynamic biochemical databases, an extensible toolkit (covering molecule generation, virtual screening, retrosynthetic planning, and more), and task-specific AI models within a single framework. Built on a large language model (LLM) and the Model Context Protocol (MCP), FROGENT handles task understanding, adaptive workflow orchestration, and autonomous decision-making. Evaluated on eight benchmarks against six ReAct-style baseline agents, FROGENT triples the best baseline performance in hit-finding and doubles it in interaction profiling (a 100% improvement), significantly outperforming both the open-source Qwen3-32B and the commercial GPT-4o.
📝 Abstract
Powerful AI tools for drug discovery reside in isolated web apps, desktop programs, and code libraries. This fragmentation forces scientists to manage incompatible interfaces and specialized scripts, which is cumbersome and repetitive. To address this issue, a Full-pROcess druG dEsign ageNT, named FROGENT, is proposed. Specifically, FROGENT uses a Large Language Model and the Model Context Protocol to integrate multiple dynamic biochemical databases, extensible tool libraries, and task-specific AI models. This agentic framework allows FROGENT to execute complex drug discovery workflows dynamically, including component tasks such as target identification, molecule generation, and retrosynthetic planning. FROGENT was evaluated on eight benchmarks covering knowledge retrieval, property prediction, virtual screening, mechanistic analysis, molecular design, and synthesis. It was compared against six increasingly advanced ReAct-style agents that support code execution and literature search. The results show that FROGENT triples the best baseline performance in hit-finding and doubles it in interaction profiling, significantly outperforming both the open-source model Qwen3-32B and the commercial model GPT-4o. In addition, real-world case studies validate the practicality and generalizability of FROGENT. These results suggest that streamlining the agentic drug discovery pipeline can significantly enhance researcher productivity.
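
To make the idea of a unified tool layer concrete, the following is a minimal sketch of how heterogeneous tools might be registered under one interface and chained into a workflow. It is not FROGENT's implementation: the `ToolRegistry` class, the three tool names, and their placeholder return values are all hypothetical, the Model Context Protocol itself is not modeled, and the plan is a fixed list where an agent would presumably let the LLM choose the steps.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolRegistry:
    """Hypothetical registry mapping tool names to callables with one shared signature."""
    tools: Dict[str, Callable[[dict], dict]] = field(default_factory=dict)

    def register(self, name: str):
        # Decorator that stores the function under the given tool name.
        def decorator(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
            self.tools[name] = fn
            return fn
        return decorator

    def call(self, name: str, state: dict) -> dict:
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name](state)


registry = ToolRegistry()


@registry.register("identify_target")
def identify_target(state: dict) -> dict:
    # Placeholder: a real tool would query a biochemical database.
    return {**state, "target": f"target-for-{state['disease']}"}


@registry.register("generate_molecules")
def generate_molecules(state: dict) -> dict:
    # Placeholder: a real tool would call a generative model.
    return {**state, "candidates": ["candidate-1", "candidate-2"]}


@registry.register("plan_retrosynthesis")
def plan_retrosynthesis(state: dict) -> dict:
    # Placeholder: a real tool would call a retrosynthesis planner.
    routes = {c: f"route-for-{c}" for c in state["candidates"]}
    return {**state, "routes": routes}


def run_workflow(plan: List[str], state: dict) -> dict:
    """Run each named tool in order, threading a shared state dict through the steps."""
    for step in plan:
        state = registry.call(step, state)
    return state


if __name__ == "__main__":
    plan = ["identify_target", "generate_molecules", "plan_retrosynthesis"]
    print(run_workflow(plan, {"disease": "example-disease"}))
```

Passing a single state dictionary between tools keeps each tool independent of the others, so under this assumed design a new tool can be added by registering one more function without changing the workflow runner.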