🤖 AI Summary
This work addresses the challenges faced by large language models (LLMs) in molecular drug optimization, specifically their difficulty in modeling the complex relationships between molecular structure and pharmacological properties, compounded by a scarcity of labeled data. To overcome these limitations, the authors propose a framework that integrates an explicit pharmacological reasoning mechanism. By combining domain-specific continual pretraining, supervised fine-tuning based on reverse data engineering, and self-balanced multi-granular reinforcement learning, the approach introduces, for the first time, interpretable, stepwise pharmacological reasoning into LLM-driven molecular optimization. The method preserves molecular structural similarity and target binding affinity while significantly improving multiple ADMET properties, thereby enabling efficient, comprehensive, and interpretable drug molecule optimization and advancing knowledge-driven automated drug discovery.
📝 Abstract
Molecule generation and optimization is a fundamental task in the chemical domain. The rapid development of intelligent tools, especially large language models (LLMs) with extensive knowledge reserves and interactive capabilities, has provided new paradigms for this task. Nevertheless, the intrinsic challenge for LLMs lies in the complex, implicit relationship between molecular structure and pharmacological properties, together with the lack of corresponding labeled data. To bridge this gap, we propose DrugR, an LLM-based method that introduces explicit, step-by-step pharmacological reasoning into the optimization process. Our approach integrates domain-specific continual pretraining, supervised fine-tuning via reverse data engineering, and self-balanced multi-granular reinforcement learning. This framework enables DrugR to effectively improve key ADMET properties while preserving the original molecule's core efficacy. Experimental results demonstrate that DrugR achieves comprehensive enhancement across multiple properties without compromising structural similarity or target binding affinity. Importantly, its explicit reasoning process provides clear, interpretable rationales for each optimization step, yielding actionable design insights and advancing toward automated, knowledge-driven scientific discovery. Our code and model checkpoints are open-sourced to foster future research.