🤖 AI Summary
Multi-objective molecular optimization remains a critical bottleneck in drug discovery: conventional trial-and-error approaches are inefficient, and existing AI methods struggle to optimize several pharmacologically relevant properties, such as selectivity and bioavailability, at the same time. To address this, we propose MultiMol, the first dual-agent collaborative large language model (LLM) framework for molecular design. The data-driven worker agent performs molecular graph modeling and Pareto-optimal frontier search over multiple objectives, while the literature-guided research agent provides scientific guidance through dynamic literature retrieval and knowledge distillation. The system integrates experimental-data fine-tuning, instruction tuning, and joint property prediction. Evaluated on six multi-objective benchmark tasks, MultiMol achieves an 82.30% success rate, compared with 27.50% for the strongest prior methods. It further demonstrates practical utility in two case studies: enhancing XAC’s selectivity for the A1 adenosine receptor (A1R) over A2AR, and improving Saquinavir’s oral bioavailability. Together, these results advance knowledge-enhanced, multi-objective molecular design.
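As a rough illustration of what a Pareto-optimal frontier search means in this setting, the sketch below filters a set of candidate molecules down to those that no other candidate beats on every objective. It is not MultiMol's implementation: the property names, the scores, and the "higher is better" convention are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical candidates: each maps a property name to a score where
# higher is better. Names and values are illustrative, not from the paper.
Candidate = Dict[str, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on every objective
    and strictly better on at least one."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


candidates = [
    {"selectivity": 0.9, "bioavailability": 0.2},
    {"selectivity": 0.6, "bioavailability": 0.7},
    {"selectivity": 0.5, "bioavailability": 0.6},  # dominated by the candidate above
    {"selectivity": 0.3, "bioavailability": 0.9},
]
front = pareto_front(candidates)
```

The three surviving candidates each represent a different trade-off between the two properties; choosing among them is where additional knowledge, such as literature-derived priors, could come in.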
📝 Abstract
Molecular optimization is a crucial yet complex and time-intensive process that often acts as a bottleneck in drug development. Traditional methods rely heavily on trial and error, making multi-objective optimization both time-consuming and resource-intensive. Current AI-based methods have shown limited success on multi-objective optimization tasks, which hampers their practical use. To address this challenge, we present MultiMol, a collaborative large language model (LLM) system designed to guide multi-objective molecular optimization. MultiMol comprises two agents: a data-driven worker agent and a literature-guided research agent. The worker agent is a large language model fine-tuned to generate optimized molecules under multiple objectives, while the research agent searches task-related literature for prior knowledge that helps identify the most promising optimized candidates. In evaluations across six multi-objective optimization tasks, MultiMol significantly outperforms existing methods, achieving an 82.30% success rate, in sharp contrast to the 27.50% success rate of the strongest existing methods. To further validate its practical impact, we tested MultiMol on two real-world challenges. First, we enhanced the selectivity of Xanthine Amine Congener (XAC), a promiscuous ligand that binds both A1R and A2AR, successfully biasing it towards A1R. Second, we improved the bioavailability of Saquinavir, an HIV-1 protease inhibitor with known bioavailability limitations. Overall, these results indicate that MultiMol is a highly promising approach to multi-objective molecular optimization, with the potential to accelerate drug development and advance pharmaceutical research.
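To make the success-rate figures concrete, here is a minimal sketch of how such a metric could be computed. The abstract does not state the success criterion, so the sketch assumes one: an optimization counts as a success only if every target property moves in its desired direction. The property names, directions, and values are hypothetical.

```python
from typing import Dict, List, Tuple

Properties = Dict[str, float]


def is_success(before: Properties, after: Properties,
               objectives: Dict[str, str]) -> bool:
    """Assumed criterion: every objective must move in its desired direction."""
    for name, direction in objectives.items():
        if direction == "increase":
            if not after[name] > before[name]:
                return False
        elif direction == "decrease":
            if not after[name] < before[name]:
                return False
        else:
            raise ValueError(f"unknown direction: {direction}")
    return True


def success_rate(pairs: List[Tuple[Properties, Properties]],
                 objectives: Dict[str, str]) -> float:
    """Percentage of (before, after) pairs that satisfy all objectives."""
    if not pairs:
        return 0.0
    wins = sum(is_success(before, after, objectives) for before, after in pairs)
    return 100.0 * wins / len(pairs)


# Illustrative two-objective task (names and numbers are made up).
objectives = {"selectivity": "increase", "toxicity": "decrease"}
pairs = [
    ({"selectivity": 0.4, "toxicity": 0.5}, {"selectivity": 0.6, "toxicity": 0.3}),
    ({"selectivity": 0.4, "toxicity": 0.5}, {"selectivity": 0.7, "toxicity": 0.6}),  # toxicity got worse
    ({"selectivity": 0.2, "toxicity": 0.8}, {"selectivity": 0.5, "toxicity": 0.4}),
    ({"selectivity": 0.5, "toxicity": 0.3}, {"selectivity": 0.9, "toxicity": 0.1}),
]
rate = success_rate(pairs, objectives)
```

Under this all-objectives-at-once criterion, improving one property while degrading another counts as a failure, which is one reason multi-objective success rates tend to be much lower than single-objective ones.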