ToolMol: Evolutionary Agentic Framework for Multi-objective Drug Discovery

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes ToolMol, a framework that combines a multi-objective genetic algorithm with a tool-augmented large language model (LLM) agent. It targets a known weakness of existing LLM-driven molecular generation methods, which often produce invalid or low-quality ligands because the models handle the syntax of molecular strings poorly. The agent applies precise chemical operations from the RDKit toolkit, guided by chain-of-thought reasoning, to iteratively optimize a ligand population for drug-like, synthesizable de novo small-molecule design. Across three protein targets, ToolMol generates candidates with over 10% stronger predicted binding affinity than existing methods, and its ligands score more than 35% better on Absolute Binding Free Energy.

📝 Abstract

Advances in large language models (LLMs) have recently opened new and promising avenues for small-molecule drug discovery. Yet existing LLM-based approaches for molecular generation often suffer from high rates of invalid and low-quality ligand candidates, a result of the syntactic limitations of current models with regard to molecular strings. In this paper, we introduce $\texttt{ToolMol}$, an evolutionary agentic framework for de novo drug design. $\texttt{ToolMol}$ combines a multi-objective genetic algorithm with an agentic LLM operator that iteratively updates the ligand population. We build a comprehensive toolbox of RDKit-backed functions that allows our agentic operator to consistently make precise ligand modifications. $\texttt{ToolMol}$ achieves state-of-the-art performance on multi-objective property optimization tasks, discovering drug-like and synthesizable ligands that have $>10\%$ stronger predicted binding affinity than existing methods across three protein targets. $\texttt{ToolMol}$ ligands additionally achieve state-of-the-art results in gold-standard Absolute Binding Free Energy scores, improving on existing methods by more than $35\%$. By studying chain-of-thought reasoning traces, we observe that tool-calling enables the model to more faithfully execute its planned modifications, efficiently exploiting the strong chemical prior knowledge in LLMs.
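
To make the "multi-objective genetic algorithm plus operator" idea in the abstract concrete, here is a minimal pure-Python sketch of one way such a loop could be organized. It is not the paper's implementation: the names `dominates`, `pareto_front`, and `evolve` are hypothetical, the `operator` callback merely stands in for the agentic LLM operator with its RDKit-backed tools, all objectives are assumed to be maximized, and the sum-of-scores fill rule is an arbitrary tie-breaker chosen for brevity.

```python
import random
from typing import Callable, List, Sequence, Tuple

Scores = Tuple[float, ...]  # assumption: every objective is maximized


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(population: Sequence[str],
                 score: Callable[[str], Scores]) -> List[str]:
    """Return the members that no other member dominates, in input order."""
    scored = [(m, score(m)) for m in population]
    return [m for m, s in scored if not any(dominates(t, s) for _, t in scored)]


def evolve(population: Sequence[str],
           score: Callable[[str], Scores],
           operator: Callable[[str, random.Random], str],
           generations: int,
           size: int,
           seed: int = 0) -> List[str]:
    """Hypothetical loop: propose children from the front, then reselect."""
    rng = random.Random(seed)
    pop = list(dict.fromkeys(population))  # drop duplicates, keep order
    for _ in range(generations):
        parents = pareto_front(pop, score)
        # In ToolMol this step would be the agentic LLM operator; here it is
        # an arbitrary callback that maps one candidate to a modified one.
        children = [operator(rng.choice(parents), rng) for _ in range(size)]
        pool = list(dict.fromkeys(pop + children))
        front = pareto_front(pool, score)
        # Crude fill rule (an assumption): rank the rest by summed objectives.
        rest = sorted((m for m in pool if m not in front),
                      key=lambda m: sum(score(m)), reverse=True)
        pop = (front + rest)[:size]
    return pop


# Toy stand-ins for ligands with two made-up objective values each.
toy = {"A": (1.0, 0.0), "B": (0.0, 1.0), "C": (0.5, 0.5), "D": (0.4, 0.4)}
print(pareto_front(list(toy), toy.__getitem__))  # -> ['A', 'B', 'C']
```

In the toy example, "D" is dropped because "C" is better on both objectives, while "A", "B", and "C" each represent a different trade-off. A real setup would replace the string keys with molecular representations and the score function with property predictors such as docking or synthesizability estimates.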

Problem

Research questions and friction points this paper is trying to address.

molecular generation

invalid ligands

low-quality candidates

syntactic limitations

drug discovery

Innovation

Methods, ideas, or system contributions that make the work stand out.

evolutionary agentic framework

multi-objective drug discovery

LLM-based molecular design