🤖 AI Summary
This study addresses the low target-matching fidelity and poor drug-likeness of de novo molecules generated for targeted therapeutics. We propose a protein-specific fine-tuning framework for molecular language models. Methodologically, we design a composite reward function that integrates drug–target interaction prediction with multi-dimensional molecular validity constraints—quantitative estimate of drug-likeness (QED), octanol–water partition coefficient (logP), and molecular weight (MW)—and perform end-to-end reinforcement-learning fine-tuning via the proximal policy optimization (PPO) algorithm. Our key contribution is the first incorporation of target-specific interaction modeling and chemical validity optimization into the generative language model training paradigm. Experimental results demonstrate that the generated molecules achieve QED = 65.37, MW = 321.55 Da, logP = 4.47, and structural novelty of 99.959%, significantly outperforming baseline approaches.
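The composite reward described above can be sketched as a weighted combination of a predicted drug–target interaction score and soft property windows. This is an illustrative reconstruction, not the paper's exact formulation: the weights, property ranges, and the `window` helper are assumptions, and the QED, logP, and MW values are presumed to be precomputed (e.g. with RDKit).

```python
def composite_reward(dti_score, qed, logp, mw,
                     w_dti=0.5, w_qed=0.3, w_prop=0.2):
    """Illustrative composite reward: DTI score plus drug-likeness terms.

    All weights and ranges below are assumptions for the sketch,
    not values reported in the paper.
    """
    def window(x, lo, hi, slack):
        # 1.0 inside the preferred range, decaying linearly to 0 outside it.
        if lo <= x <= hi:
            return 1.0
        d = (lo - x) if x < lo else (x - hi)
        return max(0.0, 1.0 - d / slack)

    logp_ok = window(logp, 0.0, 5.0, 2.0)     # Lipinski-style logP window
    mw_ok = window(mw, 200.0, 500.0, 150.0)   # molecular weight window (Da)
    prop = 0.5 * logp_ok + 0.5 * mw_ok
    return w_dti * dti_score + w_qed * qed + w_prop * prop
```

A molecule with a perfect DTI score and ideal properties would receive the maximum reward of 1.0 under these example weights; out-of-range logP or MW reduces the property term smoothly rather than zeroing the reward outright.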
📝 Abstract
Developing new drugs is laborious and costly, demanding extensive time investment. In this paper, we introduce a de novo drug design strategy that harnesses the capabilities of language models to devise targeted drugs for specific proteins. Using a Reinforcement Learning (RL) framework based on Proximal Policy Optimization (PPO), we fine-tune the model to learn a policy for generating drugs tailored to protein targets. The proposed method integrates a composite reward function that combines drug–target interaction and molecular validity considerations. Following RL fine-tuning, the proposed method demonstrates promising outcomes, yielding notable improvements in molecular validity, interaction efficacy, and key chemical properties, achieving a Quantitative Estimate of Drug-likeness (QED) of 65.37, a Molecular Weight (MW) of 321.55 Da, and an Octanol–Water Partition Coefficient (logP) of 4.47. Furthermore, only 0.041% of the generated drugs lack novelty.
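At the core of the PPO fine-tuning step is the clipped surrogate objective, which limits how far the updated generation policy can drift from the old one on each batch of sampled molecules. A minimal per-sample sketch of that objective follows; the function name and the default clip ratio `eps=0.2` are illustrative conventions from standard PPO, not details taken from the paper.

```python
import math

def ppo_clip_loss(logp_new, logp_old, advantage, eps=0.2):
    """Clipped PPO surrogate loss for a single sampled molecule (or token).

    logp_new / logp_old: log-probabilities of the sample under the
    current and pre-update policies; advantage: reward-derived advantage
    estimate. Returns the (negated) surrogate to be minimized.
    """
    ratio = math.exp(logp_new - logp_old)          # importance ratio
    unclipped = ratio * advantage
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps) * advantage
    # Taking the minimum makes the objective pessimistic: large policy
    # moves cannot be rewarded beyond the clip range.
    return -min(unclipped, clipped)
```

In practice this loss would be averaged over a batch of generated SMILES strings scored by the composite reward, with advantages computed against a value baseline; the clipping is what keeps fine-tuning from collapsing the pretrained chemical language model.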