Improving Targeted Molecule Generation through Language Model Fine-Tuning Via Reinforcement Learning

📅 2024-05-10
📈 Citations: 2
Influential: 0
🤖 AI Summary
This study addresses the low target-matching fidelity and poor drug-likeness of de novo molecules generated for targeted therapeutics. We propose a protein-specific fine-tuning framework for molecular language models. Methodologically, we design a composite reward function integrating drug–target interaction prediction with multi-dimensional molecular validity constraints—quantitative estimate of drug-likeness (QED), octanol–water partition coefficient (logP), and molecular weight (MW)—and fine-tune the model end-to-end with reinforcement learning via the proximal policy optimization (PPO) algorithm. Our key contribution is the first incorporation of target-specific interaction modeling and chemical validity optimization into the generative language model training paradigm. Experimental results demonstrate that the generated molecules achieve QED = 65.37, MW = 321.55 Da, logP = 4.47, and structural novelty of 99.959%, significantly outperforming baseline approaches.
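The composite reward described above can be sketched as a weighted blend of a drug–target interaction score and a validity term built from QED, logP, and MW. This is a minimal illustrative sketch, not the authors' implementation: the function names, weights, and property ranges (Lipinski-like windows) are assumptions.

```python
# Hedged sketch of a composite reward combining a drug-target interaction
# (DTI) score with molecular-validity constraints (QED, logP, MW).
# Weights and property windows below are illustrative assumptions.

def in_range(value, lo, hi):
    """Return 1.0 if the property falls inside the desired window, else 0.0."""
    return 1.0 if lo <= value <= hi else 0.0

def composite_reward(dti_score, qed, logp, mw, w_dti=0.5, w_valid=0.5):
    """Blend interaction affinity with a drug-likeness validity term."""
    validity = (qed                          # QED already lies in [0, 1]
                + in_range(logp, -0.4, 5.6)  # Lipinski-like logP window
                + in_range(mw, 150.0, 500.0) # typical drug-like MW (Da)
                ) / 3.0
    return w_dti * dti_score + w_valid * validity
```

In the paper's setting the DTI term would come from a trained interaction predictor and the properties from a chemistry toolkit such as RDKit; here they are plain inputs so the scoring logic stays self-contained.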

📝 Abstract
Developing new drugs is laborious and costly, demanding extensive time investment. In this paper, we introduce a de novo drug design strategy, which harnesses the capabilities of language models to devise targeted drugs for specific proteins. Employing a Reinforcement Learning (RL) framework utilizing Proximal Policy Optimization (PPO), we refine the model to acquire a policy for generating drugs tailored to protein targets. The proposed method integrates a composite reward function, combining considerations of drug-target interaction and molecular validity. Following RL fine-tuning, the proposed method demonstrates promising outcomes, yielding notable improvements in molecular validity, interaction efficacy, and critical chemical properties, achieving 65.37 for Quantitative Estimation of Drug-likeness (QED), 321.55 for Molecular Weight (MW), and 4.47 for Octanol-Water Partition Coefficient (logP). Furthermore, of the generated drugs, only 0.041% do not exhibit novelty.
Problem

Research questions and friction points this paper is trying to address.

Enhancing targeted drug generation using language models and reinforcement learning
Optimizing molecular validity and drug-target interaction via composite reward functions
Improving key chemical properties like QED, MW, and logP in drug design
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuning language models via reinforcement learning
Using PPO for targeted drug generation
Composite reward function for drug validity
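The PPO fine-tuning named above rests on a clipped surrogate objective that keeps the updated policy close to the one that generated the samples. The sketch below shows that objective for a single sample; it is a generic illustration of PPO's update rule, not the authors' code, and the function name and clipping coefficient are assumptions.

```python
# Minimal sketch of PPO's clipped surrogate objective, the update rule
# used for RL fine-tuning of the generative language model.
import math

def ppo_clip_loss(logp_new, logp_old, advantage, eps=0.2):
    """Clipped policy-gradient loss for one generated sample.

    logp_new / logp_old: log-probabilities of the sample under the
    current and the sampling policy; advantage: reward advantage
    (e.g. composite reward minus a baseline).
    """
    ratio = math.exp(logp_new - logp_old)        # pi_new / pi_old
    clipped = max(min(ratio, 1 + eps), 1 - eps)  # clamp to [1-eps, 1+eps]
    # PPO maximizes the minimum of the two surrogates; the loss is its negative.
    return -min(ratio * advantage, clipped * advantage)
```

Clipping caps how much a single high-reward molecule can move the policy, which is what lets the reward signal (interaction score plus validity) be optimized without the language model collapsing away from valid chemistry.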