Diversity-Aware Reinforcement Learning for de novo Drug Design

📅 2024-10-14
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
Pretrained generative models for drug molecule design often suffer from premature convergence to local optima during reinforcement learning (RL)-based reward optimization, resulting in limited molecular diversity and suboptimal drug-likeness. To address this, we propose an RL framework featuring adaptive reward function updating. We systematically investigate diverse intrinsic motivation mechanisms for controlling molecular diversity and introduce a novel synergistic reward correction strategy that jointly incorporates structural similarity penalization and uncertainty-aware predictive rewards. Our method integrates graph neural networks (GNNs) with policy gradient optimization. Evaluated on multiple benchmark datasets, the generated molecule sets achieve a 37% average improvement in diversity—measured by scaffold and fingerprint dissimilarity—while maintaining or improving drug-likeness (quantitative estimate of drug-likeness, QED; synthetic accessibility, SA) and target-binding activity (pIC₅₀). The framework significantly outperforms state-of-the-art baselines, demonstrating superior balance between exploration and exploitation in de novo molecular generation.

Technology Category

Application Category

📝 Abstract
Fine-tuning a pre-trained generative model has demonstrated good performance in generating promising drug molecules. The fine-tuning task is often formulated as a reinforcement learning problem, where previous methods efficiently learn to optimize a reward function to generate potential drug molecules. Nevertheless, in the absence of an adaptive update mechanism for the reward function, the optimization process can become stuck in local optima. The efficacy of the optimal molecule in a local optimization may not translate to usefulness in the subsequent drug optimization process or as a potential standalone clinical candidate. Therefore, it is important to generate a diverse set of promising molecules. Prior work has modified the reward function by penalizing structurally similar molecules, primarily focusing on finding molecules with higher rewards. To date, no study has comprehensively examined how different adaptive update mechanisms for the reward function influence the diversity of generated molecules. In this work, we investigate a wide range of intrinsic motivation methods and strategies to penalize the extrinsic reward, and how they affect the diversity of the set of generated molecules. Our experiments reveal that combining structure- and prediction-based methods generally yields better results in terms of molecular diversity.
Problem

Research questions and friction points this paper is trying to address.

Optimizing drug molecule generation using reinforcement learning
Addressing local optima in reward function adaptation
Enhancing molecular diversity via adaptive reward mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diversity-aware reinforcement learning for drug design
Adaptive reward function update mechanisms
Combining structure- and prediction-based methods
🔎 Similar Papers
No similar papers found.
H
Hampus Gummesson Svensson
Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden; Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
C
C. Tyrchan
Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
Ola Engkvist
Ola Engkvist
AstraZeneca R&D Gothenburg Orcid:0000-0003-4970-6461
CheminformaticsDrug DiscoveryMachine LearningSemantic Web TechnologiesOpen Innovation
M
M. Chehreghani
Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden