Diverse Mini-Batch Selection in Reinforcement Learning for Efficient Chemical Exploration in de novo Drug Design

📅 2025-06-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
In de novo drug design, inefficient chemical space exploration, high evaluation costs (e.g., physical simulations or human feedback), and mode collapse in reinforcement learning (RL) remain critical challenges. To address them, this paper proposes a diversity-driven RL framework. Its core innovation is the first integration of Determinantal Point Processes (DPPs) into RL mini-batch selection, which explicitly models pairwise molecular similarities so that policy updates prioritize structurally diverse, high-quality subsets of molecules. The method is evaluated with three molecular generation oracles. Results show significant improvements in the diversity of generated molecules, measured by FCD and SNN scores (+12–28% over baselines), while preserving drug-likeness (unchanged QED and SA scores) and target activity. The approach targets efficient, robust molecular generation in settings where evaluation is expensive.

📝 Abstract
In many real-world applications, evaluating the goodness of instances is often costly and time-consuming, e.g., through human feedback or physics simulations, in contrast to proposing new instances. This is even more critical in reinforcement learning, as new interactions with the environment (i.e., new instances) need to be evaluated to provide a reward signal to learn from. As sufficient exploration is crucial, learning from a diverse mini-batch can have a large impact and help mitigate mode collapse. In this paper, we introduce diverse mini-batch selection for reinforcement learning and propose to use determinantal point processes for this task. We study this framework in the context of a real-world problem, namely drug discovery. We experimentally study how our proposed framework can improve the effectiveness of chemical exploration in de novo drug design, where finding diverse and high-quality solutions is essential. We conduct a comprehensive evaluation with three well-established molecular generation oracles over numerous generative steps. Our experiments show that our diverse mini-batch selection framework can substantially improve the diversity of the solutions while still obtaining solutions of high quality. In drug discovery, such an outcome can potentially lead to fulfilling unmet medication needs faster.
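
For orientation, the textbook L-ensemble form of a determinantal point process with a quality-diversity decomposition is sketched below. This is the general formulation, not necessarily the exact kernel used in the paper; the symbols q_i (a non-negative quality score, e.g. a reward) and S_ij (a symmetric similarity with S_ii = 1) are assumptions introduced here for illustration.

```latex
% Probability of selecting subset Y from N candidates (L-ensemble DPP)
P(Y) = \frac{\det(L_Y)}{\det(L + I)}, \qquad L_{ij} = q_i \, S_{ij} \, q_j

% Two-item case: the determinant shrinks as the pair becomes more similar
\det\!\left(L_{\{i,j\}}\right) = q_i^2 \, q_j^2 \left(1 - S_{ij}^2\right)
```

Here L_Y is the submatrix of L indexed by Y. The two-item case shows the trade-off directly: a subset scores highly when its members have high quality and low pairwise similarity, and two identical items (S_ij = 1) have zero probability of being selected together.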

Problem

Research questions and friction points this paper is trying to address.

Costly evaluation of instance goodness in real-world applications
Critical need for diverse exploration in reinforcement learning
Enhancing diversity and quality in de novo drug design

Innovation

Methods, ideas, or system contributions that make the work stand out.

Diverse mini-batch selection in RL
Determinantal point processes for diversity
Enhancing drug discovery exploration efficiency
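
The idea of selecting a diverse, high-quality mini-batch with a determinantal point process can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' implementation: it uses a greedy approximation to the most likely subset of fixed size, a Tanimoto similarity over toy binary fingerprints, and a kernel of the form quality × similarity × quality. The function names `tanimoto_kernel` and `greedy_dpp_select`, the fingerprints, and the quality scores are all hypothetical.

```python
import numpy as np


def tanimoto_kernel(fps):
    """Pairwise Tanimoto similarity between binary fingerprints (rows of fps)."""
    fps = np.asarray(fps, dtype=float)
    inter = fps @ fps.T                      # shared on-bits per pair
    counts = fps.sum(axis=1)                 # on-bits per fingerprint
    union = counts[:, None] + counts[None, :] - inter
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, inter / np.maximum(union, 1.0), 1.0)


def greedy_dpp_select(quality, similarity, k):
    """Greedily pick up to k indices that maximize log det of the DPP kernel.

    Assumed kernel: L[i, j] = quality[i] * similarity[i, j] * quality[j].
    This is a greedy approximation, not exact DPP sampling.
    """
    quality = np.asarray(quality, dtype=float)
    similarity = np.asarray(similarity, dtype=float)
    L = quality[:, None] * similarity * quality[None, :]
    n = len(quality)
    selected = []
    for _ in range(min(k, n)):
        best_i, best_logdet = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            # Skip candidates that make the submatrix singular.
            if sign > 0 and logdet > best_logdet:
                best_i, best_logdet = i, logdet
        if best_i is None:
            break
        selected.append(best_i)
    return selected


# Toy data (hypothetical): molecules 0 and 1 are near-duplicates,
# molecule 2 is structurally different but has a lower score.
fps = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
]
quality = [0.9, 0.85, 0.6]

similarity = tanimoto_kernel(fps)
batch = greedy_dpp_select(quality, similarity, k=2)
print(batch)  # [0, 2]
```

Picking the top two molecules by score alone would return indices 0 and 1, which are near-duplicates. The determinant-based criterion instead keeps the best molecule and adds the dissimilar one, which is the behavior the diverse mini-batch selection aims for. In practice the fingerprints would come from a cheminformatics toolkit and the quality scores from the oracle reward.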
Authors

Hampus Gummesson Svensson
Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
Ola Engkvist
AstraZeneca R&D Gothenburg, ORCID: 0000-0003-4970-6461
Cheminformatics, Drug Discovery, Machine Learning, Semantic Web Technologies, Open Innovation
Jon Paul Janet
AstraZeneca R&D
Machine learning for molecular systems
Christian Tyrchan
Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
Morteza Haghir Chehreghani
Chalmers University of Technology
Artificial Intelligence, Machine Learning, Data Science, Deep Learning