Diverse Mini-Batch Selection in Reinforcement Learning for Efficient Chemical Exploration in de novo Drug Design

📅 2025-06-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

De novo drug design with reinforcement learning (RL) faces three linked challenges: inefficient exploration of chemical space, costly evaluation (e.g., physics simulations or human feedback), and mode collapse. To address these, the paper proposes a diversity-driven RL framework. Its core contribution is applying Determinantal Point Processes (DPPs) to RL mini-batch selection: pairwise molecular similarities are modeled explicitly so that policy updates prioritize subsets of molecules that are both structurally diverse and high quality. The method is evaluated on three well-established molecular generation oracles. According to this summary, diversity of the generated molecules, measured by FCD and SNN scores, improves by 12–28% over baselines, while drug-likeness (QED and SA scores) and target activity are preserved. The work targets efficient, robust molecular generation in settings where evaluation is expensive.
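For readers unfamiliar with DPPs, the standard L-ensemble formulation is sketched below. It is the textbook definition, not a statement of the paper's exact kernel: the symbols $q_i$ (a per-molecule quality score, e.g. derived from the oracle reward) and $S_{ij}$ (a pairwise similarity between molecules) are assumptions introduced here for illustration.

```latex
% L-ensemble DPP over a ground set of N candidate molecules.
% Y is a subset of {1, ..., N}; L_Y is the submatrix of L indexed by Y.
\[
  P(Y) \;=\; \frac{\det(L_Y)}{\det(L + I)},
  \qquad
  L_{ij} \;=\; q_i \, S_{ij} \, q_j .
\]
% For two items the determinant makes the trade-off explicit:
\[
  \det\!\big(L_{\{i,j\}}\big) \;=\; q_i^2 \, q_j^2 \,\big(1 - S_{ij}^2\big).
\]
```

Under this form, a pair of near-duplicate molecules ($S_{ij} \approx 1$) has probability close to zero, while high-quality, dissimilar pairs are favored.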

📝 Abstract

In many real-world applications, evaluating the quality of an instance (e.g., through human feedback or physics simulations) is costly and time-consuming compared with proposing new instances. This is especially critical in reinforcement learning, where every new interaction with the environment (i.e., every new instance) must be evaluated to provide a reward signal to learn from. Because sufficient exploration is crucial, learning from a diverse mini-batch can have a large impact and help mitigate mode collapse. In this paper, we introduce diverse mini-batch selection for reinforcement learning and propose to use determinantal point processes for this task. We study this framework in the context of a real-world problem, namely drug discovery, and experimentally examine how it can improve the effectiveness of chemical exploration in de novo drug design, where finding diverse and high-quality solutions is essential. We conduct a comprehensive evaluation with three well-established molecular generation oracles over numerous generative steps. Our experiments show that our diverse mini-batch selection framework can substantially improve the diversity of the solutions while still obtaining solutions of high quality. In drug discovery, such an outcome can potentially lead to fulfilling unmet medication needs faster.

Problem

Research questions and friction points this paper is trying to address.

Costly evaluation of instance goodness in real-world applications

Critical need for diverse exploration in reinforcement learning

Enhancing diversity and quality in de novo drug design

Innovation

Methods, ideas, or system contributions that make the work stand out.

Diverse mini-batch selection in RL

Determinantal point processes for diversity

Enhancing drug discovery exploration efficiency
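
The idea of DPP-based mini-batch selection can be illustrated with a minimal sketch. Everything below is an assumption for illustration rather than the paper's implementation: molecules are represented as binary fingerprints, similarity is Tanimoto, the kernel is the common quality-times-similarity form, and a greedy determinant-maximizing heuristic stands in for whatever DPP sampling or inference procedure the authors use. The function names `tanimoto_kernel` and `greedy_dpp_select` are hypothetical.

```python
import numpy as np


def tanimoto_kernel(fps: np.ndarray) -> np.ndarray:
    """Pairwise Tanimoto similarity between binary fingerprints (one per row)."""
    fps = fps.astype(float)
    inter = fps @ fps.T                       # shared "on" bits
    counts = fps.sum(axis=1)                  # "on" bits per fingerprint
    union = counts[:, None] + counts[None, :] - inter
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, inter / np.maximum(union, 1e-12), 1.0)


def greedy_dpp_select(quality: np.ndarray, similarity: np.ndarray, k: int) -> list[int]:
    """Greedily pick up to k indices that maximize det(L_Y).

    Assumed kernel: L_ij = q_i * S_ij * q_j, so the determinant rewards
    high quality and penalizes similar items in the same subset.
    """
    L = quality[:, None] * similarity * quality[None, :]
    selected: list[int] = []
    remaining = list(range(len(quality)))
    for _ in range(min(k, len(quality))):
        best_i, best_val = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            val = logdet if sign > 0 else -np.inf
            if val > best_val:
                best_i, best_val = i, val
        if best_i is None:                    # no candidate adds volume; stop early
            break
        selected.append(best_i)
        remaining.remove(best_i)
    return selected


# Toy example: molecules 0 and 1 share an identical fingerprint.
fps = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
])
rewards = np.array([1.0, 0.9, 0.8, 0.5])      # hypothetical oracle scores
batch = greedy_dpp_select(rewards, tanimoto_kernel(fps), k=2)
print(batch)
```

In this toy case the second-highest-scoring molecule duplicates the first, so a plain top-k rule would select both, whereas the determinant-based rule skips the duplicate in favor of a dissimilar candidate. The selected indices would then form the mini-batch used for the policy update.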

🔎 Similar Papers

Can LLMs Generate Diverse Molecules? Towards Alignment with Structural Diversity