Mol-MoE: Training Preference-Guided Routers for Molecule Generation

📅 2025-02-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current molecular generation methods rely predominantly on single-objective reinforcement learning (RL), which fails to address the practical need for multi-attribute co-optimization in drug design. Conventional multi-objective approaches require retraining for each new preference specification, which makes them inefficient. This paper proposes a preference-guided Mixture-of-Experts (MoE) architecture with a router training objective explicitly conditioned on user preferences, so that expert selection aligns with the desired chemical property trade-offs. The method enables zero-shot, real-time, and interpretable test-time steering without retraining. By integrating sequential molecular representation, decoupled multi-objective RL training, and router distillation, the model outperforms state-of-the-art baselines across multiple drug-property benchmarks, yielding higher-quality molecules with millisecond-level steerability and explicit control over property prioritization.

📝 Abstract

Recent advances in language models have enabled framing molecule generation as sequence modeling. However, existing approaches often rely on single-objective reinforcement learning, limiting their applicability to real-world drug design, where multiple competing properties must be optimized. Traditional multi-objective reinforcement learning (MORL) methods require costly retraining for each new objective combination, making rapid exploration of trade-offs impractical. To overcome these limitations, we introduce Mol-MoE, a mixture-of-experts (MoE) architecture that enables efficient test-time steering of molecule generation without retraining. Central to our approach is a preference-based router training objective that incentivizes the router to combine experts in a way that aligns with user-specified trade-offs. This provides improved flexibility in exploring the chemical property space at test time, facilitating rapid trade-off exploration. Benchmarking against state-of-the-art methods, we show that Mol-MoE achieves superior sample quality and steerability.

Problem

Research questions and friction points this paper is trying to address.

Optimize multiple competing molecule properties

Efficient test-time steering without retraining

Preference-guided router for molecule generation
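To make the retraining problem concrete, here is a minimal sketch of weighted-sum scalarization, a common way to collapse several property scores into a single RL reward. It is a generic illustration and not taken from the paper; the property names, the scores, and the function `scalarized_reward` are hypothetical. Because the preference weights are baked into the reward, a policy trained under one weighting has to be retrained when the weighting changes.

```python
def scalarized_reward(property_scores, preference_weights):
    """Weighted-sum scalarization of per-property scores.

    Both arguments are dicts keyed by property name. Weights are
    normalized to sum to 1 before being applied.
    """
    if set(property_scores) != set(preference_weights):
        raise ValueError("scores and weights must cover the same properties")
    total = sum(preference_weights.values())
    if total <= 0:
        raise ValueError("preference weights must sum to a positive value")
    return sum(
        (preference_weights[name] / total) * property_scores[name]
        for name in property_scores
    )


# Hypothetical scores in [0, 1] for one candidate molecule.
scores = {"potency": 0.9, "solubility": 0.2}

# The same molecule is rewarded very differently under two preferences,
# so a policy optimized for one weighting is not optimal for the other.
potency_first = scalarized_reward(scores, {"potency": 0.8, "solubility": 0.2})
solubility_first = scalarized_reward(scores, {"potency": 0.2, "solubility": 0.8})
```

With these illustrative numbers the potency-weighted reward is higher than the solubility-weighted one, which is the kind of trade-off a user would want to explore without training a new model for each weighting.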

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-experts architecture

Preference-guided router training

Test-time steering without retraining
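The idea of steering at test time through a router can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: it assumes a linear router followed by a softmax, and it assumes experts are combined by blending their next-token logits. The summary above does not specify how Mol-MoE's router is parameterized or how its experts are merged, and all names and numbers here (`route`, `blend_logits`, `router_matrix`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(preferences, router_matrix):
    """Map a preference vector to expert mixing weights.

    Assumption: a linear router with one row of weights per expert.
    """
    scores = [
        sum(w * p for w, p in zip(row, preferences)) for row in router_matrix
    ]
    return softmax(scores)


def blend_logits(expert_logits, mixing_weights):
    """Combine per-expert next-token logits with the router's weights."""
    vocab_size = len(expert_logits[0])
    return [
        sum(w * logits[i] for w, logits in zip(mixing_weights, expert_logits))
        for i in range(vocab_size)
    ]


# Two hypothetical experts, each tuned for one property, over a 3-token vocabulary.
expert_logits = [[2.0, 0.0, -1.0], [-1.0, 0.0, 2.0]]
router_matrix = [[4.0, 0.0], [0.0, 4.0]]  # stands in for trained router weights

# Changing the preference vector changes the mixture; no expert is retrained.
weights_a = route([1.0, 0.0], router_matrix)
weights_b = route([0.5, 0.5], router_matrix)
blended_a = blend_logits(expert_logits, weights_a)
blended_b = blend_logits(expert_logits, weights_b)
```

In this sketch only the preference vector changes between the two calls, which is what makes steering cheap at inference time. Training the router so that the resulting molecules actually match the stated preferences is the part the paper contributes, and it is not shown here.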

🔎 Similar Papers

Can LLMs Generate Diverse Molecules? Towards Alignment with Structural Diversity