🤖 AI Summary
Therapeutic peptide design is a multi-objective optimization problem with conflicting goals: raising target binding affinity, membrane permeability, and solubility while keeping hemolytic activity low and preserving non-fouling behavior. To address this, we propose PepTune, the first Monte Carlo Tree Search (MCTS)-guided discrete diffusion generative framework. Our method combines state-dependent masking schedules, classifier-based reward modeling, and penalty-based objectives to approximate Pareto-optimal solutions directly in discrete sequence space. Built on the Masked Discrete Language Model (MDLM), it generates chemically valid peptide SMILES de novo, without continuous-space projection or post-hoc refinement. On diabetes- and cancer-related targets, the generated peptides significantly outperform single-objective and continuous-generation baselines across all five pharmacological properties, with superior diversity, bioactivity, and synthetic feasibility.
📝 Abstract
Peptide therapeutics, a major class of medicines, have achieved remarkable success across diseases including diabetes and cancer, with landmark examples such as GLP-1 receptor agonists revolutionizing the treatment of type-2 diabetes and obesity. Despite this success, designing peptides that satisfy multiple conflicting objectives, such as target binding affinity, solubility, and membrane permeability, remains a major challenge. Classical drug development and structure-based design are ineffective for such tasks, as they fail to optimize the global functional properties critical for therapeutic efficacy. Existing generative frameworks are largely limited to continuous spaces, unconditioned outputs, or single-objective guidance, making them unsuitable for discrete sequence optimization across multiple properties. To address this, we present PepTune, a multi-objective discrete diffusion model for the simultaneous generation and optimization of therapeutic peptide SMILES. Built on the Masked Discrete Language Model (MDLM) framework, PepTune ensures valid peptide structures through state-dependent masking schedules and penalty-based objectives. To guide the diffusion process, we propose a Monte Carlo Tree Search (MCTS)-based strategy that balances exploration and exploitation to iteratively refine Pareto-optimal sequences. MCTS integrates classifier-based rewards with search-tree expansion, overcoming the gradient estimation challenges and data sparsity inherent to discrete spaces. Using PepTune, we generate diverse, chemically modified peptides optimized for multiple therapeutic properties, including target binding affinity, membrane permeability, solubility, hemolysis, and non-fouling characteristics, on various disease-relevant targets. Overall, our results demonstrate that MCTS-guided discrete diffusion is a powerful and modular approach for multi-objective sequence design in discrete state spaces.
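To make the notion of "Pareto-optimal sequences" concrete, the following is a minimal sketch of how a set of non-dominated candidates could be maintained as new sequences are scored. It is an illustration under stated assumptions, not PepTune's actual implementation: the function names, the candidate labels, the score values, and the convention that every objective is oriented so that higher is better are all hypothetical.

```python
from typing import List, Sequence, Tuple

# A candidate is (label, scores); every score is assumed to be "higher is better".
Candidate = Tuple[str, Tuple[float, ...]]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def update_pareto_front(front: List[Candidate], candidate: Candidate) -> List[Candidate]:
    """Return a new front that includes candidate if no current member dominates it."""
    _, cand_scores = candidate
    # Reject the candidate if any existing member dominates it.
    if any(dominates(scores, cand_scores) for _, scores in front):
        return list(front)
    # Otherwise drop every member the candidate dominates, then add it.
    kept = [(name, scores) for name, scores in front if not dominates(cand_scores, scores)]
    kept.append(candidate)
    return kept


# Hypothetical two-objective scores, e.g. (binding affinity, solubility).
scored = [
    ("pep_a", (0.90, 0.20)),
    ("pep_b", (0.50, 0.50)),
    ("pep_c", (0.40, 0.40)),  # dominated by pep_b
    ("pep_d", (0.95, 0.30)),  # dominates pep_a
]

front: List[Candidate] = []
for cand in scored:
    front = update_pareto_front(front, cand)

print(sorted(name for name, _ in front))  # ['pep_b', 'pep_d']
```

In a search-guided setting, a filter of this kind would typically be applied to each newly generated and scored sequence, so that the retained set always represents the current trade-off frontier rather than a single weighted optimum.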