🤖 AI Summary
This paper addresses the design of biomolecular sequences under conflicting objectives, such as affinity, solubility, hemolysis, half-life, and non-fouling behavior. It introduces AReUReDi, a discrete optimization framework built on Rectified Discrete Flows (ReDi) with theoretical guarantees of convergence to the Pareto front. The method combines Tchebycheff scalarization with a locally balanced proposal distribution and annealed Metropolis-Hastings updates, which bias sampling toward Pareto-optimal sequences while preserving distributional invariance. On peptide and SMILES sequence generation tasks, AReUReDi simultaneously optimizes up to five therapeutic properties and outperforms both evolutionary and diffusion-based baselines.
📝 Abstract
Designing sequences that satisfy multiple, often conflicting, objectives is a central challenge in therapeutic and biomolecular engineering. Existing generative frameworks largely operate in continuous spaces with single-objective guidance, while discrete approaches lack guarantees for multi-objective Pareto optimality. We introduce AReUReDi (Annealed Rectified Updates for Refining Discrete Flows), a discrete optimization algorithm with theoretical guarantees of convergence to the Pareto front. Building on Rectified Discrete Flows (ReDi), AReUReDi combines Tchebycheff scalarization, locally balanced proposals, and annealed Metropolis-Hastings updates to bias sampling toward Pareto-optimal states while preserving distributional invariance. Applied to peptide and SMILES sequence design, AReUReDi simultaneously optimizes up to five therapeutic properties (including affinity, solubility, hemolysis, half-life, and non-fouling) and outperforms both evolutionary and diffusion-based baselines. These results establish AReUReDi as a powerful, sequence-based framework for multi-property biomolecule generation.
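The three ingredients named in the abstract can be illustrated with a toy sampler. The following is a minimal sketch under stated assumptions, not the authors' implementation: it omits the ReDi generative prior entirely, uses a hypothetical four-letter alphabet with two made-up conflicting objectives, assumes the standard weighted Tchebycheff form g(x) = max_i w_i (z_i - f_i(x)), uses the square-root balancing function over single-site substitutions, and anneals the inverse temperature linearly. All names and parameter values are illustrative.

```python
import math
import random

ALPHABET = "ACDE"     # toy alphabet (assumption, not from the paper)
WEIGHTS = (0.5, 0.5)  # trade-off weights over the objectives (assumption)
IDEAL = (1.0, 1.0)    # ideal point: best attainable value of each objective


def objectives(seq):
    """Two conflicting toy objectives to maximize: fraction of 'A' and of 'C'."""
    n = len(seq)
    return (seq.count("A") / n, seq.count("C") / n)


def tchebycheff(seq):
    """Weighted Tchebycheff scalarization; lower is better."""
    return max(w * (z - f) for w, z, f in zip(WEIGHTS, IDEAL, objectives(seq)))


def neighbors(seq):
    """All sequences that differ from seq at exactly one position."""
    return [seq[:i] + a + seq[i + 1:]
            for i in range(len(seq)) for a in ALPHABET if a != seq[i]]


def balanced_weights(seq, beta):
    """Unnormalized locally balanced proposal weights sqrt(pi(y) / pi(x)),
    where pi(x) is proportional to exp(-beta * tchebycheff(x))."""
    g_x = tchebycheff(seq)
    nbrs = neighbors(seq)
    weights = [math.exp(-0.5 * beta * (tchebycheff(y) - g_x)) for y in nbrs]
    return nbrs, weights


def annealed_mh(seq, steps=200, beta_start=1.0, beta_end=50.0, seed=0):
    """Annealed Metropolis-Hastings with locally balanced proposals.
    Returns the best sequence (lowest scalarized cost) visited."""
    rng = random.Random(seed)
    best = seq
    for t in range(steps):
        # Linear annealing of the inverse temperature (assumed schedule).
        beta = beta_start + (beta_end - beta_start) * t / max(steps - 1, 1)
        nbrs, w_x = balanced_weights(seq, beta)
        cand = rng.choices(nbrs, weights=w_x, k=1)[0]
        _, w_y = balanced_weights(cand, beta)
        # With the sqrt balancing function and a symmetric neighborhood,
        # the MH ratio pi(y) q(x|y) / (pi(x) q(y|x)) reduces to Z(x) / Z(y),
        # the ratio of the two proposal normalizers.
        if rng.random() < min(1.0, sum(w_x) / sum(w_y)):
            seq = cand
        if tchebycheff(seq) < tchebycheff(best):
            best = seq
    return best
```

Calling `annealed_mh("DDDDDD")` starts from a sequence that scores zero on both objectives and returns the best sequence visited. In this toy setting the scalarized cost is minimized by sequences with equal numbers of `A` and `C`, which is the balanced trade-off the equal weights ask for; changing `WEIGHTS` targets a different point on the trade-off curve.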