Multi-Objective-Guided Discrete Flow Matching for Controllable Biological Sequence Design

📅 2025-05-11
🤖 AI Summary
Biological sequence design must balance multiple, often conflicting objectives such as binding affinity, toxicity, and stability. Existing discrete flow matching approaches either optimize a single objective or rely on continuous embeddings that can distort discrete distributions. This paper introduces MOG-DFM, a multi-objective guidance framework for discrete flow matching on biological sequences. At each sampling step it combines a hybrid rank-directional score over candidate transitions with an adaptive hypercone filter, steering any pretrained discrete flow matching generator (the authors train PepDFM and EnhancerDFM as base models) toward Pareto-efficient trade-offs. Because guidance operates directly in discrete sequence space, it avoids the distortion that continuous embeddings can introduce. The authors report peptide binders optimized across five properties (hemolysis, non-fouling, solubility, half-life, and binding affinity) and enhancer DNA sequences designed for specific enhancer classes and DNA shapes.

📝 Abstract
Designing biological sequences that satisfy multiple, often conflicting, functional and biophysical criteria remains a central challenge in biomolecule engineering. While discrete flow matching models have recently shown promise for efficient sampling in high-dimensional sequence spaces, existing approaches address only single objectives or require continuous embeddings that can distort discrete distributions. We present Multi-Objective-Guided Discrete Flow Matching (MOG-DFM), a general framework to steer any pretrained discrete-time flow matching generator toward Pareto-efficient trade-offs across multiple scalar objectives. At each sampling step, MOG-DFM computes a hybrid rank-directional score for candidate transitions and applies an adaptive hypercone filter to enforce consistent multi-objective progression. We also trained two unconditional discrete flow matching models, PepDFM for diverse peptide generation and EnhancerDFM for functional enhancer DNA generation, as base generation models for MOG-DFM. We demonstrate MOG-DFM's effectiveness in generating peptide binders optimized across five properties (hemolysis, non-fouling, solubility, half-life, and binding affinity), and in designing DNA sequences with specific enhancer classes and DNA shapes. In total, MOG-DFM proves to be a powerful tool for multi-property-guided biomolecule sequence design.
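To make "Pareto-efficient trade-offs" concrete, here is a minimal sketch of a non-dominated filter over objective score vectors. It is not taken from the paper: the function names are hypothetical, and it assumes every objective is oriented so that larger is better (a property such as hemolysis would have to be negated first).

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives maximized)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Toy example: four candidate sequences scored on two objectives.
toy_scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.1, 0.95)]
front = pareto_front(toy_scores)
```

In this toy example the third candidate is dominated by the second, so only the other three remain on the front; none of those three can be improved on one objective without losing on the other.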
Problem

Research questions and friction points this paper is trying to address.

Designing biological sequences with multiple conflicting criteria
Optimizing peptide binders across five key properties
Creating DNA sequences with specific enhancer classes and shapes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Objective-Guided Discrete Flow Matching framework
Hybrid rank-directional score for transitions
Adaptive hypercone filter for multi-objective progression
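
The two mechanisms listed above can be illustrated with a rough sketch of one guided sampling step. This is an assumption-laden toy, not the authors' implementation: the exact scoring formula, the cone-angle update rule, and every name and parameter here (`hybrid_scores`, `hypercone_filter`, `adapt_angle`, `lam`, `target`, `step`) are hypothetical. It assumes each candidate transition comes with a vector of per-objective improvements and that a weight vector `w` encodes the desired trade-off direction.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def rank_normalize(values: Sequence[float]) -> List[float]:
    """Map values to [0, 1] by their rank (worst = 0, best = 1)."""
    n = len(values)
    if n == 1:
        return [1.0]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for r, i in enumerate(order):
        ranks[i] = r / (n - 1)
    return ranks


def hybrid_scores(deltas, w, lam=1.0):
    """Assumed hybrid score: weighted per-objective rank plus
    `lam` times the alignment of the improvement vector with `w`."""
    m = len(w)
    ranks = [rank_normalize([d[k] for d in deltas]) for k in range(m)]
    return [
        sum(w[k] * ranks[k][i] for k in range(m)) + lam * cosine(d, w)
        for i, d in enumerate(deltas)
    ]


def hypercone_filter(deltas, w, phi):
    """Keep candidates whose improvement vector lies within angle `phi` of `w`."""
    cos_phi = math.cos(phi)
    return [i for i, d in enumerate(deltas) if cosine(d, w) >= cos_phi]


def adapt_angle(phi, accept_rate, target=0.5, step=0.1,
                lo=0.1, hi=math.pi / 2):
    """Assumed adaptation rule: widen the cone when too few candidates
    pass, narrow it otherwise, and clamp the angle to [lo, hi]."""
    phi = phi * (1 + step) if accept_rate < target else phi * (1 - step)
    return min(hi, max(lo, phi))


# Toy step: four candidate token changes, two objectives, equal weights.
w = (1.0, 1.0)
deltas = [(0.2, 0.1), (0.3, -0.4), (-0.1, -0.2), (0.05, 0.3)]
phi = math.pi / 4
kept = hypercone_filter(deltas, w, phi)
scores = hybrid_scores(deltas, w)
best = max(kept, key=lambda i: scores[i])
phi_next = adapt_angle(phi, accept_rate=len(kept) / len(deltas))
```

In this toy step the cone rejects the two candidates that trade one objective against the other or worsen both, and the hybrid score then picks among the survivors. Filtering before scoring is one way to keep a transition with a high rank on a single objective from being chosen when it moves away from the desired trade-off direction.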