Property-driven Protein Inverse Folding With Multi-Objective Preference Alignment

📅 2026-03-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of designing protein sequences that simultaneously achieve high structural recoverability and favorable developability properties—such as solubility and thermal stability—without relying on extensive manual hyperparameter tuning or task-specific retraining. The authors propose ProtAlign, a framework built upon the ProteinMPNN inverse folding model, which integrates in silico property predictors to construct multi-objective preference pairs. By incorporating semi-online direct preference optimization and flexible preference boundaries, ProtAlign enables efficient fine-tuning that balances competing objectives without adjusting multiple hyperparameters per task. The resulting model, MoMPNN, demonstrates substantial improvements in developability across diverse design scenarios—including CATH 4.3 crystal structures, de novo backbones, and real-world binder designs—while maintaining high structural fidelity.
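The summary says preference pairs are built from in silico property predictors, but it does not give the pairing rule. As a minimal sketch, assuming a Pareto-dominance rule (a swapped-in illustration, not necessarily the authors' method), a candidate sequence is preferred only if it scores at least as well on every property and strictly better on one. The names `dominates` and `build_preference_pairs` and the property keys are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical container: property name -> predicted score (higher is better).
Scores = Dict[str, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on every property and strictly better on one."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def build_preference_pairs(
    candidates: List[Tuple[str, Scores]],
) -> List[Tuple[str, str]]:
    """Return (chosen, rejected) sequence pairs under Pareto dominance.

    Assumption: the real framework may use a different pairing rule;
    this only illustrates one way to combine several predictor scores.
    """
    pairs = []
    for (seq_a, s_a), (seq_b, s_b) in combinations(candidates, 2):
        if dominates(s_a, s_b):
            pairs.append((seq_a, seq_b))
        elif dominates(s_b, s_a):
            pairs.append((seq_b, seq_a))
        # Pairs with a trade-off (neither dominates) are skipped.
    return pairs
```

Under this rule, two sequences that trade solubility against thermal stability yield no pair, which is one simple way to keep conflicting objectives from producing contradictory training signals.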

📝 Abstract
Protein sequence design must balance designability, defined as the ability to recover a target backbone, with multiple, often competing, developability properties such as solubility, thermostability, and expression. Existing approaches address these properties through post hoc mutation, inference-time biasing, or retraining on property-specific subsets, yet they are target dependent and demand substantial domain expertise or careful hyperparameter tuning. In this paper, we introduce ProtAlign, a multi-objective preference alignment framework that fine-tunes pretrained inverse folding models to satisfy diverse developability objectives while preserving structural fidelity. ProtAlign employs a semi-online Direct Preference Optimization strategy with a flexible preference margin to mitigate conflicts among competing objectives and constructs preference pairs using in silico property predictors. Applied to the widely used ProteinMPNN backbone, the resulting model MoMPNN enhances developability without compromising designability across tasks including sequence design for CATH 4.3 crystal structures, de novo generated backbones, and real-world binder design scenarios, making it an appealing framework for practical protein sequence design.
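The abstract mentions Direct Preference Optimization with a flexible preference margin without giving the formula. The sketch below shows the standard DPO loss for a single (chosen, rejected) pair with an additive margin term. The `margin` parameter and the way it enters the loss are assumptions for illustration; the paper's exact formulation may differ. The log-probabilities would come from the fine-tuned model and a frozen reference model.

```python
import math


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
    margin: float = 0.0,  # hypothetical: required gap between chosen and rejected
) -> float:
    """DPO loss for one preference pair: -log(sigmoid(beta * delta - margin)).

    delta is the difference between the policy-vs-reference log-ratios of the
    chosen and rejected sequences. A larger margin demands a larger gap.
    """
    delta = (policy_logp_chosen - ref_logp_chosen) - (
        policy_logp_rejected - ref_logp_rejected
    )
    x = beta * delta - margin
    # Numerically stable -log(sigmoid(x)) = log(1 + exp(-x)).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

With `margin=0.0` this reduces to the usual DPO objective. Making the margin depend on how strongly the predictors separate the two sequences is one plausible reading of "flexible", but that is a guess rather than something stated in the abstract.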
Problem

Research questions and friction points this paper is trying to address.

protein inverse folding
developability properties
multi-objective optimization
structural fidelity
sequence design
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-objective preference alignment
inverse folding
developability optimization
Direct Preference Optimization
protein sequence design
Xiaoyang Hou
Zhejiang University
Junqi Liu
School of Computer Science, Peking University; BioGeometry
Chence Shi
Quebec AI Institute (Mila)
Geometric Deep Learning, Graph Representation Learning, Drug Discovery
Xin Liu
BioGeometry
Zhi Yang
Shanghai Jiao Tong University, School of Medicine
gene therapy, ocular tumor, eye diseases
Jian Tang
HEC Montréal; Mila - Québec AI Institute; CIFAR AI Research Chair; BioGeometry