Property-driven Protein Inverse Folding With Multi-Objective Preference Alignment

📅 2026-03-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of designing protein sequences that both recover a target backbone and have favorable developability properties, such as solubility and thermal stability, without extensive manual hyperparameter tuning or task-specific retraining. The authors propose ProtAlign, a preference alignment framework for fine-tuning pretrained inverse folding models, which uses in silico property predictors to construct multi-objective preference pairs. It combines semi-online Direct Preference Optimization with a flexible preference margin to mitigate conflicts among competing objectives. Applied to ProteinMPNN, the resulting model, MoMPNN, improves developability without compromising designability across several design scenarios: CATH 4.3 crystal structures, de novo generated backbones, and real-world binder designs.
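To make the idea of multi-objective preference pairs concrete, here is a minimal sketch of one possible construction rule. It is an assumption for illustration, not the authors' procedure: a sequence is preferred over another only if it is at least as good on every predicted property and strictly better on at least one (Pareto dominance). The property names, the scores, and the helpers `dominates` and `build_preference_pairs` are hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Tuple

# Hypothetical container: predicted property scores, higher is better.
Scores = Dict[str, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if `a` is no worse than `b` on every property and better on one."""
    no_worse = all(a[k] >= b[k] for k in a)
    strictly_better = any(a[k] > b[k] for k in a)
    return no_worse and strictly_better


def build_preference_pairs(
    candidates: List[Tuple[str, Scores]],
) -> List[Tuple[str, str]]:
    """Return (chosen, rejected) sequence pairs under the assumed dominance rule."""
    pairs = []
    for (seq_a, s_a), (seq_b, s_b) in permutations(candidates, 2):
        if dominates(s_a, s_b):
            pairs.append((seq_a, seq_b))
    return pairs


# Toy candidates with made-up predictor outputs (illustrative values only).
candidates = [
    ("SEQ_A", {"solubility": 0.9, "thermostability": 0.8}),
    ("SEQ_B", {"solubility": 0.5, "thermostability": 0.6}),
    ("SEQ_C", {"solubility": 0.95, "thermostability": 0.1}),
]
pairs = build_preference_pairs(candidates)
```

In this toy case only `SEQ_A` over `SEQ_B` qualifies; `SEQ_C` trades one property for the other, so it forms no pair. Conflicting candidates of that kind are presumably where a method needs something beyond strict dominance, which the summary attributes to the flexible preference margin.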

📝 Abstract

Protein sequence design must balance designability, defined as the ability to recover a target backbone, with multiple, often competing, developability properties such as solubility, thermostability, and expression. Existing approaches address these properties through post hoc mutation, inference-time biasing, or retraining on property-specific subsets, yet they are target dependent and demand substantial domain expertise or careful hyperparameter tuning. In this paper, we introduce ProtAlign, a multi-objective preference alignment framework that fine-tunes pretrained inverse folding models to satisfy diverse developability objectives while preserving structural fidelity. ProtAlign employs a semi-online Direct Preference Optimization strategy with a flexible preference margin to mitigate conflicts among competing objectives and constructs preference pairs using in silico property predictors. Applied to the widely used ProteinMPNN backbone, the resulting model MoMPNN enhances developability without compromising designability across tasks including sequence design for CATH 4.3 crystal structures, de novo generated backbones, and real-world binder design scenarios, making it an appealing framework for practical protein sequence design.
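The abstract does not spell out the training objective, so the following is only a sketch of a generic Direct Preference Optimization loss with a margin term, written for a single preference pair. It assumes the margin is a scalar subtracted inside the sigmoid, which is one common formulation and may differ from what ProtAlign does. The function name `dpo_margin_loss` and the parameters `beta` and `margin` are illustrative.

```python
import math


def dpo_margin_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
    margin: float = 0.0,
) -> float:
    """Generic DPO loss for one pair, with an assumed additive margin.

    The log-probabilities are sequence log-likelihoods under the policy
    being fine-tuned and under a frozen reference model.
    """
    # Implicit reward difference between the chosen and rejected sequences.
    reward_gap = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    logits = beta * reward_gap - margin
    # -log(sigmoid(z)) equals softplus(-z); computed in a numerically stable way.
    x = -logits
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# With identical policy and reference scores the loss is log(2).
baseline = dpo_margin_loss(-10.0, -12.0, -10.0, -12.0)
# A positive margin asks for a larger reward gap, so the same pair costs more.
with_margin = dpo_margin_loss(-10.0, -12.0, -10.0, -12.0, margin=1.0)
```

Under this assumed form, a larger margin demands a wider separation between chosen and rejected sequences, while a smaller one relaxes the constraint for pairs whose objectives conflict.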

Problem

Research questions and friction points this paper is trying to address.

protein inverse folding

developability properties

multi-objective optimization

structural fidelity

sequence design

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-objective preference alignment

inverse folding

developability optimization