On fine-tuning Boltz-2 for protein-protein affinity prediction

📅 2025-12-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study examines why structure-based models underperform sequence-based methods in protein-protein interaction (PPI) binding affinity regression. The authors adapt the structure-based model Boltz-2 for PPI affinity prediction (Boltz-2-PPI) and additionally combine its structural embeddings with sequence embeddings (e.g., from ESM-2) through feature fusion. Experiments on the TCR3d and PPB-affinity benchmarks show that the structure-only model remains inferior to sequence-based baselines despite high structural accuracy, while fusing the two embedding types yields complementary improvements (ΔRMSE ≤ 0.8 kcal/mol), particularly for weaker sequence models. These results suggest that structure- and sequence-based models learn different signals, and that current structural representations are limited for affinity modeling.

📝 Abstract

Accurate prediction of protein-protein binding affinity is vital for understanding molecular interactions and designing therapeutics. We adapt Boltz-2, a state-of-the-art structure-based protein-ligand affinity predictor, for protein-protein affinity regression and evaluate it on two datasets, TCR3d and PPB-affinity. Despite high structural accuracy, Boltz-2-PPI underperforms relative to sequence-based alternatives in both small- and larger-scale data regimes. Combining embeddings from Boltz-2-PPI with sequence-based embeddings yields complementary improvements, particularly for weaker sequence models, suggesting different signals are learned by sequence- and structure-based models. Our results echo known biases associated with training with structural data and suggest that current structure-based representations are not primed for performant affinity prediction.

Problem

Research questions and friction points this paper is trying to address.

Adapting Boltz-2 for protein-protein affinity prediction

Evaluating structure-based versus sequence-based affinity models

Addressing biases in structure-based protein affinity representations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned Boltz-2 for protein-protein affinity regression

Combined structure-based and sequence-based embeddings for complementary improvements

Evaluated model on TCR3d and PPB-affinity datasets
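
The idea of combining structure-based and sequence-based embeddings can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the fusion is plain concatenation followed by ridge regression (the summary does not specify the paper's fusion method), the embeddings and affinities are synthetic random arrays, and the helper names (`fuse`, `fit_ridge`, `evaluate`) are hypothetical rather than part of Boltz-2 or any sequence-model API.

```python
import numpy as np


def fuse(struct_emb, seq_emb):
    """Early fusion (assumed): concatenate per-complex embeddings feature-wise."""
    if struct_emb.shape[0] != seq_emb.shape[0]:
        raise ValueError("both modalities must describe the same complexes")
    return np.concatenate([struct_emb, seq_emb], axis=1)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def rmse(y_true, y_pred):
    """Root-mean-square error between targets and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def evaluate(X, y, n_train):
    """Fit on the first n_train rows and report RMSE on the remaining rows."""
    w, b = fit_ridge(X[:n_train], y[:n_train])
    return rmse(y[n_train:], X[n_train:] @ w + b)


# Synthetic stand-ins: 200 complexes, 8-dim "structural" and 16-dim
# "sequence" embeddings. The target depends on one feature from each
# modality, so neither modality alone can explain it fully.
rng = np.random.default_rng(0)
n, n_train = 200, 150
struct_emb = rng.normal(size=(n, 8))
seq_emb = rng.normal(size=(n, 16))
y = struct_emb[:, 0] + seq_emb[:, 0] + 0.1 * rng.normal(size=n)

fused = fuse(struct_emb, seq_emb)
rmse_struct = evaluate(struct_emb, y, n_train)
rmse_seq = evaluate(seq_emb, y, n_train)
rmse_fused = evaluate(fused, y, n_train)

print(f"structure only: {rmse_struct:.3f}")
print(f"sequence only:  {rmse_seq:.3f}")
print(f"fused:          {rmse_fused:.3f}")
```

The synthetic target is built so that fusion helps by construction; the sketch only shows the mechanics of comparing single-modality and fused regressors, not the magnitude of any gain on TCR3d or PPB-affinity.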
